Automated requirements classification using feature selection based on linguistic features

dc.contributor: Graduate Program in Systems and Control Engineering.
dc.contributor.advisor: Aydemir, Fatma Başak.
dc.contributor.author: Çevikol, Sercan.
dc.date.accessioned: 2023-03-16T11:35:00Z
dc.date.available: 2023-03-16T11:35:00Z
dc.date.issued: 2021.
dc.description.abstract: Requirements classification is an important problem in organizing systems and requirements, and it is widely used in handling large requirements data sets. A basic example of a requirements classification problem is the distinction between functional and non-functional (quality) requirements. State-of-the-art classifiers are most effective when they use a large set of word features such as text n-grams or part-of-speech n-grams. However, as the number of features increases, the approach becomes more difficult to interpret, because many redundant features that do not capture the meaning of the requirements have to be explored. In this study, we propose the use of more general linguistic features, such as dependency types, for the construction of interpretable machine learning classifiers for requirements engineering. Through a feature engineering effort, assisted by tools that show graphically how classifiers work, we derive a set of linguistic features. While classifiers that use the proposed features fit the training set slightly worse than those that use high-dimensional feature sets, this approach generally performs better on validation data sets and is more interpretable. We use industry data sets, and we perform experimental runs with several automated feature selection algorithms to explore whether our feature set can be optimized further by one of them. Although impressive results were obtained on some data sets, the automated selection algorithms did not yield a significant improvement, and on average their results were worse than those we obtained using the set based on linguistic features.
dc.format.extent: 30 cm.
dc.format.pages: xi, 63 leaves
dc.identifier.other: SCO 2021 C48
dc.identifier.uri: https://digitalarchive.library.bogazici.edu.tr/handle/123456789/15681
dc.publisher: Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2021.
dc.subject.lcsh: Software engineering.
dc.subject.lcsh: Linguistics -- Software.
dc.title: Automated requirements classification using feature selection based on linguistic features
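
Illustrative note: the abstract describes ranking linguistic features (such as dependency types) with automated feature selection, but this record does not name the algorithms used. The sketch below is therefore only an assumption-laden illustration of one common filter method, mutual information between a binary feature and the class label. The toy samples, the dependency labels, the class names "F"/"NF", and the function names are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import log2

# Hypothetical toy data (invented for illustration, not from the thesis):
# each requirement is reduced to the set of dependency-type labels that
# appear in its parse, paired with a class label
# ("F" = functional, "NF" = non-functional).
SAMPLES = [
    ({"nsubj", "dobj", "aux"}, "F"),
    ({"nsubj", "dobj"}, "F"),
    ({"nsubj", "aux", "dobj", "prep"}, "F"),
    ({"nsubj", "acomp", "advmod"}, "NF"),
    ({"nsubj", "acomp", "prep"}, "NF"),
    ({"nsubj", "advmod", "acomp", "aux"}, "NF"),
]


def mutual_information(feature, samples):
    """Mutual information (in bits) between the presence of one feature
    and the class label, estimated from raw counts."""
    n = len(samples)
    joint = Counter((feature in feats, label) for feats, label in samples)
    present = Counter(feature in feats for feats, _ in samples)
    labels = Counter(label for _, label in samples)
    mi = 0.0
    for (has, label), count in joint.items():
        p_xy = count / n
        p_x = present[has] / n
        p_y = labels[label] / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


def select_top_k(samples, k):
    """Return the k features with the highest mutual information,
    breaking ties alphabetically so the result is deterministic."""
    features = set().union(*(feats for feats, _ in samples))
    ranked = sorted(features, key=lambda f: (-mutual_information(f, samples), f))
    return ranked[:k]


if __name__ == "__main__":
    for name in select_top_k(SAMPLES, 3):
        print(name, round(mutual_information(name, SAMPLES), 3))
```

A feature that occurs in every requirement (here "nsubj") scores zero because it cannot separate the classes, which is the kind of redundancy the abstract says makes large feature sets hard to interpret.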

Files

Original bundle
Name: b2765748.036897.001.PDF
Size: 1.58 MB
Format: Adobe Portable Document Format
