From unrestricted natural language requirements to domain models
Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023.
Abstract
Domain models are used to establish a general overview of a software system in order to ease communication between project stakeholders, and as inputs to various other software development activities. Because of these benefits, domain model extraction is an important task for both researchers and practitioners of software projects. The extraction process is challenging: it is labour intensive, it requires extensive communication that is not always possible in real-world projects, and complete coverage is hard to attain. For these reasons, researchers have proposed natural language processing methods to ease and aid domain model extraction. In this study, we propose a fully automated approach to extract domain models from unstructured natural language requirements. The approach combines the capabilities of modern language models, a state-of-the-art term ranking algorithm, and a rule-based extraction module. We evaluate our proposal quantitatively on both industrial and educational data sets. State-of-the-art methods outperform our approach in relation detection and in the overall precision of the pipeline. In terms of domain concept coverage and individual concept detection, we achieve on-par or better overall performance compared to state-of-the-art methods. Our approach performs better on data sets from industry than on the students’ data sets.
