Automatic question generation for improving low resource question answering performance
| dc.contributor | Graduate program in Computer Engineering. | |
| dc.contributor.advisor | Özgür, Arzucan. | |
| dc.contributor.advisor | Arısoy Saraçlar, Ebru. | |
| dc.contributor.author | Manav, Yusufcan. | |
| dc.date.accessioned | 2025-04-14T12:09:52Z | |
| dc.date.available | 2025-04-14T12:09:52Z | |
| dc.date.issued | 2023 | |
| dc.description.abstract | This thesis focuses on employing a question-generation system to improve the performance of question-answering models. We propose a multitask-trained question-generation module that is built on a multilingual encoder-decoder architecture and can produce question-answer pairs from plain text passages. Using a multilingual model allowed us to adapt the question-generation system to several languages. First, we created a Turkish question-answering dataset using Turkish Wikipedia pages and this question-generation system. Our experiments showed that performance on the Turkish XQuAD set improved by 3% when the generated dataset was combined with the human-annotated dataset for question-answering model training. Second, we extensively tested our model in many languages and low-resource settings. We used limited annotated data from question-answering datasets in different languages, such as English, German, French, and Turkish, to train the question-generation model. We then used this model to create artificial question-answer pairs from unannotated paragraphs. Our experiments showed that, especially in the lower-data settings, our augmentation strategy consistently outperformed the baseline question-answering models trained only on human-annotated data across a range of dataset sizes and languages. | |
| dc.format.pages | xiii, 76 leaves | |
| dc.identifier.other | Graduate program in Computer Engineering. TKL 2023 U68 PhD (Thes TKL 2023 Z37 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14908/21500 | |
| dc.publisher | Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023. | |
| dc.subject.lcsh | Question-answering systems. | |
| dc.subject.lcsh | Natural language processing (Computer science) | |
| dc.title | Automatic question generation for improving low resource question answering performance |
Files
Original bundle
- Name: b2795770.038438.001.PDF
- Size: 2.01 MB
- Format: Adobe Portable Document Format