Automatic question generation for improving low resource question answering performance

dc.contributor: Graduate program in Computer Engineering.
dc.contributor.advisor: Özgür, Arzucan.
dc.contributor.advisor: Arısoy Saraçlar, Ebru.
dc.contributor.author: Manav, Yusufcan.
dc.date.accessioned: 2025-04-14T12:09:52Z
dc.date.available: 2025-04-14T12:09:52Z
dc.date.issued: 2023
dc.description.abstract: This thesis focuses on employing a question-generation system to improve the performance of question-answering models. We propose a multitask-trained question-generation module that is built on a multilingual encoder-decoder architecture and can produce question-answer pairs from plain-text passages. Using a multilingual model allowed us to adapt the question-generation system to several languages. First, we created a Turkish question-answering dataset using Turkish Wikipedia pages and this question-generation system. Our experiments showed that performance on the Turkish XQuAD set improved by 3% when the generated dataset was combined with the human-annotated dataset for question-answering model training. Second, we extensively tested our model across several languages and in low-resource settings. We trained the question-generation model on limited annotated data from question-answering datasets in different languages, such as English, German, French, and Turkish. We then used this model to create artificial question-answer pairs from unannotated paragraphs. Our experiments showed that, especially in the lower-data settings, our augmentation strategy consistently outperformed the baseline question-answering models trained on human-annotated data across a range of dataset sizes and languages.
dc.format.pages: xiii, 76 leaves
dc.identifier.other: Graduate program in Computer Engineering. TKL 2023 U68 PhD (Thes TKL 2023 Z37
dc.identifier.uri: https://hdl.handle.net/20.500.14908/21500
dc.publisher: Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2023.
dc.subject.lcsh: Question-answering systems.
dc.subject.lcsh: Natural language processing (Computer science)
dc.title: Automatic question generation for improving low resource question answering performance

Files

Original bundle

Name: b2795770.038438.001.PDF
Size: 2.01 MB
Format: Adobe Portable Document Format