Solving Turkish math word problems by sequence-to-sequence encoder-decoder models

dc.contributor: Graduate Program in Computer Engineering.
dc.contributor.advisor: Güngör, Tunga.
dc.contributor.author: Gedik, Esin.
dc.date.accessioned: 2023-10-15T06:54:28Z
dc.date.available: 2023-10-15T06:54:28Z
dc.date.issued: 2022
dc.description.abstract: Solving math word problems (MWPs) is a challenging task because of the semantic gap between natural language text and mathematical equations. The task takes a written math problem as input and produces an equation that solves the problem as output. This thesis describes a sequence-to-sequence (seq2seq) neural model for automatically solving MWPs based on their semantic meanings in the text. The seq2seq model has the advantage of being able to generate equations that do not exist in the training data. It comprises a bidirectional encoder to encode the input sequence and comprehend the problem semantics, and a decoder with attention to track the semantic meanings of the output symbols and extract the equation. In this thesis, we investigate the performance of several pre-trained language models and neural models, including gated recurrent unit (GRU) and long short-term memory (LSTM) seq2seq models. Our research is novel in the sense that no existing studies on this natural language processing (NLP) task in Turkish utilize pre-trained language models and neural models. There is also no Turkish dataset designed for applying neural models to the MWP task. Due to the lack of data, we translated the well-known English MWP datasets into Turkish using a machine translation system. We performed manual adjustments and built the corpora as a contribution to the literature. Although Turkish is an agglutinative and grammatically challenging language to work on, our system correctly answers 71% of the questions in the corpora.
dc.format.pages: xiv, 71 leaves
dc.identifier.other: CMPE 2022 G44
dc.identifier.uri: https://hdl.handle.net/20.500.14908/19709
dc.publisher: Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2022.
dc.subject.lcsh: Mathematics -- Problems, exercises, etc.
dc.subject.lcsh: Word problems (Mathematics)
dc.title: Solving Turkish math word problems by sequence-to-sequence encoder-decoder models

Files

Original bundle

Name: b2778269.037637.001.PDF
Size: 491.92 KB
Format: Adobe Portable Document Format
