Using transformer networks for detection and normalization of named entities in biomedical texts

dc.contributor: Graduate Program in Computer Engineering.
dc.contributor.advisor: Özgür, Arzucan.
dc.contributor.author: Pala, İlkay Ramazan.
dc.date.accessioned: 2023-03-16T10:05:36Z
dc.date.available: 2023-03-16T10:05:36Z
dc.date.issued: 2021.
dc.description.abstract: The increasing difficulty of retrieving relevant information from a rapidly growing literature has increased interest in natural language processing (NLP) systems in the biomedical domain. In many of these systems, detecting named entities such as diseases, genes, and molecules (named entity recognition) and matching them to the corresponding entries in ontologies (normalization) are important intermediate steps. Because these two tasks are related and datasets in this domain are relatively small, multi-task learning has frequently been used in the literature for this problem. Meanwhile, in recent years, the success of transformer-based pre-trained language models such as BERT in various NLP tasks has led to their application in the biomedical domain as well. The distinctive characteristics of biomedical text, such as abbreviations and specialized terminology, motivated the development of new language models trained specifically for this domain on biomedical corpora. In this study, we propose a multi-task learning approach for named entity recognition and normalization that uses transformer-based pre-trained language models. To enable optimal sharing of information, both tasks are formulated over text span embeddings obtained from a common encoder network. The approach obtains promising results, which are compared with those of state-of-the-art systems from the literature on commonly used named entity recognition datasets.
dc.format.extent: 30 cm.
dc.format.pages: xi, 57 leaves ;
dc.identifier.other: CMPE 2021 P36
dc.identifier.uri: https://digitalarchive.library.bogazici.edu.tr/handle/123456789/12463
dc.publisher: Thesis (M.S.) - Bogazici University. Institute for Graduate Studies in Science and Engineering, 2021.
dc.subject.lcsh: Natural language processing (Computer science)
dc.subject.lcsh: Text processing (Computer science)
dc.subject.lcsh: Computational linguistics.
dc.title: Using transformer networks for detection and normalization of named entities in biomedical texts

Files

Original bundle
Name: b2765857.036916.001.PDF
Size: 422.2 KB
Format: Adobe Portable Document Format
