Biomedical Named Entity Recognition Using Transformers with biLSTM + CRF and Graph Convolutional Neural Networks

Celikmasat G., Enes Akturk M., Emre Ertunc Y., Majeed Issifu A., GANİZ M. C.

16th International Conference on INnovations in Intelligent SysTems and Applications, INISTA 2022, Biarritz, France, 8-12 August 2022

Publication Type: Conference Paper / Full-Text Paper
DOI Number: 10.1109/inista55318.2022.9894270
City of Publication: Biarritz
Country of Publication: France
Keywords: Biomedical, CRF, GCN, LSTM, Named Entity Recognition, Natural Language Processing
Affiliated with Marmara University: Yes

Abstract

© 2022 IEEE. One application of Natural Language Processing (NLP) is extracting information from free text. Information extraction takes various forms, such as Named Entity Recognition (NER), which detects named entities in free text. The biomedical named entity recognition task is about extracting named entities such as drugs, diseases, and organs from texts in the medical domain. In our study, we improve models commonly used in this domain, such as the biLSTM+CRF model, by using transformer-based language models such as BERT and its domain-specific variant BioBERT in the embedding layer. We conduct experiments on several benchmark biomedical datasets using a variety of combinations of models and embeddings, such as BioBERT+biLSTM+CRF, BERT+biLSTM+CRF, Fasttext+biLSTM+CRF, and Graph Convolutional Networks. Our results show clear improvements of 4% to 13% on several datasets when the baseline biLSTM+CRF model is initialized with pretrained language models such as BERT, and especially with a domain-specific one such as BioBERT.
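
To make the task concrete, the following is a minimal sketch of how per-token labels from a sequence tagger can be grouped into entity mentions. It assumes the common BIO tagging scheme, which the abstract does not specify. The function name `bio_to_spans`, the example sentence, and the `Drug` and `Disease` labels are hypothetical illustrations, not taken from the paper or its datasets.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumes the BIO scheme: "B-X" opens an entity of type X, "I-X"
    continues it, and "O" marks a token outside any entity.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open entity:
            # close the open entity and drop the stray token.
            flush()
    flush()
    return spans


# Hypothetical example sentence and labels, for illustration only.
example_tokens = ["Aspirin", "reduces", "the", "risk", "of", "heart", "attack", "."]
example_tags = ["B-Drug", "O", "O", "O", "O", "B-Disease", "I-Disease", "O"]
example_spans = bio_to_spans(example_tokens, example_tags)
print(example_spans)
```

In a pipeline like the ones the abstract lists, tags of this kind would come from the CRF layer that sits on top of the biLSTM; the sketch covers only the final grouping step, not the model itself.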