33rd IEEE Conference on Signal Processing and Communications Applications, SIU 2025, İstanbul, Türkiye, 25-28 June 2025, (Full Text Paper)
This study investigates how the performance of Named Entity Recognition (NER) in Turkish documents can be improved through supervised fine-tuning of modern text-generating Large Language Models (LLMs). Encoder-based and encoder-decoder models, such as Turkish BERT, the multilingual mBART and mT5, and the Turkish-specific TURNA, are compared with the decoder-only LLaMA model. Following the approach of a recent study, the LLaMA model is restructured to support bidirectional attention, overcoming the limited context modeling imposed by its unidirectional (causal) architecture. This modification leads to a significant improvement in NER performance, especially in terms of the macro-averaged F1 score. The results indicate that converting decoder-only LLMs such as LLaMA into bidirectional token classifiers can yield substantial performance gains on Turkish texts. Furthermore, the findings suggest that larger and more powerful decoder-based models can be used effectively and efficiently for Turkish token and sequence classification tasks.
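The central modification described above, removing the causal attention mask of a decoder-only model and attaching a per-token classification head, can be illustrated with a minimal PyTorch sketch. This is a conceptual illustration only, not the paper's LLaMA implementation: the block size, head count, and the nine BIO-style NER labels are illustrative assumptions.

```python
# Conceptual sketch (not the authors' code): turning a causal (unidirectional)
# self-attention layer into a bidirectional one amounts to dropping the causal
# mask; a linear head on top then produces per-token NER logits.
import torch
import torch.nn as nn

class TokenClassifierBlock(nn.Module):
    def __init__(self, d_model=64, n_heads=4, num_labels=9, bidirectional=True):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.bidirectional = bidirectional
        self.classifier = nn.Linear(d_model, num_labels)

    def forward(self, x):  # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        if self.bidirectional:
            attn_mask = None  # every token attends to the full sequence
        else:
            # Standard causal mask used by decoder-only LLMs: token i may
            # only attend to positions <= i (True = attention disallowed).
            attn_mask = torch.triu(
                torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1
            )
        h, _ = self.attn(x, x, x, attn_mask=attn_mask)
        return self.classifier(h)  # (batch, seq_len, num_labels)

x = torch.randn(2, 10, 64)
logits = TokenClassifierBlock(bidirectional=True)(x)
print(logits.shape)  # torch.Size([2, 10, 9])
```

In a full model the same idea is applied to every decoder layer of a pretrained LLM before fine-tuning on labeled NER data, so that each token's representation is conditioned on both left and right context.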