A phoneme-based approach for eliminating out-of-vocabulary problem of Turkish speech recognition using Hidden Markov Model


Yavuz E., TOPUZ V.

COMPUTER SYSTEMS SCIENCE AND ENGINEERING, vol.33, pp.429-445, 2018 (Journal Indexed in SCI)

  • Volume: 33 Issue: 6
  • Publication Date: 2018
  • Journal Name: COMPUTER SYSTEMS SCIENCE AND ENGINEERING
  • Page Numbers: pp.429-445

Abstract

Because Turkish is a morphologically productive language, it is almost impossible for a word-based recognition system to model the language completely. A word-based system has difficulty recognizing words it has not been trained on, so out-of-vocabulary words considerably reduce its recognition success rate. In this study, a speaker-dependent, phoneme-based word recognition system was designed and implemented for Turkish to overcome this problem. An algorithm for finding phoneme boundaries was devised in order to segment each word into its phonemes. After segmentation, the occurrences of each phoneme are divided into sub-groups according to their position in the word and their neighboring phonemes. Each sub-group is represented by a Hidden Markov Model, a statistical technique, with Mel-frequency cepstral coefficients as the feature vector. Because a phoneme-based approach is adopted, many out-of-vocabulary words could be recognized successfully.
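
The grouping of phonemes by position and neighbors can be illustrated with a minimal sketch. The abstract does not specify the actual grouping scheme, so the triphone-like labels below are an assumption: the function name `context_units`, the position names, and the `#` boundary marker are all hypothetical. The word is also assumed to be already segmented into phonemes.

```python
from typing import List, Tuple

# (left neighbor, phoneme, right neighbor, position in word); hypothetical label format
Unit = Tuple[str, str, str, str]

def context_units(phonemes: List[str]) -> List[Unit]:
    """Label each phoneme with its neighbors and its position in the word."""
    units: List[Unit] = []
    last = len(phonemes) - 1
    for i, p in enumerate(phonemes):
        # "#" marks a word boundary (an assumption, not taken from the paper)
        left = phonemes[i - 1] if i > 0 else "#"
        right = phonemes[i + 1] if i < last else "#"
        if i == 0:
            position = "initial"   # a one-phoneme word is also labeled "initial"
        elif i == last:
            position = "final"
        else:
            position = "medial"
        units.append((left, p, right, position))
    return units

# Example: the phoneme sequence of a short word, given as a list
for unit in context_units(["e", "v", "e"]):
    print(unit)
```

Under this assumption, each distinct label would correspond to one sub-group with its own Hidden Markov Model, and a word absent from the training vocabulary could still be scored provided all of its labels occurred in some training word.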