Multimodal emotion recognition based on peak frame selection from video

Zhalehpour, Sara; Akhtar, Zahid; Erdem, Çiğdem

doi:10.1007/s11760-015-0822-0

Zhalehpour S., Akhtar Z., Erdem Ç.

SIGNAL IMAGE AND VIDEO PROCESSING, vol.10, no.5, pp.827-834, 2016 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 10 Issue: 5
Publication Date: 2016
DOI Number: 10.1007/s11760-015-0822-0
Journal Name: SIGNAL IMAGE AND VIDEO PROCESSING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.827-834
Keywords: Affective computing, Facial expression recognition, Apex frame, Audio-visual emotion recognition, Fusion
Marmara University Affiliated: Yes

Abstract

We present a fully automatic multimodal emotion recognition system based on three novel peak frame selection approaches using the video channel. Selection of peak frames (i.e., apex frames) is an important preprocessing step for facial expression recognition as they contain the most relevant information for classification. Two of the three proposed peak frame selection methods (i.e., MAXDIST and DEND-CLUSTER) do not employ any training or prior learning. The third method proposed for peak frame selection (i.e., EIFS) is based on measuring the "distance" of the expressive face from the subspace of neutral facial expression, which requires a prior learning step to model the subspace of neutral face shapes. The audio and video modalities are fused at the decision level. The subject-independent audio-visual emotion recognition system has shown promising results on two databases in two different languages (eNTERFACE and BAUM-1a).
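
The abstract says only that the audio and video modalities are fused at the decision level; it does not name the combination rule. The following is a minimal sketch of decision-level fusion in general, assuming a weighted sum of per-class scores produced by two separate classifiers. The function name `fuse_decisions`, the parameter `video_weight`, and the emotion labels are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict


def fuse_decisions(audio_scores: Dict[str, float],
                   video_scores: Dict[str, float],
                   video_weight: float = 0.5) -> str:
    """Return the label with the highest weighted-sum score.

    Assumption: each modality's classifier outputs one score per
    emotion label (e.g. a posterior probability). The weighted sum
    rule is one common choice; the abstract does not specify which
    rule the authors used.
    """
    if not 0.0 <= video_weight <= 1.0:
        raise ValueError("video_weight must lie in [0, 1]")
    if audio_scores.keys() != video_scores.keys():
        raise ValueError("both modalities must score the same labels")

    # Combine the two modality scores label by label.
    fused = {
        label: (1.0 - video_weight) * audio_scores[label]
        + video_weight * video_scores[label]
        for label in audio_scores
    }
    # The predicted emotion is the label with the largest fused score.
    return max(fused, key=fused.get)
```

In this sketch, setting `video_weight` to 0 or 1 reduces the system to a single modality, which makes it easy to compare unimodal and fused decisions on the same scores.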