Angeliki Metallinou, Carlos Busso, Sungbok Lee, Shrikanth Narayanan: “Visual Emotion Recognition Using Compact Facial Representations and Viseme Information”

March 14, 2010 | Dallas, TX

Speakers: Angeliki Metallinou, Carlos Busso, Sungbok Lee, Shrikanth Narayanan
Host: 2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Emotion expression is an essential part of human interaction, and rich emotional information is conveyed through the human face. In this study, we analyze detailed motion-captured facial information from ten speakers of both genders during emotional speech. We derive compact facial representations using methods motivated by Principal Component Analysis and speaker face normalization. Moreover, we model emotional facial movements by conditioning on knowledge of speech-related movements (articulation). In speaker-independent experiments, we achieve average classification accuracies on the order of 75% for happiness, 50-60% for anger and sadness, and 35% for neutrality. We also find that dynamic modeling and the use of viseme information improve recognition accuracy for anger, happiness, and sadness, as well as overall unweighted performance.

Index Terms: Emotion recognition, Principal Component Analysis, Principal Feature Analysis, Fisher Criterion, visemes
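The core idea of a PCA-motivated compact facial representation can be sketched as follows. This is an illustrative example only, not the authors' exact pipeline: the data shapes, the `pca_reduce` helper, and the random marker trajectories are assumptions, but the mechanics (mean normalization followed by projection onto the top principal components) match the general technique the abstract names.

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X onto the top-k principal components.

    X: (n_frames, n_features) array of facial marker coordinates.
    Returns (scores, components), where scores is (n_frames, k).
    """
    # Mean normalization, analogous in spirit to speaker face normalization
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]                 # top-k principal axes
    scores = X_centered @ components.T  # compact k-dimensional representation
    return scores, components

# Hypothetical data: 200 frames of 30 marker coordinates
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
scores, comps = pca_reduce(X, 5)
```

The resulting low-dimensional scores can then serve as features for an emotion classifier in place of the raw high-dimensional marker trajectories.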