Selection of Emotionally Salient Audio-Visual Features for Modeling Human Evaluations of Synthetic Character Emotion Displays (bibtex)
by Mower, Emily, Mataric, Maja J. and Narayanan, Shrikanth
Abstract:
Computer simulated avatars and humanoid robots have an increasingly prominent place in today's world. Acceptance of these synthetic characters depends on their ability to properly and recognizably convey basic emotion states to a user population. This study presents an analysis of audio-visual features that can be used to predict user evaluations of synthetic character emotion displays. These features include prosodic, spectral, and semantic properties of audio signals in addition to FACS-inspired video features [11]. The goal of this paper is to identify the audio-visual features that explain the variance in the emotional evaluations of naïve listeners through the utilization of information gain feature selection in conjunction with support vector machines. These results suggest that there exists an emotionally salient subset of the audio-visual feature space. The features that contribute most to the explanation of evaluator variance are the prior knowledge audio statistics (e.g., average valence rating), the high energy band spectral components, and the quartile pitch range. This feature subset should be correctly modeled and implemented in the design of synthetic expressive displays to convey the desired emotions.
Reference:
Selection of Emotionally Salient Audio-Visual Features for Modeling Human Evaluations of Synthetic Character Emotion Displays (Mower, Emily, Mataric, Maja J. and Narayanan, Shrikanth), In Proceedings of the IEEE International Symposium on Multimedia, 2008.
Bibtex Entry:
@inproceedings{mower_selection_2008,
	address = {Berkeley, CA},
	title = {Selection of {Emotionally} {Salient} {Audio}-{Visual} {Features} for {Modeling} {Human} {Evaluations} of {Synthetic} {Character} {Emotion} {Displays}},
	url = {http://ict.usc.edu/pubs/Selection%20of%20Emotionally%20Salient%20Audio-Visual%20Features%20for%20Modeling%20Human%20Evaluations%20of%20Synthetic%20Character%20Emotion%20Displays.pdf},
	abstract = {Computer simulated avatars and humanoid robots have an increasingly prominent place in today's world. Acceptance of these synthetic characters depends on their ability to properly and recognizably convey basic emotion states to a user population. This study presents an analysis of audio-visual features that can be used to predict user evaluations of synthetic character emotion displays. These features include prosodic, spectral, and semantic properties of audio signals in addition to FACS-inspired video features [11]. The goal of this paper is to identify the audio-visual features that explain the variance in the emotional evaluations of na{\"i}ve listeners through the utilization of information gain feature selection in conjunction with support vector machines. These results suggest that there exists an emotionally salient subset of the audio-visual feature space. The features that contribute most to the explanation of evaluator variance are the prior knowledge audio statistics (e.g., average valence rating), the high energy band spectral components, and the quartile pitch range. This feature subset should be correctly modeled and implemented in the design of synthetic expressive displays to convey the desired emotions.},
	booktitle = {Proceedings of the {IEEE} {International} {Symposium} on {Multimedia}},
	author = {Mower, Emily and Mataric, Maja J. and Narayanan, Shrikanth},
	month = dec,
	year = {2008}
}