Speaker-Adaptive Multimodal Prediction Model for Listener Responses
by Iwan de Kok, Dirk Heylen and Louis-Philippe Morency
Abstract:
The goal of this paper is to acknowledge and model the variability in speaking styles in dyadic interactions and to build a predictive algorithm for listener responses that is able to adapt to these different styles. The end result of this research will be a virtual human able to automatically respond to a human speaker with proper listener responses (e.g., head nods). Our novel speaker-adaptive prediction model is created from a corpus of dyadic interactions in which speaker variability is analyzed to identify a subset of prototypical speaker styles. During a live interaction, our prediction model automatically identifies the closest prototypical speaker style and predicts listener responses based on this communicative style. Central to our approach is the idea of a "speaker profile," which uniquely identifies each speaker and enables the matching between prototypical speakers and new speakers. The paper demonstrates the merits of our speaker-adaptive listener response prediction model by showing improvement over a state-of-the-art approach that does not adapt to the speaker. Beyond the merits of speaker adaptation, our experiments highlight the importance of using multimodal features when comparing speakers to select the closest prototypical speaker style.
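As a rough illustration of the matching step described in the abstract, the sketch below shows one way a speaker profile and prototype matching could work. This is a minimal sketch, not the authors' implementation: the profile summary (per-feature mean and standard deviation), the Euclidean nearest-prototype rule, and all names (speaker_profile, closest_prototype, predictors) are assumptions made for illustration.

import numpy as np

def speaker_profile(multimodal_features: np.ndarray) -> np.ndarray:
    """Summarize a speaker's frame-level multimodal features (e.g., prosody,
    gaze, head motion) as a fixed-length profile vector (assumed: mean + std)."""
    return np.concatenate([multimodal_features.mean(axis=0),
                           multimodal_features.std(axis=0)])

def closest_prototype(profile: np.ndarray, prototype_profiles: np.ndarray) -> int:
    """Return the index of the prototypical speaker style whose profile is
    nearest (Euclidean distance) to the new speaker's profile."""
    distances = np.linalg.norm(prototype_profiles - profile, axis=1)
    return int(np.argmin(distances))

# Hypothetical usage: prototype_profiles and the per-style listener-response
# predictors would come from the corpus analysis described in the paper.
# style = closest_prototype(speaker_profile(live_features), prototype_profiles)
# response = predictors[style].predict(current_frame_features)  # e.g., head nod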
Reference:
Speaker-Adaptive Multimodal Prediction Model for Listener Responses (Iwan de Kok, Dirk Heylen and Louis-Philippe Morency), In Proceedings of the 15th ACM International Conference on Multimodal Interaction (ICMI 2013), ACM Press, 2013.
Bibtex Entry:
@inproceedings{de_kok_speaker-adaptive_2013,
	address = {Sydney, Australia},
	title = {Speaker-{Adaptive} {Multimodal} {Prediction} {Model} for {Listener} {Responses}},
	isbn = {978-1-4503-2129-7},
	url = {http://ict.usc.edu/pubs/Speaker-adaptive%20multimodal%20prediction%20model%20for%20listener%20responses.pdf},
	doi = {10.1145/2522848.2522866},
	abstract = {The goal of this paper is to acknowledge and model the variability in speaking styles in dyadic interactions and to build a predictive algorithm for listener responses that is able to adapt to these different styles. The end result of this research will be a virtual human able to automatically respond to a human speaker with proper listener responses (e.g., head nods). Our novel speaker-adaptive prediction model is created from a corpus of dyadic interactions in which speaker variability is analyzed to identify a subset of prototypical speaker styles. During a live interaction, our prediction model automatically identifies the closest prototypical speaker style and predicts listener responses based on this communicative style. Central to our approach is the idea of a "speaker profile," which uniquely identifies each speaker and enables the matching between prototypical speakers and new speakers. The paper demonstrates the merits of our speaker-adaptive listener response prediction model by showing improvement over a state-of-the-art approach that does not adapt to the speaker. Beyond the merits of speaker adaptation, our experiments highlight the importance of using multimodal features when comparing speakers to select the closest prototypical speaker style.},
	language = {en},
	booktitle = {Proceedings of the 15th {ACM} International Conference on Multimodal Interaction ({ICMI} 2013)},
	publisher = {ACM Press},
	author = {de Kok, Iwan and Heylen, Dirk and Morency, Louis-Philippe},
	month = dec,
	year = {2013},
	keywords = {Virtual Humans, UARC},
	pages = {51--58}
}