Louis-Philippe Morency: “Visual Feedback for Multimodal Interfaces”

February 9, 2007 | USC ICT

Speaker: Louis-Philippe Morency

When people interact with each other, it is common to see indications of acknowledgment given with a simple head gesture or explicit turn-taking with eye gaze shifts. People use visual feedback (visual information transferred during interaction) to communicate relevant information and to synchronize rhythm between participants. The recognition of visual feedback is a key component of human communication, and novel multimodal interfaces need to recognize and analyze these visual cues to facilitate more natural human-computer interaction.
In this talk, I will focus on two core technical challenges that must be addressed to achieve efficient and robust visual feedback recognition: the use of contextual information to anticipate visual feedback, and the use of latent state models for visual gesture recognition. To recognize visual feedback efficiently, people often use contextual knowledge from previous and current events to anticipate when feedback is most likely to occur. For example, at the end of a sentence the speaker will often look at the listener and anticipate a head nod gesture to ground understanding. I will present a context-based recognition framework that analyzes online contextual knowledge from the interactive system to anticipate visual feedback from the human participant.

Recognizing natural visual feedback from human users is a challenging problem: natural visual gestures are subtle, can differ considerably between individuals, and are context-driven. I will present new discriminative sequence models for visual gesture recognition that can model the substructure of a gesture sequence, learn the dynamics between gesture labels, and be applied directly to label unsegmented sequences. These models outperform previous approaches (i.e., SVMs, HMMs, and CRFs) for visual gesture recognition and can efficiently learn the contextual information necessary for context-based recognition.