University of Southern California

Spoken Language Processing

The Spoken Language Processing project develops software and speech-based prototypes to facilitate language processing between Virtual Human (VH) agents and humans, as well as in human-to-human interactions, such as language translation systems for soldiers. We are developing new algorithms for robust speech recognition; speaking-style modeling, notably of linguistic and expressive prosody in speech; speech assessment for language training applications; speaker recognition; emotion detection; and the analysis and modeling of the interplay between verbal and nonverbal aspects of human communication.

Since spoken language is central to human communication, a wide range of users can benefit from the technology we are developing. In training and simulation environments, the technology enables VH agents to understand human trainees' speech and emotional state. In distributed, collaborative communications, such as between soldiers using field radios, automatic speech transcription can help enhance the effectiveness of real-time communication and aid in after-action review. Multilingual translation capabilities can facilitate cross-cultural communication for medical, civil affairs, and other interactions with foreign language speakers. The ability to assess the quality of human speech and language provides new ways of facilitating automated language learning for soldiers. These technologies can also index and interact with vast amounts of data by automatically marking who is talking, what they are saying, and when the conversation occurred, a capability that is important for rapid analysis in after-action reviews.

This project differs from others in the following ways:

  • It offers an integrated, interdisciplinary perspective on speech processing: what is being said (message content), who is saying it (message source) and how it is being said (message style, emotions).
  • It combines basic and applied speech research in novel ways by incorporating knowledge of how humans produce and process speech into engineering system development.
  • It unifies the treatment of verbal and nonverbal communication.
  • It applies basic speech research to real problem scenarios and interaction data.
  • It focuses on the operational environments and communication scenarios unique to the military.
  • It contributes to human resource development by providing research training for graduate and undergraduate students.

Tags: human, language, speech, translation, virtual


The project's goals are to:

  • Develop speech acoustic models to enable robust automatic speech recognition (ASR)
  • Develop and implement algorithms to handle noisy environments and speaker variability, especially due to stress and emotions
  • Develop capability for rapid language modeling to handle new scenarios and to degrade gracefully in out-of-domain conditions
  • Devise and implement algorithms for identifying human emotions automatically from spoken language
  • Devise and implement algorithms to track who is talking and when they speak
  • Devise, implement, and integrate ASR capabilities into natural language processing, emotion, and dialogue models