University of Southern California

ISI Natural Language Understanding

The ISI Natural Language Understanding project builds natural language understanding (NLU) capabilities into Virtual Human (VH) agents. With these capabilities, VH agents parse and generate English. The agents "parse" English by converting a string of English words (e.g., a sentence) into symbols that represent its meaning. Similarly, they "generate" English by converting these symbols back into a grammatically correct sequence of English words. For each Virtual Human agent, the parser converts the output of the speech recognizer into the input of the system's internal dialogue manager, and the generator converts the output of the dialogue manager into the input of the speech synthesizer. In simplified form, the process runs: speech recognizer → parser → dialogue manager → generator → speech synthesizer.
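
The flow from recognized words to symbols and back to words can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Frame` structure, the labels such as `offer` and `accept`, and the keyword test inside `parse` are hypothetical and are not the project's actual representation or code.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Symbolic meaning passed between components (hypothetical format)."""
    act: str
    slots: dict = field(default_factory=dict)


def parse(words: str) -> Frame:
    """Parser: speech recognizer output (a word string) -> dialogue manager input."""
    tokens = words.lower().strip(".?!").split()
    if "move" in tokens and "clinic" in tokens:
        return Frame("offer", {"action": "move", "object": "clinic"})
    return Frame("unknown")


def dialogue_manager(frame: Frame) -> Frame:
    """Stand-in dialogue manager: choose a reply frame for the incoming frame."""
    if frame.act == "offer":
        return Frame("accept", dict(frame.slots))
    return Frame("request-repeat")


def generate(frame: Frame) -> str:
    """Generator: dialogue manager output -> speech synthesizer input (a word string)."""
    if frame.act == "accept":
        return f"Yes, please help us {frame.slots['action']} the {frame.slots['object']}."
    return "Could you say that again?"


def respond(recognized_text: str) -> str:
    """Run one turn: parse, decide, generate."""
    return generate(dialogue_manager(parse(recognized_text)))


print(respond("We can help you move your clinic"))
```

The speech recognizer and synthesizer are left out; the sketch starts from text and ends with text, which is the part of the pipeline the parser and generator cover.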

The parser and generator incorporate both statistical (learning) and rule-based (manual) processing. In our experiments, we explore ways to combine these methods to overcome the weaknesses of one with the strengths of the other. The project team builds various finite-state and statistically trained parsers for understanding, and template-based, phrase-expansion, and statistical generators for sentence creation.
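
One simple way to combine the two kinds of processing is to try hand-written rules first and fall back to a learned model when no rule matches. The Python sketch below shows that idea only; it is not necessarily how the project combines them. The rule patterns, the meaning labels, the four training sentences, and the word-overlap scorer are all invented for illustration.

```python
import re
from collections import Counter

# Hand-written rules (hypothetical): pattern -> meaning label.
RULES = [
    (re.compile(r"\bmove\b.*\bclinic\b"), "offer-move-clinic"),
    (re.compile(r"^(hello|hi)\b"), "greeting"),
]

# Tiny labeled set standing in for real training data (invented).
TRAINING = [
    ("we can assist with relocating the clinic", "offer-move-clinic"),
    ("we will support the clinic relocation", "offer-move-clinic"),
    ("good morning doctor", "greeting"),
    ("nice to meet you doctor", "greeting"),
]


def train(examples):
    """Count how often each word occurs under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def statistical_parse(text, counts):
    """Pick the label whose training words overlap most with the input."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def hybrid_parse(text, counts):
    """Try the precise rules first; fall back to the learned model."""
    lowered = text.lower()
    for pattern, label in RULES:
        if pattern.search(lowered):
            return label, "rule"
    return statistical_parse(lowered, counts), "statistical"


counts = train(TRAINING)
print(hybrid_parse("We can help you move your clinic", counts))
print(hybrid_parse("we can support relocating the clinic", counts))
```

In this arrangement the rules handle the phrasings their author anticipated, and the learned component covers wordings the rules miss, so fewer training examples are needed for the cases the rules already cover.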

This project differs from others in the following ways:

  • It includes both understanding and generation (most projects focus on one direction only)
  • It includes prosodic information for parsing long sentences (no other project we know of does this)
  • It explicitly combines statistical and rule-based components (most projects take one or the other approach exclusively)
  • It requires less training data than most projects of this kind, because of the rule-based methods employed



  • Understand long, complex sentences -- Prosodic information (the stress and intonation patterns of speech) makes it possible to divide long sentences into single-clause units before parsing them, which simplifies the parsing task and increases the amount of training data available for generation. Together with the Spoken Language Processing group, the project team is finalizing the definition of the data exchange format for prosodic information.
  • Build training materials quickly -- Instead of basing training data on what trainees say, the project team will build an interface through which model builders enter their basic language needs (words and phrases). This method will both improve modeling consistency and provide a source of core anchor phrases and their associated paraphrases. For example, the phrase "We can help you move your clinic" could be paraphrased as "We can assist you with the move of the clinic", "We can provide assistance with the clinic move", and "We can support your moving the clinic". To create paraphrases, the team will use a variety of methods, including ISI's Omega ontology, which contains verb structure frames, and WordNet, a large lexical database with synonym information.
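
The simplest form of the paraphrasing idea above, substituting synonyms into an anchor phrase, can be sketched in Python as follows. The `SYNONYMS` table is a small hand-written stand-in for a lexical resource such as WordNet; its entries and the `paraphrases` function are assumptions for illustration, not the project's method. Word substitution alone does not produce structural rewrites such as "We can provide assistance with the clinic move", which would need something like the verb structure frames mentioned above.

```python
from itertools import product

# Hypothetical synonym table standing in for a lexical database such as WordNet.
SYNONYMS = {
    "move": ["relocate", "transfer"],
    "clinic": ["medical center"],
}


def paraphrases(anchor: str) -> list[str]:
    """Return every variant of `anchor` with at least one word substituted."""
    words = anchor.split()
    # For each position: the original word plus any listed alternatives.
    options = [[w] + SYNONYMS.get(w.lower(), []) for w in words]
    variants = [" ".join(choice) for choice in product(*options)]
    return [v for v in variants if v != anchor]


for variant in paraphrases("We can help you move your clinic"):
    print(variant)
```

Each generated variant keeps the anchor phrase's meaning label, so one entry from a model builder can yield several training sentences.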