University of Southern California

Programmer/Research Assistant Intern 0933

January 08, 2009

Project Name

Automated Expansion of Domains for Question-Answering Characters


Project Description

ICT has developed a set of virtual humans that can robustly answer questions in a limited domain. With current techniques, people must construct these domains by hand, adding text answers along with sample questions that should lead to those answers, which makes expanding a domain slow. In this project, we will experiment with automatic expansion of the domain by “reading” news material or web information so the character can “learn” more things to talk about. The project will be evaluated with two types of tests of interactive dialogue between people and virtual humans: a fixed set of questions for which answers can be found in the new domain material, and unrestricted conversation, which may involve the new material to varying degrees.


Job Description

The intern will develop or modify information extraction and/or question answering software that can be applied to web-based material (e.g., today’s newspaper or information about a particular city) to extract and/or generate a set of textual questions and answers. The answers are text that a character might respond with, and the questions are a set of common questions that people might use to ask about this material and trigger the answer. The intern will also assist with the evaluation, including preparing the virtual humans and participating in the detailed experimental design.

Skill set:
- Knowledge of natural language engineering techniques including information retrieval, information extraction and/or question answering.
- Ability to write scripts to access web materials such as RSS feeds or common knowledge sources (e.g., Wikipedia).
- Strong programming skills in Java and/or scripting languages.

