Programmer/Research Assistant Intern 0932
January 08, 2009
Project Name
Comparative Evaluation of Speech Recognizers for Conversational Speech
Project Description
Broad-coverage commercial speech recognizers perform very well on dictated or read speech, but often perform poorly in limited, special-purpose domains. Specialized grammar-based approaches can achieve even higher performance for command-and-control tasks or small domains. Language-model techniques can give more robust behavior, but often require large amounts of in-domain material to build customized, high-performance language models. Conversational speech is one of the toughest challenges for all of these systems, because the speech is influenced by local context and by real-time cognitive load during production tasks. In this project, we will compare how well a number of available commercial and research platforms understand the speech of people engaged in conversational tasks with virtual human conversational partners.
Job Description
The intern will develop software infrastructure to streamline the use and evaluation of automatic speech recognition (ASR) software in ICT virtual human projects. The intern will be expected to implement wrappers for existing ASR components (such as Nuance and CMU Sphinx) and to develop scripts and programs that facilitate their rapid evaluation using existing data (sound files and corresponding transcripts) collected from ICT’s virtual human applications. Ideally, these ASR wrappers should implement an API suitable not only for evaluation, but also for eventual integration into ICT’s existing and future applications.
This internship will expose the intern to state-of-the-art applications of ASR technology and to the evaluation of competing natural language processing techniques, while providing ICT with a framework in which competing ASR components can be more easily compared and used opportunistically in specific virtual human applications.
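As a rough illustration of the wrapper idea described above, the following Python sketch shows one way a shared interface could let an evaluation script treat every recognizer the same way. All names here (`Recognizer`, `CannedRecognizer`, `collect_hypotheses`) are hypothetical and are not part of any ICT, Nuance, or CMU Sphinx API; a real wrapper would call the engine itself inside `transcribe`.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple


class Recognizer(ABC):
    """Hypothetical common interface that each ASR wrapper would implement."""

    name: str = "unnamed"

    @abstractmethod
    def transcribe(self, audio_path: str) -> str:
        """Return the recognizer's text hypothesis for one sound file."""


class CannedRecognizer(Recognizer):
    """Stand-in engine for illustration only: returns a fixed answer per file."""

    def __init__(self, name: str, answers: Dict[str, str]) -> None:
        self.name = name
        self._answers = answers

    def transcribe(self, audio_path: str) -> str:
        # An unknown file yields an empty hypothesis rather than an error.
        return self._answers.get(audio_path, "")


def collect_hypotheses(
    recognizers: Iterable[Recognizer],
    corpus: List[Tuple[str, str]],
) -> Dict[str, List[Tuple[str, str]]]:
    """Run every recognizer over (audio_path, reference_transcript) pairs.

    Returns a mapping from recognizer name to a list of
    (reference, hypothesis) pairs, ready to be scored.
    """
    results: Dict[str, List[Tuple[str, str]]] = {}
    for rec in recognizers:
        results[rec.name] = [(ref, rec.transcribe(path)) for path, ref in corpus]
    return results
```

Because the evaluation loop depends only on `transcribe`, adding another engine would mean writing one more subclass, and the same interface could later be reused when integrating a recognizer into an application.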
Skill set:
- Required skills:
  - A strong, well-developed ability to write scripts or programs that manipulate textual data (typically Perl scripts, shell scripts, etc.; other programming languages such as Java or C may be used if appropriate)
  - Ability to link existing software components together by writing scripts or program extensions to implement APIs
  - Some familiarity with at least one speech recognition software package
- Desirable skills:
  - Some experience with quantitative evaluation of software performance (using notions like precision, recall, F-score, word error rate, etc.)
  - Knowledge of n-gram language modeling
  - Basic knowledge of phonology
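
For reference, word error rate, one of the evaluation measures listed above, is commonly defined as the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The Python sketch below is a minimal version of that definition; the function name `word_error_rate` is illustrative, and it assumes plain whitespace tokenization with no text normalization (case, punctuation), which a real evaluation would need to settle first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "a x c" against the reference "a b c d" involves one substitution and one deletion over four reference words, giving a rate of 0.5. The rate can exceed 1.0 when the output contains many inserted words.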