Morency, L.P., de Kok, I., & Gratch, J.
10th International Conference on Multimodal Interfaces (ICMI 2008)
(Crete, Greece, October 20, 2008)
During face-to-face conversation, people use visual feedback such as head nods to communicate relevant information and to synchronize rhythm between participants. In this paper we describe how contextual information from other participants can be used to predict visual feedback and improve recognition of head gestures in human-human interactions. The main challenges addressed in this paper are optimal feature representation using an encoding dictionary and automatic selection of the optimal feature-encoding pairs. We evaluate our approach on a dataset involving 78 human participants. Using a discriminative approach to multi-modal integration, our context-based recognizer significantly improves head gesture recognition performance over a vision-only recognizer.
Fullerton, T., Morie, J., & Pearce, C.
Fibreculture Journal: Internet Theory, Criticism and Research, Issue 11,
(May 2008)
The techno-fetishism of computer game culture has led to a predominately male sensibility towards the construction of space in digital entertainment. Real-time strategy games conceive of space as a domain to be conquered; first-person shooters create labyrinthine battlefields in which space becomes a context for combat. Massively multiplayer games offer the opportunity for non-linear exploration, but emphasize linear achievement within a combat-based narrative. In this paper, we argue for a new gendered, regendered and perhaps degendered poetics of game space, rethinking ways in which space is conceptualized and represented as a domain for play. We argue for a more egalitarian virtual playground that acknowledges and embraces a wider range of spatial and cognitive models, referencing literature, philosophy, fine art and non-digital games for inspiration. Reflecting on a variety of sources, beginning with Virginia Woolf’s A Room of One’s Own and Bachelard’s Poetics of Space, feminist writings of Charlotte Perkins Gilman, Simone de Beauvoir, Hélène Cixous, Judith Butler, Janet Murray, and including contemporary game writers such as Lizbeth Klastrup, Mary Flanagan, Maia Engeli, and T.L. Taylor, we will argue for a new gendered poetics of game space, proposing an inclusionary approach that integrates feminine conceptions of space into the gaming landscape.
Hartholt, A., Russ, T., Traum, D., Hovy, E., & Robinson, S.
LREC 2008 - 6th Language Resources and Evaluation Conference
(Marrakech, Morocco, May 30, 2008)
When dealing with large, distributed systems that use state-of-the-art components, individual components are usually developed in parallel. As development continues, the decoupling invariably leads to a mismatch between how these components internally represent concepts and how they communicate these representations to other components: representations can get out of synch, contain localized errors, or become manageable only by a small group of experts for each module. In this paper, we describe the use of an ontology as part of a complex distributed virtual human architecture in order to enable better communication between modules while improving the overall flexibility needed to change or extend the system. We focus on the natural language understanding capabilities of this architecture and the relationship between language and concepts within the entire system in general and the ontology in particular.
Kenny, P., Parsons, T., Gratch, J., & Rizzo, A.
PETRA Conference proceedings published by ACM
(July 16, 2008)
There is a growing need for applications that can dynamically interact with aging populations to gather information, monitor their health care, provide information, or even act as companions. Virtual human agents or virtual characters offer a technology that can enable human users to overcome the confusing interfaces found in current human-computer interactions. These artificially intelligent virtual characters have speech recognition, natural language and vision that will allow human users to interact with their computers in a more natural way. Additionally, sensors may be used to monitor the environment for specific behaviors that can be fused into a virtual human system. As a result, the virtual human may respond to a patient or elderly person in a manner that will have a powerful effect on their living situation. This paper will describe the virtual human technology developed and some current applications that apply the technology to virtual patients for mental health diagnosis and clinician training. Additionally, the paper will discuss possible ways in which the virtual humans may be utilized for assisted health care and for the integration of multi-modal input to enhance the virtual human system.
Novielli, N., Carnevale, P., & Gratch, J.
LREC2008 Workshop on Corpora for Research on Emotion and Affect
(Marrakech, Morocco, May 27, 2008)
We propose an annotation scheme for a corpus of negotiation dialogs that was collected in the scope of a study about the effect of negotiation attitudes and time pressure on dialog patterns.
Bolas, M., Lange, B., Dallas, I., Huerta, A., & Rizzo, A.
Virtual Rehabilitation 2008
(Vancouver, Canada, August 25-27, 2008)
The aim of this project was to make breathing exercises for children with Cystic Fibrosis fun. We developed a prototype device that uses breathing to control specifically designed video games.
Sagae, K., & Tsujii, J.
The 22nd International Conference on Computational Linguistics (Coling 2008)
(Manchester, UK, August 20, 2008)
Most data-driven dependency parsing approaches assume that sentence structure is represented as trees. Although trees have several desirable properties from both computational and linguistic perspectives, the structure of linguistic phenomena that goes beyond shallow syntax often cannot be fully captured by tree representations. We present a parsing approach that is nearly as simple as current data-driven transition-based dependency parsing frameworks, but outputs directed acyclic graphs (DAGs). We demonstrate the benefits of DAG parsing in two experiments where its advantages over dependency tree parsing can be clearly observed: predicate-argument analysis of English and syntactic analysis of Danish with a representation that includes long-distance dependencies and anaphoric reference links.
Swanson, R., & Gordon, A.
First International Conference on Interactive Digital Storytelling
(Erfurt, Germany, November 26-29, 2008)
Interactive storytelling is an interesting cross-disciplinary area that has importance in research as well as entertainment. In this paper we explore a new area of interactive storytelling that blurs the line between traditional interactive fiction and collaborative writing. We present a system where the user and computer take turns in writing sentences of a fictional narrative. Sentences contributed by the computer are selected from a collection of millions of stories extracted from Internet weblogs. By leveraging the large amounts of personal narrative content available on the web, we show that even with a simple approach our system can produce compelling stories with our users.
Kenny, P., Parsons, T., Gratch, J., & Rizzo, A.
Intelligent Virtual Agents, Lecture Notes in Computer Science, 5208/2008, 394–408, Springer.
(August 25, 2008)
Recent research has established the potential for virtual characters to act as virtual standardized patients (VPs) for the assessment and training of novice clinicians. We hypothesize that the responses of a VP simulating Post Traumatic Stress Disorder (PTSD) in an adolescent female could elicit a number of diagnostic mental health specific questions (from novice clinicians) that are necessary for differential diagnosis of the condition. Composites were developed to reflect the relation between novice clinician questions and VP responses. The primary goal in this study was evaluative: can a VP generate responses that elicit user questions relevant for PTSD categorization? A secondary goal was to investigate the impact of psychological variables upon the resulting VP Question/Response composites and the overall believability of the system.
Gomboc, D., Lane, H.C., Core, M., Karnavat, A., Auerbach, D., and Rosenberg, M.
Proceedings of the Twenty-First International Florida Artificial Intelligence Research Society Conference (FLAIRS 2008)
(Coconut Grove, Florida, May 15, 2008)
Truly generic and reusable intelligent tutoring software architectures have remained elusive. As part of our effort to develop tutoring systems for simulations of ill-defined domains, a software framework has emerged with minimal dependencies on any domain-specific details, except for the data used to instantiate the framework for a particular application. Here, we describe this framework, its functionality, underlying representations, configurability, and use of natural language generation.
Wang, N., Marsella, S., & Hawkins, T.
The Seventh International Conference on Autonomous Agents and Multiagent Systems
(Estoril, Portugal, May 12-16, 2008)
To create realistic and expressive virtual humans, we need to develop better models of the processes and dynamics of human emotions and expressions. A first step in this effort is to develop means to systematically induce and capture realistic expressions in real humans. We conducted a series of studies on human emotions and facial expression using the Emotion Evoking Game (EVG) and a high-speed video camera. In this paper, we discuss a detailed analysis of facial expressions in response to a surprise situation. We provide details on the rich dynamics of facial expressions, along with data useful for animation of virtual humans. The analysis of the data also revealed considerable individual differences in whether surprise was evoked and how it was expressed.
van Velsen, M.
8th International Conference on Intelligent Virtual Agents
(Tokyo, Japan, September 1-3, 2008)
In this paper we present an authoring tool called Narratoria that allows non-technical experts in the field of digital entertainment to create interactive narratives with 3D graphics and multimedia. Narratoria allows experts in digital entertainment to participate in the generation of story-based military training applications. Users of the tools can create story-arcs, screenplays, pedagogical goals and AI models using a single software application. Using commercial game engines, which provide direct visual output in a real-time feedback-loop, users can view the final product as they edit.
Poesio, M., and Artstein, R.
Language Resources and Evaluation Conference (LREC 2008)
(Marrakech, Morocco, May 29, 2008)
Arrau is a new corpus annotated for anaphoric relations, with information about agreement and explicit representation of multiple antecedents for ambiguous anaphoric expressions and discourse antecedents for expressions which refer to abstract entities such as events, actions and plans. The corpus contains texts from different genres: task-oriented dialogues from the Trains-91 and Trains-93 corpus, narratives from the English Pear Stories corpus, newspaper articles from the Wall Street Journal portion of the Penn Treebank, and mixed text from the Gnome corpus.
Wang, N., and Johnson, W.L.
9th International Conference on Intelligent Tutoring Systems
(Montreal, Canada, June 23 - 27, 2008)
When applying Reeves and Nass’s Media Equation [22] to pedagogical agent research, we seek to focus on the manner in which a pedagogical agent communicates with learners. A previous study showed that pedagogical agents that offer feedback with appropriate politeness strategies can help students learn better [23]. Another study failed to replicate this Politeness Effect in a real classroom learning environment [18]. The work presented here investigated the Politeness Effect in a foreign language intelligent tutoring system. Results show that tutorial feedback with socially intelligent strategies can influence motivation and learning outcomes, depending upon the extent to which the learning environment allows for the possibility of affecting learner motivational factors.
Kang, S., Gratch, J., & Wang, N.
Intelligent Virtual Agent 2008
(Tokyo, Japan, September 2, 2008)
This study explored associations between the five-factor personality traits of human subjects and their feelings of rapport when they interacted with a virtual agent or real humans. The agent, the Rapport Agent, responded to real human speakers’ storytelling behavior using contingent but only nonverbal feedback. We further investigated how interactants’ personalities were related to the three components of rapport: positivity, attentiveness, and coordination. The results revealed that more agreeable people showed strong self-reported rapport and weak behavioral-measured rapport in the disfluency dimension with the Rapport Agent, while showing no significant associations between agreeableness and both people’s self-reported rapport and the disfluency dimension with real humans. The conclusions provide fundamental data to further develop the rapport theory that would contribute to evaluating and enhancing the interactional fidelity of an agent on the design of virtual humans for social skills training and therapy.
Gandhe, S., DeVault, D., Roque, A., Artstein, R., Leuski, A., Gerten, J., Traum, D., & Martinovski, B.
Interspeech 2008
(Brisbane, Australia, September 26, 2008)
We present a new approach for rapidly developing dialogue capabilities for virtual humans. Starting from domain specification, an integrated authoring interface automatically generates dialogue acts with all possible contents. These dialogue acts are linked to example utterances in order to provide training data for natural language understanding and generation. The virtual human dialogue system contains a dialogue manager following the information-state approach, using finite-state machines and SCXML to manage local coherence, as well as explicit modeling of emotions and compliance level and a grounding component based on evidence of understanding. Using the authoring tools, we design and implement a version of the virtual human Hassan and compare to previous architectures for the character.
Carre, D. & Levasseur, M.
Institute for Creative Technologies, ICT-TR-03-2008,
(Marina del Rey, CA, September 17, 2007 - December 6, 2007)
Rapport between people and virtual human agents is not limited to just speech. There are many non-verbal behaviors such as gestures or facial expressions that can express feelings or convey a message. One of the challenges in making an agent appear more realistic is to make his non-verbal behaviors appear more natural. To accomplish this, it is essential to find out how and when gestures are performed.
In order to determine how gestures are performed, it is necessary to assess the different appearances of the same gesture and how they map to their respective functions.
To determine when gestures are performed, the key is to find relevant contextual features and their links with gestures, which will lead to the prediction of the moment they should be performed.
Finally, both of these issues can now be tackled with the provided toolbox. Preliminary results show some gesture patterns. In addition, we were able, based on contextual features, to predict when the agent should nod his head. Early results appear to show that the agent nods at an opportune time. Moreover, this toolbox generalizes the results to kinds of gestures other than head nods, which is the goal of this study.
de Kok, I.
Institute for Creative Technologies, ICT-TR-02-2008,
(Marina del Rey, CA, June 20, 2008)
In this report I will document the work I have done during my internship at the Institute for Creative Technologies from 22 January to 25 April under supervision of Louis-Philippe Morency. During this time I have done research in the field of virtual humans, more specifically in the field of predicting and producing listener backchannels. But more on that later. I will start this report with a little background about the Institute for Creative Technologies and the project group which I was part of. After this the goal of my internship will be explained in Section 2. A general overview of our approach to achieving the goals set in Section 2 will be explained in Section 3. A more detailed description of the different steps taken will be given in Section 4. Following on that, the results of the conducted research will be presented in Section 5. Finally, a discussion of the work done, recommendations for improvement and future work will be given in Section 6.
Roque, A., & Traum, D.
9th SIGdial Workshop on Discourse and Dialogue
(Columbus, OH, June 19-20, 2008)
We introduce the Degrees of Grounding model, which defines the extent to which material being discussed in a dialogue has been grounded. This model has been developed and evaluated by a corpus analysis, and includes a set of types of evidence of understanding, a set of degrees of groundedness, a set of grounding criteria, and methods for identifying each of these. We describe how this model can be used for dialogue management.
Morency, L.P., de Kok, I., & Gratch, J.
Conference on Intelligent Virtual Agents (IVA 2008)
(Tokyo, Japan, September 1-3, 2008)
During face-to-face interactions, listeners use backchannel feedback such as head nods as a signal to the speaker that the communication is working and that they should continue speaking. Predicting these backchannel opportunities is an important milestone for building engaging and natural virtual humans. In this paper we show how sequential probabilistic models (e.g., Hidden Markov Model or Conditional Random Fields) can automatically learn from a database of human-to-human interactions to predict listener backchannels using the speaker multimodal output features (e.g., prosody, spoken words and eye gaze). The main challenges addressed in this paper are automatic selection of the relevant features and optimal feature representation for probabilistic models. For prediction of visual backchannel cues (i.e., head nods), our prediction model shows a statistically significant improvement over a previously published approach based on hand-crafted rules.
Morency, L.P., Whitehill, J., & Movellan, J.
8th International Conference on Automatic Face and Gesture Recognition (FG 2008)
(Amsterdam, The Netherlands, September 17-19, 2008)
Accurately estimating a person’s head position and orientation is an important task for a wide range of applications such as driver awareness and human-robot interaction. Over the past two decades, many approaches have been suggested to solve this problem, each with its own advantages and disadvantages. In this paper, we present a probabilistic framework called Generalized Adaptive View-based Appearance Model (GAVAM) which integrates the advantages from three of these approaches: (1) the automatic initialization and stability of static head pose estimation, (2) the relative precision and user-independence of differential registration, and (3) the robustness and bounded drift of keyframe tracking. In our experiments, we show how the GAVAM model can be used to estimate head position and orientation in real-time using a simple monocular camera. Our experiments on two previously published datasets show that the GAVAM framework can accurately track for a long period of time (>2 minutes) with an average accuracy of 3.5° and 0.75 in with an inertial sensor and a 3D magnetic sensor.
Sun, X., Morency, L.P., Okanohara, D., & Tsujii, J.
The 22nd International Conference on Computational Linguistics (COLING 2008)
(Manchester, United Kingdom, August 18, 2008)
Shallow parsing is one of many NLP tasks that can be reduced to a sequence labeling problem. In this paper we show that the latent-dynamics (i.e., hidden substructure of shallow phrases) constitutes a problem in shallow parsing, and we show that modeling this intermediate structure is useful. By analyzing the automatically learned hidden states, we show how the latent conditional model explicitly learns latent-dynamics. We propose in this paper the Best Label Path (BLP) inference algorithm, which is able to produce the most probable label sequence on latent conditional models. It outperforms two existing inference algorithms. With the BLP inference, the LDCRF model significantly outperforms CRF models on word features, and achieves performance comparable to the most successful shallow parsers on the CoNLL data when further using part-of-speech features.
Bulitko, V., Solomon, S., Gratch, J., & van Lent, M.
The 10th International Conference on the Simulation of Adaptive Behavior (SAB); Workshop on the role of emotion in adaptive behavior and cognitive robotics.
(Osaka, Japan, July 11, 2008)
Culture and emotions have a profound impact on human behavior. Consequently, high-fidelity simulated interactive environments (e.g., trainers and computer games) that involve virtual humans must model socio-cultural and emotional effects on agent behavior. In this paper we discuss two recently fielded systems that do so independently: Culturally Affected Behavior (CAB) and EMotion and Adaptation (EMA). We then propose a simple language that combines the two systems in a natural way, thereby enabling simultaneous simulation of culturally and emotionally affected behavior. The proposed language is based on matrix algebra and can be easily implemented on single- or multi-core hardware with a standard matrix package (e.g., MATLAB or a C++ library). We then show how to extend the combined culture and emotion model with an explicit representation of religion and personality profiles.
Parsons, T.D., & Rizzo, A.A.
CyberPsychology & Behavior, 11, 1, 16-24
(2008)
The current project is an initial attempt at validating the Virtual Reality Cognitive Performance Assessment Test (VRCPAT), a virtual environment–based measure of learning and memory. To examine convergent and discriminant validity, a multitrait–multimethod matrix was used in which we hypothesized that the VRCPAT’s total learning and memory scores would correlate with other neuropsychological measures involving learning and memory but not with measures involving potential confounds (i.e., executive functions; attention; processing speed; and verbal fluency). Using a sequential hierarchical strategy, each stage of test development did not proceed until specified criteria were met. The 15-minute VRCPAT battery and a 1.5-hour in-person neuropsychological assessment were conducted with a sample of 30 healthy adults, between the ages of 21 and 36, that included equivalent distributions of men and women from ethnically diverse populations. Results supported both convergent and discriminant validity. That is, findings suggest that the VRCPAT measures a capacity that is (a) consistent with that assessed by traditional paper-and-pencil measures involving learning and memory and (b) inconsistent with that assessed by traditional paper-and-pencil measures assessing neurocognitive domains traditionally assumed to be other than learning and memory. We conclude that the VRCPAT is a valid test that provides a unique opportunity to reliably and efficiently study memory function within an ecologically valid environment.
Parsons, T.D., & Rizzo, A.A.
Journal of Behavior Therapy and Experimental Psychiatry, 39, 250-261
(2008)
Virtual reality exposure therapy (VRET) is an increasingly common treatment for anxiety and specific phobias. Lacking is a quantitative meta-analysis that enhances understanding of the variability and clinical significance of anxiety reduction outcomes after VRET. Searches of electronic databases yielded 52 studies, and of these, 21 studies (300 subjects) met inclusion criteria. Although meta-analysis revealed large declines in anxiety symptoms following VRET, moderator analyses were limited due to inconsistent reporting in the VRET literature. This highlights the need for future research studies that report uniform and detailed information regarding presence, immersion, anxiety and/or phobia duration, and demographics.
Gordon, A.
International Conference on Knowledge Management, Special Track on Intelligent Assistance for Self-Directed and Organizational Learning.
(Graz, Austria, September 3-5, 2008)
The stories told among members of an organization are an effective instrument for knowledge socialization, the sharing of experiences through social mechanisms. However, the utility of stories for organizational learning is limited due to the difficulties in acquiring stories that are relevant to the practices of an organization, identifying the learning goals that these stories serve, and delivering these stories to the right people at the right time in a manner that best facilitates learning. In this paper we outline a vision for story-based organizational learning in the future, and describe three areas where intelligent technologies can be applied to automate story management practices in support of organizational learning. First, we describe automated story capture technologies that identify narratives of people’s experiences within the context of a larger discourse. Second, we describe automated retrieval technologies that identify stories that are relevant to specific educational needs. Third, we describe how stories can be transformed into effective story-based learning environments with minimal development costs.
Gordon, A., & Swanson, R.
International Conference on New Media Technology, Special Track on Knowledge Acquisition From the Social Web.
(Graz, Austria, September 3-5, 2008)
In this position paper we present a vision of how the stories that people tell in Internet weblogs can be used directly for automated commonsense reasoning, specifically to support the core envisionment functions of event prediction, explanation, and imagination.
Gordon, A., Havasi, C., Lux, M., & Strohmaier, M.
2008 International Conference on Intelligent User Interfaces
(Canary Islands, Spain, January 13-16, 2008)
We present an overview of the workshop on Common Sense Knowledge and Goal-Oriented Interfaces held at the 2008 Intelligent User Interfaces conference. Six papers were accepted from diverse research groups, each offering innovative new research on interfaces that incorporate common sense knowledge and that are oriented around the goals of their users.
Artstein, R., Gandhe, S., Leuski, A., & Traum, D.
ELRA Workshop on Evaluation
(Marrakech, Morocco, May 27, 2008)
We tested a life-size embodied question-answering character at a convention where he responded to questions from the audience. The character’s responses were then rated for coherence. The ratings, combined with speech transcripts, speech recognition results and the character’s responses, allowed us to identify where the character needs to improve, namely in speech recognition and providing off-topic responses.
Morie, J.
SPIE Electronic Imaging: The Engineering Reality of Virtual Reality 2008
(January 31, 2008)
The idea of Virtual Reality once conjured up visions of new territories to explore, and expectations of awaiting worlds of wonder. VR has matured to become a practical tool for therapy, medicine and commercial interests, yet artists, in particular, continue to expand the possibilities for the medium. Artistic virtual environments created over the past two decades probe the phenomenological nature of these virtual environments. When we inhabit a fully immersive virtual environment, we have entered into a new form of Being. Not only does our body continue to exist in the real, physical world, we are also embodied within the virtual by means of technology that translates our bodied actions into interactions with the virtual environment. Very few states in human existence allow this bifurcation of our Being, where we can exist simultaneously in two spaces at once, with the possible exception of meta-physical states such as shamanistic trance and out-of-body experiences. This paper discusses the nature of this simultaneous Being, how we enter the virtual space, what forms of persona we can don there, what forms of spaces we can inhabit, and what type of wondrous experiences we can both hope for and expect.
Parsons, T., Kenny, P., Ntuen, C., Pataki, C., Pato, M., Rizzo, A., St-George, C., & Sugar, J.
Medicine Meets Virtual Reality Conference
(Long Beach, CA, February 2008)
Effective interview skills are a core competency for psychiatry residents and developing psychotherapists. Although schools commonly make use of standardized patients to teach interview skills, the diversity of the scenarios standardized patients can characterize is limited by availability of human actors. Further, there is the economic concern related to the time and money needed to train standardized patients. Perhaps most damaging is the “standardization” of standardized patients: will they in fact consistently proffer psychometrically reliable and valid interactions with the training clinicians? Virtual Human Agent (VHA) technology has evolved to a point where researchers may begin developing mental health applications that make use of virtual reality patients. The work presented here is a preliminary attempt at what we believe to be a large application area. Herein we describe an ongoing study of our virtual patients (VP). We present an approach that allows novice mental health clinicians to conduct an interview with a virtual character that emulates an adolescent male with conduct disorder. This study illustrates the ways in which a variety of core research components developed at the University of Southern California facilitates the rapid development of mental health applications.
van Velsen, Martin.
The Florida Artificial Intelligence Research Society (FLAIRS)
(Key West, Florida, May 15, 2008)
In this paper we present an authoring tool called Narratoria that allows non-technical experts in the field of digital entertainment to create interactive narratives with 3D graphics and multimedia. Narratoria allows experts in digital entertainment to participate in the generation of story-based military training applications. Users of the tools can create story-arcs, screenplays, pedagogical goals and AI models using a single software application. Using game engines, which provide direct visual output in a real-time feedback-loop, users can view the final product as they edit.
Gordon, A., Hobbs, J. & Cox, M.
AAAI Workshop on Metareasoning: Thinking about thinking.
(Chicago, IL, July 13-14, 2008)
Representations of an AI agent’s mental states and processes are necessary to enable metareasoning, i.e., thinking about thinking. However, the formulation of suitable representations remains an outstanding AI research challenge, with no clear consensus on how to proceed. This paper outlines an approach involving the formulation of anthropomorphic self-models, where the representations that are used for metareasoning are based on formalizations of commonsense psychology. We describe two research activities that support this approach, the formalization of broad-coverage commonsense psychology theories and use of representations in the monitoring and control of object-level reasoning. We focus specifically on metareasoning about memory, but argue that anthropomorphic self-models support the development of integrated, reusable, broad-coverage representations for use in metareasoning systems.
Hobbs, J. & Gordon, A.
Workshop on Sentiment Analysis: Emotion, Metaphor, Ontology and Terminology (EMOT-08), 6th International Conference on Language Resources and Evaluation (LREC-08)
(Marrakech, Morocco, May 27, 2008)
We understand discourse so well because we know so much. If we are to have natural language understanding systems that are able to deal with texts with emotional content, we must encode knowledge of human emotions for use in the systems. In particular, we must equip the system with a formal version of people’s implicit theory of how emotions mediate between what they experience and what they do, and rules that link the theory with words and phrases in the emotional lexicon. The effort we describe here is part of a larger project in knowledge-based natural language understanding to construct a collection of abstract and concrete core formal theories of fundamental phenomena, geared to language, and to define or at least characterize the most common words in English in terms of these theories (Hobbs, 2008). One collection of theories we have put a considerable amount of work into is a commonsense theory of human cognition, or how people think they think (Hobbs and Gordon, 2005). A formal theory of emotions is an important piece of this. In this paper we describe this theory and our efforts to define a number of the most common words about emotions in terms of this and other theories. Vocabulary related to emotions has been studied extensively within the field of linguistics, with particular attention to cross-cultural differences (Athanasiadou and Tabakowska, 1998; Harkins and Wierzbicka, 2001; Wierzbicka, 1999). Within computational linguistics, there has been recent interest in creating large-scale text corpora where expressions of emotion and other private states are annotated (Wiebe et al., 2005). In Section 2 we describe Core WordNet and our categorization of it to determine the most frequent words about cognition and emotion. In Section 3 we describe an effort to flesh out the emotional lexicon by searching a large corpus for emotional terms, so we can have some assurance of high coverage in both the core theory and the lexical items linked to it. 
In Section 4 we sketch the principal facets of some of the core theories. In Section 5 we describe the theory of Emotion with several examples of words characterized in terms of the theories.
Kang, S., Gratch, J., Wang, N., & Watt, J.
7th International Conference on Autonomous Agents and Multiagent Systems
(Estoril, Portugal, May 2008)
We explored the association between users’ social anxiety and the interactional fidelity of an agent (also referred to as a virtual human), specifically addressing whether the contingency of agents’ nonverbal feedback affects the relationship between users’ social anxiety and their feelings of rapport, performance, or judgment on interaction partners. This subject was examined across four experimental conditions where participants interacted with three different types of agents and a real human. The three types of agents included the Non-Contingent Agent, the Responsive Agent (opposite to the Non-Contingent Agent), and the Mediated Agent (controlled by a real human). The results indicated that people having greater social anxiety would feel less rapport and show worse performance while feeling more embarrassment if they experience the untimely feedback of the Non-Contingent Agent. The results also showed people having more anxiety would trust real humans less as their interaction partners. We discuss the implication of this relationship between social anxiety in a human subject and the interactional fidelity of an agent on the design of virtual characters for social skills training and therapy.
Solomon, S., van Lent, M., Core, M., Carpenter, P., & Rosenberg, M.
Proceedings of the 17th Conference on Behavior Representation in Modeling and Simulation (BRIMS 2008),
(Providence, RI, April 2008)
Increasingly, the military has requirements for teaching cultural awareness, which demands flexible representations of cultural knowledge. The Culturally-Affected Behavior project seeks to define a language for encoding ethnographic data in order to capture cultural knowledge and use that knowledge to affect human behavior models. Having anthropologists encode ethnographic data will validate the language and will result in a library of culture models for immersive training.
Manshadi, M., Swanson, R., Gordon, A.
Twenty-first International Conference of the Florida AI Society, Applied Natural Language Processing track
(Coconut Grove, FL, May 15-17, 2008)
One of the central problems in building broad-coverage story understanding systems is generating expectations about event sequences, i.e. predicting what happens next given some arbitrary narrative context. In this paper, we describe how a large corpus of stories extracted from Internet weblogs was used to learn a probabilistic model of event sequences using statistical language modeling techniques. Our approach was to encode weblog stories as sequences of events, one per sentence in the story, where each event was represented as a pair of descriptive key words extracted from the sentence. We then applied statistical language modeling techniques to each of the event sequences in the corpus. We evaluated the utility of the resulting model for the tasks of narrative event ordering and event prediction.
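As a rough illustration of the idea in this abstract, the sketch below trains a bigram model over event sequences, where each event is a pair of key words, and uses it for event prediction and for scoring candidate orderings. It is a minimal sketch, not the authors' implementation: the function names, the add-alpha smoothing, and the toy stories are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Toy, made-up stories: each story is a list of events, one per sentence,
# and each event is a pair of descriptive key words (hypothetical data).
STORIES = [
    [("woke", "late"), ("missed", "bus"), ("called", "boss")],
    [("woke", "late"), ("missed", "bus"), ("took", "taxi")],
    [("woke", "late"), ("skipped", "breakfast"), ("missed", "bus")],
]


def train_bigram(stories):
    """Count how often each event is directly followed by each other event."""
    counts = defaultdict(Counter)
    for events in stories:
        for prev, nxt in zip(events, events[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, event):
    """Return the most frequent successor of `event`, or None if it was never seen."""
    followers = counts.get(event)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def sequence_score(counts, events, vocab_size, alpha=1.0):
    """Product of add-alpha smoothed bigram probabilities for one ordering."""
    score = 1.0
    for prev, nxt in zip(events, events[1:]):
        followers = counts.get(prev, Counter())
        total = sum(followers.values())
        score *= (followers[nxt] + alpha) / (total + alpha * vocab_size)
    return score


model = train_bigram(STORIES)
vocab = {event for story in STORIES for event in story}
print(predict_next(model, ("woke", "late")))
```

Event ordering can then be approached by scoring each candidate ordering with `sequence_score` and keeping the highest-scoring one.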
Gordon, A., Swanson, R.
International Conference on Weblogs and Social Media
(3/31/2008)
The phenomenal rise of Internet weblogging has created new opportunities for people to tell personal stories of their life experience, and the potential to share these stories with those who can most benefit from reading them. One barrier to this new mode of storytelling is the lack of accessibility; existing Internet search tools are not tailored to the unique characteristics of this textual genre. In this paper we describe our efforts to develop a search engine specifically for the stories that appear in Internet weblogs, called StoryUpgrade. This application utilizes statistical text classification technologies to separate story content from other text in weblog entries, and facilitates searches for stories that are related to particular activities of interest.
Swanson, R., Chew, E., Gordon, A.
AAAI Spring Symposium Series
(Stanford University, March 26-28, 2008)
Music and language are two human activities that fit well with a traditional notion of creativity and are particularly suited to computational exploration. In this paper we will argue for the necessity of syntactic processing in musical applications. Unsupervised methods offer uniquely interesting approaches to supporting creativity. We will demonstrate using the Constituent Context Model that syntactic structure of musical melodies can be learned automatically without annotated training data. Using a corpus built from the Well Tempered Clavier by Bach we describe a simple classification experiment that shows the relative quality of the induced parse trees for musical melodies.
Lane, H., Core, M., Gomboc, D., Karnavat, A., Rosenberg, M.
I/ITSEC
(Orlando, Florida, November 2007)
We describe some key issues involved in building an intelligent tutoring system for the ill-defined domain of interpersonal and intercultural skill acquisition. We discuss the consideration of mixed-result actions (actions with pros and cons), categories of actions (e.g. required steps vs. rules of thumb), the role of narrative, and reflective tutoring, among other topics. We present these ideas in the context of our work on an intelligent tutor for ELECT BiLAT, a game-based system to teach cultural awareness and negotiation skills for bilateral engagements. The tutor provides guidance in two forms: (1) as a coach that gives hints and feedback during an engagement with a virtual character, and (2) during an after-action review to help the learner reflect on their choices. Learner activities are mapped to learning objectives, which include whether the actions represent positive or negative evidence of learning. These underlie an expert model, student model, and models of coaching and reflective tutoring that support the learner. We describe several other cultural and interpersonal training systems that situate learners in goal based social contexts that include interaction with virtual characters and automated guidance. Finally, our future work includes evaluations of learning, expansion of the coach and reflective tutoring strategies, and integration of deeper knowledge-based resources that capture more nuanced cultural aspects of interaction.
Gordon, A., Cao, Q., Swanson, R.
Proceedings of the Fourth International Conference on Knowledge Capture
(Whistler, BC, October 28-31, 2007)
Among the most interesting ways that people share knowledge is through the telling of stories, i.e. first-person narratives about real life experiences. Millions of these stories appear in Internet weblogs, offering a potentially valuable resource for future knowledge management and training applications. In this paper we describe efforts to automatically capture stories from Internet weblogs by extracting them using statistical text classification techniques. We evaluate the precision and recall performance of competing approaches. We describe the large-scale application of story extraction technology to Internet weblogs, producing a corpus of stories with over a billion words.
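For readers unfamiliar with the evaluation measures named in this abstract, the following sketch computes precision and recall for a binary story/non-story classifier. The function name and the label lists are hypothetical and serve only to illustrate the two definitions; they are not data from the paper.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for boolean labels (True = 'story')."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Precision: fraction of segments labeled 'story' that really are stories.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: fraction of real stories that the classifier found.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up labels for six weblog segments.
predicted = [True, True, False, True, False, False]
actual = [True, False, False, True, True, False]
precision, recall = precision_recall(predicted, actual)
print(precision, recall)
```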
Peers, P., Tamura, N., Matusik, W., Debevec, P.
ACM Transactions on Graphics (Proceedings of SIGGRAPH 2007)
(San Diego, CA, August 2007)
We propose a novel post-production facial performance relighting system for human actors. Our system uses just a dataset of view-dependent facial appearances with a neutral expression, captured for a static subject using a Light Stage apparatus. For the actual performance, however, a potentially different actor is captured under known, but static, illumination. During post-production, the reflectance field of the reference dataset actor is transferred onto the dynamic performance, enabling image-based relighting of the entire sequence. Our approach makes post-production relighting more practical and could easily be incorporated in a traditional production pipeline since it does not require additional hardware during principal photography. Additionally, we show that our system is suitable for real-time post-production illumination editing.
Ma, W., Hawkins, T., Peers, P., Chabert, C., Weiss, M., Debevec, P.
Conference Proceeding
We estimate surface normal maps of an object from either its diffuse or specular reflectance using four spherical gradient illumination patterns. In contrast to traditional photometric stereo, the spherical patterns allow normals to be estimated simultaneously from any number of viewpoints. We present two polarized lighting techniques that allow the diffuse and specular normal maps of an object to be measured independently. For scattering materials, we show that the specular normal maps yield the best record of detailed surface shape while the diffuse normals deviate from the true surface normal due to subsurface scattering, and that this effect is dependent on wavelength. We show several applications of this acquisition technique. First, we capture normal maps of a facial performance simultaneously from several viewing positions using time-multiplexed illumination. Second, we show that high-resolution normal maps based on the specular component can be used with structured light 3D scanning to quickly acquire high-resolution facial surface geometry using off-the-shelf digital still cameras. Finally, we present a real-time shading model that uses independently estimated normal maps for the specular and diffuse color channels to reproduce some of the perceptually important effects of subsurface scattering.
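The abstract does not give the pattern definitions, so the sketch below makes two explicit assumptions: the three gradient patterns have the shifted form P_i(ω) = (ω_i + 1)/2 along the x, y and z axes, with a fourth constant pattern, and the pixel is an ideal Lambertian (diffuse) reflector. Under those assumptions the diffuse normal is proportional to (2L_x − L_c, 2L_y − L_c, 2L_z − L_c), where L_c is the intensity under the constant pattern. The function names are hypothetical and the simulated pixel is a closed-form stand-in for a real photograph.

```python
import math


def normal_from_gradients(lx, ly, lz, lc):
    """Estimate a unit normal from one pixel seen under three gradient
    patterns (lx, ly, lz) and one constant pattern (lc).

    Assumes gradient patterns of the form (w_i + 1) / 2 and diffuse reflectance.
    """
    v = (2.0 * lx - lc, 2.0 * ly - lc, 2.0 * lz - lc)
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("gradient images carry no directional information")
    return tuple(c / length for c in v)


def simulate_diffuse_pixel(normal, albedo=1.0):
    """Closed-form Lambertian response to the assumed patterns.

    Integrating max(n.w, 0) over the sphere gives pi, and integrating
    w_i * max(n.w, 0) gives (2 * pi / 3) * n_i.
    """
    lc = albedo * math.pi
    grads = tuple(0.5 * (albedo * (2.0 * math.pi / 3.0) * n + lc) for n in normal)
    return grads + (lc,)


true_normal = (0.6, 0.0, 0.8)
lx, ly, lz, lc = simulate_diffuse_pixel(true_normal, albedo=0.7)
print(normal_from_gradients(lx, ly, lz, lc))
```

Because the result is normalized, the unknown albedo cancels out, which is why a per-pixel ratio of this kind needs no separate reflectance estimate in this idealized setting.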
Paek, T., Gandhe, S., Chickering, D., Ju, Y.
Proceedings of ACL 2007 Speechgram Workshop: Grammar-based approaches to Spoken Language Processing
(Prague, June 2007)
In command and control (C&C) speech interaction, users interact by speaking commands or asking questions typically specified in a context-free grammar (CFG). Unfortunately, users often produce out-of-grammar (OOG) commands, which can result in misunderstanding or non-understanding. We explore a simple approach to handling OOG commands that involves generating a backoff grammar from any CFG using filler models, and utilizing that grammar for recognition whenever the CFG fails. Working within the memory footprint requirements of a mobile C&C product, applying the approach yielded a 35% relative reduction in semantic error rate for OOG commands. It also improved partial recognitions for enabling clarification dialogue.
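The two-pass control flow described here (try the grammar first, fall back to a backoff grammar when it fails) can be sketched at the text level. This is only an analogy under strong assumptions: the real system works inside a speech recognizer with filler models, whereas this sketch treats a "grammar" as a fixed list of phrases and a "filler" as a wildcard that absorbs at least one word. All names and phrases are hypothetical.

```python
# Hypothetical command phrases standing in for a context-free grammar.
GRAMMAR = ("show my calendar", "call voicemail", "what is my battery level")


def backoff_rules(phrase):
    """Yield backoff patterns that keep part of a phrase and replace the rest
    with a filler: a literal prefix then filler, or filler then a literal suffix."""
    tokens = phrase.split()
    for k in range(len(tokens) - 1, 0, -1):
        yield "prefix", tokens[:k]
        yield "suffix", tokens[-k:]


def recognize(utterance, grammar=GRAMMAR):
    """Return (command, status); the backoff rules are used only if the grammar fails."""
    words = utterance.lower().split()
    text = " ".join(words)
    if text in grammar:
        return text, "in-grammar"
    best = None  # (phrase, number of literal words matched)
    for phrase in grammar:
        for kind, literal in backoff_rules(phrase):
            k = len(literal)
            if len(words) <= k:
                continue  # the filler must absorb at least one word
            hit = words[:k] == literal if kind == "prefix" else words[-k:] == literal
            if hit and (best is None or k > best[1]):
                best = (phrase, k)
    if best is None:
        return None, "no-match"
    return best[0], "backoff"


print(recognize("please show my calendar"))
```

A "backoff" result is a partial recognition: an application could use it to ask a clarifying question rather than act on it directly.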
Jan, D., Traum, D.
Proceedings of ACL 2007 workshop on Embodied Language Processing
(Prague, Czech Republic, June 2007)
For embodied agents to engage in realistic multiparty conversation, they must stand in appropriate places with respect to other agents and the environment. When these factors change, for example when an agent joins a conversation, the agents must dynamically move to a new location and/or orientation to accommodate. This paper presents an algorithm for simulating the movement of agents based on observed human behavior using techniques developed for pedestrian movement in crowd simulations. We extend a previous group conversation simulation to include an agent motion algorithm. We examine several test cases and show how the simulation generates results that mirror real-life conversation settings.
Gordon, A., Swanson, R.
Proceedings of the 2007 meeting of the Association for Computational Linguistics (ACL-07)
(Prague, Czech Republic, June 2007)
Large corpora of parsed sentences with semantic role labels (e.g. PropBank) provide training data for use in the creation of high-performance automatic semantic role labeling systems. Despite the size of these corpora, individual verbs (or rolesets) often have only a handful of instances in these corpora, and only a fraction of English verbs have even a single annotation. In this paper, we describe an approach for dealing with this sparse data problem, enabling accurate semantic role labeling for novel verbs (rolesets) with only a single training example. Our approach involves the identification of syntactically similar verbs found in PropBank, the alignment of arguments in their corresponding rolesets, and the use of their corresponding annotations in PropBank as surrogate training data.
Lane, H.
Conference Proceeding
(Marina Del Rey, CA, July 2007)
We argue that metacognition is a critical component in the development of intercultural competence by highlighting the importance of supporting a learner’s self-assessment, self-monitoring, predictive, planning and reflection skills. We also survey several modern immersive cultural learning environments and discuss the role intelligent tutoring and experience management techniques can play to support these metacognitive demands. Techniques for adapting the behaviors of virtual humans to promote cultural learning are discussed, as well as explicit approaches to feedback. We conclude with several suggestions for future research, including the use of existing intercultural development metrics for evaluating learning in immersive environments and further studies of the use of implicit and explicit feedback to guide learning and establish optimal conditions for acquiring intercultural competence.
Tortell, R., Luigi, D., Dozois, A., Bouchard, S., Morie, J., Ilan, D.
Virtual Reality
(London, 2007)
Scent has been well documented as having significant effects on emotion (Alaoui-Ismaili in Physiol Behav 62(4):713-720, 1997; Herz et al. in Motiv Emot 28(4):363-383, 2004), learning (Smith et al. in Percept Mot Skills 74(2):339-343, 1992; Morgan in Percept Mot Skills 83(3)(2):1227-1234, 1996), memory (Herz in Am J Psychol 110(4):489-505, 1997) and task performance (Barker et al. in Percept Mot Skills 97(3)(1):1007-1010, 2003). This paper describes an experiment in which environmentally appropriate scent was presented as an additional sensory modality consistent with other aspects of a virtual environment called DarkCon. Subjects’ game play habits were recorded as an additional factor for analysis. Subjects were randomly assigned to receive scent during the VE, and/or afterward during a task of recall of the environment. It was hypothesized that scent presentation during the VE would significantly improve recall, and that subjects who were presented with scent during the recall task, in addition to experiencing the scented VE, would perform the best on the recall task. Skin-conductance was a significant predictor of recall, over and above experimental groups. Finally, it was hypothesized that subjects’ game play habits would affect both their behavior in and recall of the environment. Results are encouraging to the use of scent in virtual environments, and directions for future research are discussed.
Buckwalter, J.G., Geiger, A.M., Parson, T.D., Handler, J., Howes, M., & Lehmer, R.R.
The International Journal of Neuroscience, 117, 1579-1590
(2007)
Rendering for an Interactive 360 Degree Light Field Display
Jones, A., Bolas, M., McDowall, I., Yamada, H., & Debevec, P.
SIGGRAPH 2007
(San Diego, CA, August 2007)
We describe a set of rendering techniques for an autostereoscopic light field display able to present interactive 3D graphics to multiple simultaneous viewers 360 degrees around the display. The display consists of a high-speed video projector, a spinning mirror covered by a holographic diffuser, and FPGA circuitry to decode specially rendered DVI video signals. The display uses a standard programmable graphics card to render over 5,000 images per second of interactive 3D graphics, projecting 360-degree views with 1.25 degree separation up to 20 updates per second. We describe the system’s projection geometry and its calibration process, and we present a multiple-center-of-projection rendering technique for creating perspective-correct images from arbitrary viewpoints around the display. Our projection technique allows correct vertical perspective and parallax to be rendered for any height and distance when these parameters are known, and we demonstrate this effect with interactive raster graphics using a tracking system to measure the viewer’s height and distance. We further apply our projection technique to the display of photographed light fields with accurate horizontal and vertical parallax. We conclude with a discussion of the display’s visual accommodation performance and discuss techniques for displaying color imagery.
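The figures quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes, as a simplification, that every update requires one rendered image per view direction over a full 360 degrees; the function names are hypothetical.

```python
def views_per_revolution(separation_deg):
    """Number of distinct views needed to cover 360 degrees."""
    return 360.0 / separation_deg


def images_per_second(separation_deg, updates_per_second):
    """Rendered images per second if every update redraws every view."""
    return views_per_revolution(separation_deg) * updates_per_second


views = views_per_revolution(1.25)      # 288 views around the display
rate = images_per_second(1.25, 20)      # 5760 images per second at 20 updates/s
print(views, rate)
```

At the stated upper bound of 20 updates per second this gives 5,760 images per second, which is consistent with the abstract's "over 5,000 images per second".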