Role Play Dialogue Agents

Published: April 18, 2024
Category: Essays | News
David Traum

By Dr. David Traum, Research Professor of Computer Science, Director for Natural Language Research, ICT

Dr. David Traum is Director for Natural Language Research at ICT. In this essay he examines the linguistic interplay between humans and machines, drawing on historical antecedents such as the Turing Test, and tracing a line through intelligent assistants and open-domain conversational partners to the role-play dialogue agents currently in use in his own laboratory.

How should we talk with computers? This question predates any actual artificial intelligence applications, yet it had no definitive answer at the founding of ICT and still has none today, 25 years later. Some follow Alan Turing’s famous “test”: that we could attribute intelligence to machines only if they were able to talk as much like people as men and women talk like each other. Certainly there would be many benefits to “getting computers to talk like you and me”, especially in that we wouldn’t have to learn new ways of communicating in order to talk to or understand machines.

On the other hand, many have more recently pointed out ethical issues with computers pretending to be real people, or talking in a way that makes it difficult for people to know whether they are talking to a person or a machine. Those favoring a distinct machine register point to the differences between humans and (current and possibly future) machines and posit that, because computers are so different from people, we should also talk differently with them than we do with each other.

When we talk to each other, much of our communication depends on the activities we are engaged in and the roles we are playing in those activities. This includes the words we use, the frequencies of words, the tone of voice and accompanying non-verbal behaviors, but also the kinds of meanings and responses expected even for the same words uttered. For example, a question might get very different responses coming from a teammate in the midst of a collaborative task, from a stranger at the bus stop, from a classroom instructor, or from an attorney in a courtroom. While a simple correct answer might suffice in all cases, the situations differ in the acceptability of refusing to answer, giving an incorrect answer, pleading ignorance, or providing a helpful way of finding out the information. But what are the appropriate “roles” for the activities of human-computer dialogue? How are these similar to or different from human roles, such as the examples above, in related activities between humans?

Most computer dialogue systems fall into two broad categories: intelligent assistants or open-domain conversational partners. Assistants are meant to help a person achieve some task. They are constructed to interact with users, first determining the user’s intent with respect to the details of the task (e.g., providing information, providing instructions, booking a ticket, or setting an alarm), and then carrying out the task and reporting back. Thus the focus can be on the task itself, rather than the identities and relationship of the participants, similar to one-time service encounters at a fast food restaurant or customer help desk.
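The assistant pipeline described above can be sketched minimally: identify the user’s intent, then gather the remaining task details before acting. This is a hypothetical illustration of the general frame-and-slot approach, not any particular ICT or commercial system; all frame and slot names here are invented.

```python
# Minimal sketch of a frame-based assistant dialogue manager:
# each intent has a frame of required slots; the system asks for
# missing slots until the frame is complete, then executes the task.
# Frame and slot names are illustrative assumptions.

FRAMES = {
    "set_alarm": {"slots": ["time"]},
    "book_ticket": {"slots": ["destination", "date"]},
}

def next_action(intent, filled):
    """Return the system's next move: ask for a missing slot, or execute."""
    missing = [s for s in FRAMES[intent]["slots"] if s not in filled]
    if missing:
        return ("ask", missing[0])   # prompt the user for this slot
    return ("execute", intent)       # frame complete: carry out the task

# The assistant keeps asking until the frame is filled, then acts.
print(next_action("book_ticket", {"destination": "Boston"}))  # ('ask', 'date')
print(next_action("book_ticket", {"destination": "Boston", "date": "May 3"}))
```

In this style of system, dialogue management reduces to tracking which slots remain unfilled, which is why the focus stays on the task rather than on the identities of the participants.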

Open-domain systems are more inspired by the Turing test and require the computer to talk intelligently about any possible subject the user wants to bring up (though it is acceptable to say it doesn’t know much about a topic). There have been numerous competitions, such as the Loebner Prize and the Amazon Alexa Prize, that have assessed efforts in this area. Often the goal has shifted from trying to be indistinguishable from humans to capturing the interest of conversational participants, so that they continue the dialogue for a long time or rate it as highly enjoyable. Such systems don’t really have a task (other than engaging in conversation), and frequent topics of conversation include aspects of identity, group affiliation, preferences, and feelings. It is thus important to establish what the identity of the machine participant actually is: is it pretending to be human? Is it a completely alien, un-human-like machine identity? Does it have some commonalities with human identity, or does it reject the concept of machine identity altogether? The answers to these questions will impact how people choose to interact with the machine and how they feel about the interaction.

A third type of system is the role-play dialogue agent, much less common in the general dialogue systems community than assistants and open-domain chatbots, but central to ICT’s purpose of building engaging training and learning environments. These systems play a role in an activity that would typically be performed by a person. Machines that can play these roles allow a user to practice social activities that would normally require multiple people, even when few or no other people are available, as conceived in Star Trek’s “Holodeck”. These systems enable different kinds of experiential learning and practice, to sustain and improve skills, by providing role-appropriate circumstances to react to. Role-play dialogue agents are similar to assistant dialogue systems in that they are concerned with one or more tasks; however, the system’s role is not always strictly subordinate to the user, as an assistant’s would be. Thus understanding the user’s intent is not always enough for the system to carry out a task; sometimes the system must also decide whether to agree to the user’s request. This may require a further process of negotiation, argumentation, or explanation on the part of the user, just as it would with a person in that activity. Sometimes dialogue agents play the role of opponents who desire not to help the user but to hinder them, and sometimes part of the activity is for the user to determine where the agent’s loyalties lie, and perhaps how to align them. Role-play agents are also similar to open-domain dialogue systems in that they should respond (in character) to whatever a user says, and their (role-play) identity is often a topic of conversation or structures how they should interact. The difference is the focus on a specific activity and the goals of the agent. It is also clearer that it is acceptable for these systems to act “in character” as a human, much as an actor takes on an identity different from their own, for educational or entertainment purposes.
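The contrast drawn above, that recognizing the user’s intent is not enough and the agent must also decide whether to comply, can be sketched as a small decision step. This is a hypothetical illustration under invented names (the stance score and goal structure are assumptions, not a description of ICT’s actual agents):

```python
# Sketch of a role-play agent's compliance decision: unlike an assistant,
# the agent weighs a recognized request against its own role goals and its
# current stance toward the user before agreeing. All names are illustrative.

def respond(request, stance, role_goals):
    """Decide in character: comply, negotiate, or refuse.

    stance: 0.0-1.0, how favorably the agent currently views the user.
    role_goals: dict; "opposed" lists requests the role actively resists.
    """
    if request in role_goals.get("opposed", []):
        return "refuse"       # an opponent may actively hinder the user
    if stance < 0.5:
        return "negotiate"    # the user must first persuade or explain
    return "comply"           # only now does the agent act like an assistant

print(respond("share_intel", stance=0.2, role_goals={"opposed": []}))  # negotiate
```

The point of the sketch is that the user’s side of the dialogue (argumentation, explanation, building rapport to raise the agent’s stance) becomes part of the trained skill, rather than an obstacle to be engineered away.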

Role-play dialogue systems have been central to ICT from the beginning of the institute, starting with our flagship interdisciplinary “Mission Rehearsal Exercise” project, in which a human user/trainee led a platoon of role-players, alongside a potentially angry crowd and others both on and off the trainee’s team.

I was recruited to ICT in the year 2000, initially by Jeff Rickel from ISI, who realized that the kind of agents he wanted to create for this project needed an ability to reason and talk about tasks and situations beyond just filling in “slots” in a frame, which was then the state of the art. I had previously worked on task-oriented dialogue systems, though in some of them looking slightly beyond the assistant role and considering other kinds of collaboration. It was exciting to come west and work in an environment that considered not just different roles for dialogue agents, but also visual embodiment, immersive display in a virtual-reality theater, and the roles of emotion, pedagogy, and cognitive architecture in a unified system, one that also represented a collaboration with creative professionals from the entertainment industry in support of military doctrine and training. When confronted with problems of what a dialogue agent should do in a particular situation, it was an amazing experience to be able to go beyond introspection and individual research to talking directly with some of the world’s experts from a variety of points of view: military subject matter experts, Hollywood writers and other creative professionals, and scientists studying linguistics, communication, psychology, pedagogy, graphics, and sound. This collaboration helped us focus on new notions of dialogue “success” – not (just) accuracy numbers, but the impact of the experience.

Our natural language dialogue group has contributed to many ICT projects involving role-play dialogue agents, spanning a wide variety of roles and activities: teammates; several scenarios within SASO involving non-team interaction; authoring tools and many scenarios (some written by interns and West Point Cadets) for Tactical Questioning training; virtual patients; interviewers and interviewees; game partners and opponents; wild-west gunslingers and bartenders; counselors; language teachers and practice partners; and many others. It has been amazing to see many of these projects leave the lab and show up in museums, at Army bases, and elsewhere, where real users interact with the characters because they want to, not because anyone is paying them to.

So, how should we talk to computers? Ideally, any way we want to, and they will talk to us in ways appropriate to the specific circumstances, such as playing a role when that is what we want, while keeping ethical issues at the forefront, so that role play can be easily distinguished from reality where it matters.



David Traum is the Director for Natural Language Research at the Institute for Creative Technologies (ICT) and Research Professor in the Department of Computer Science at the University of Southern California (USC). He leads the Natural Language Dialogue Group at ICT. Traum’s research focuses on dialogue communication between human and artificial agents. He has engaged in theoretical, implementational, and empirical approaches to the problem, studying human-human natural language and multi-modal dialogue, as well as building a number of dialogue systems to communicate with human users. Traum has authored over 300 refereed technical articles, is a founding editor of the journal Dialogue and Discourse, has chaired and served on many conference program committees, and is a past President of SIGDIAL, the international special interest group on discourse and dialogue. Traum earned his Ph.D. in Computer Science at the University of Rochester in 1994.