By Dr. William Swartout, Chief Science Officer, USC Institute for Creative Technologies; co-Director, Center for Generative AI and Society; research professor, USC Viterbi School of Engineering
Dr. William (Bill) Swartout has been involved in cutting-edge research and development of artificial intelligence systems throughout his career. He is an AAAI Fellow and received the Robert Engelmore Award from the Association for the Advancement of Artificial Intelligence (AAAI) for seminal contributions to knowledge-based systems and explanation, groundbreaking research on virtual human technologies and their applications, and outstanding service to the artificial intelligence community. In this essay for ICT’s 25th anniversary, Dr. Swartout looks back at his career in AI and brings us right up to the present day on ICT’s AI endeavors.
Introduction
When I was in grade school, I read Danny Dunn and the Homework Machine, about a young boy whose mother is the housekeeper for a prestigious professor. The book was prescient, as science fiction often is, because the professor had access to a very high-end computer, and young Danny and his friends read their textbooks into the machine, only to discover it could do their homework for them!
Of course they get caught (in the end), but the book anticipated the current revolution in generative AI by about 60 years – and absolutely hooked me on the concept of artificial intelligence. The idea that a computer could intelligently produce natural language text that people could read, or that you could have a conversation with a computer was, and remains, fascinating to me, and led me to pursue artificial intelligence throughout my career.
In high school, I joined a computer club, and my friends and I worked on a program to generate modern poetry. The program worked by selecting words randomly from lists and plugging them into a template that indicated what part of speech went where, which assured that the output was at least somewhat grammatical. The results can most charitably be described as “very modern.”
Much of the output didn’t make sense, but there was the occasional phrase or stanza that seemed understandable and presented things in a novel way. I learned two things from that: computers can be really good at generating unanticipated things, and people can perceive meaning in something even when none is intended.
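To give a flavor of how that template-based generator worked, here is a minimal sketch in Python. The word lists and template are purely illustrative (the club’s originals are long gone), but the mechanism is the same: pick random words of the right part of speech and drop them into slots.

```python
import random

# Illustrative word lists; the club's actual vocabulary is lost to history.
WORDS = {
    "adjective": ["silent", "electric", "crimson", "hollow"],
    "noun": ["river", "machine", "moon", "question"],
    "verb": ["devours", "remembers", "unfolds", "ignores"],
}

# The template names the part of speech for each slot, which keeps
# the output at least somewhat grammatical.
TEMPLATE = ["the", "adjective", "noun", "verb", "the", "adjective", "noun"]

def generate_line(template=TEMPLATE):
    """Fill each slot with a randomly chosen word of the required part of speech."""
    return " ".join(
        random.choice(WORDS[slot]) if slot in WORDS else slot
        for slot in template
    )

if __name__ == "__main__":
    for _ in range(4):  # a four-line, "very modern" stanza
        print(generate_line())
```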
“Serious AI” at Stanford
As an undergrad at Stanford, I got my first exposure to “serious AI” working at the Institute for Mathematical Studies in the Social Sciences, with Avron Barr, Marian Beard, and Richard Atkinson on BIP, a computer-aided instructional program to teach students how to program in BASIC.
I worked on two modules in that system, first, a solution checker that determined whether or not a student’s program was working correctly, and second, FLOW, a module that dynamically animated the execution of a student’s program on a video terminal, showing them the control flow and how variables changed as the program executed – a not insignificant accomplishment given the capabilities of the terminals at that time.
At that time, I also worked at the Stanford AI Lab with Cordell Green, who was a professor in computer science, and David Shaw, a CS graduate student, on a program to synthesize LISP functions from example input-output pairs. Given an example input (A B C D) and an output (D D C C B B A A) the program would create a LISP function to reverse and double a list for arbitrary lists. This work resulted in my first professional publication at the 1975 International Joint Conference on Artificial Intelligence.
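The synthesized LISP itself is long gone, but the target behavior is easy to state. Here, as a sketch in Python rather than LISP, is the kind of function the system had to induce from a single input-output pair:

```python
def reverse_and_double(lst):
    """Reverse a list and repeat each element twice, e.g.
    ['A', 'B', 'C', 'D'] -> ['D', 'D', 'C', 'C', 'B', 'B', 'A', 'A']."""
    result = []
    for item in reversed(lst):
        result.extend([item, item])
    return result

# The single example pair the synthesizer worked from:
assert reverse_and_double(list("ABCD")) == list("DDCCBBAA")
```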
Explainable AI at MIT
Later, as a graduate student at MIT, while working in the Clinical Decision Making Group headed by my advisor, Peter Szolovits, I returned to my interest in having computers generate texts that people could read, on a project to assist doctors in diagnosing patients and prescribing drugs correctly.
One of the graduate students, Howie Silverman, had written a program called the “Digitalis Therapy Advisor” that could administer the heart drug digitalis to patients at the level of an expert cardiologist. However, the program lacked an ability to explain how it produced its recommendations, and we had found that without an explanatory capability physicians were unlikely to accept a program’s advice.
For my master’s thesis, my goal was to give the program an explanatory capability, and, to make things more challenging, the explanations should be produced not by just printing out documentation associated with the program, but by translating the program code itself into written English. If the code changed, the explanations would change as well.
To accomplish this, I reprogrammed the Digitalis Therapy Advisor into OWL, a programming language developed at MIT by William Martin and his colleagues. In OWL, the names of procedures and variables weren’t just strings of text, as they are in most programming languages; instead, they were structured semantic objects that could be translated into English. Functions were typically named by a structured object corresponding to a verb phrase that said what the function did, and variables were typically named by noun phrases that described them.
This approach gave me the additional information needed for explanations, and the Digitalis Therapy Advisor became one of the first expert systems that could explain, in English, the steps it took to arrive at a particular therapy recommendation.
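I can’t reproduce OWL here, but the core idea, naming procedures and variables with structured semantic objects rather than opaque strings, can be sketched. The classes and phrasing below are my own illustration, not OWL syntax:

```python
from dataclasses import dataclass

@dataclass
class NounPhrase:
    head: str
    modifiers: tuple = ()

    def to_english(self):
        return " ".join(("the",) + self.modifiers + (self.head,))

@dataclass
class VerbPhrase:
    verb: str
    obj: NounPhrase

    def to_english(self):
        return f"{self.verb} {self.obj.to_english()}"

# A procedure "named" by a verb phrase can describe itself in English,
# so explanations track the code instead of separate documentation.
check_sensitivities = VerbPhrase(
    verb="check",
    obj=NounPhrase(head="sensitivities", modifiers=("patient's",)),
)

print(check_sensitivities.to_english())  # -> check the patient's sensitivities
```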
For my PhD, I took a critical look at the work I had done for my master’s thesis, and realized that while it could explain what the program did, it couldn’t explain why it did it.
For example, the digitalis advisor might have a rule such as:
If the patient’s serum calcium is greater than 10, reduce the dose of digitalis by 0.75.
A rule like that would perform correctly, but there’s no hint in the explanation about why that is the right thing to do. In this case, the causal justification is that increased serum calcium increases the risk of arrhythmias, which can also occur with digitalis administration. From an explanation standpoint, we couldn’t give justifications because the knowledge needed to support them was in the head of the person who programmed the system; it didn’t have to be in the code for the program to perform correctly.
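Rendered as code, the gap is easy to see. A rule like the one above can be written in a few lines that perform correctly, yet nothing in them records the causal chain that justifies the reduction (this is my own schematic rendering, not the original code, and I’m reading “reduce by 0.75” as scaling the dose):

```python
def adjust_for_serum_calcium(dose, serum_calcium):
    # Performs correctly, but carries no record of *why* the reduction
    # is appropriate: the link between elevated calcium, arrhythmia risk,
    # and digitalis lived only in the programmer's head.
    if serum_calcium > 10:
        return dose * 0.75  # reading "reduce by 0.75" as scaling the dose
    return dose
```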
To capture that knowledge so it could be used in explanations, I began by representing causal knowledge in the domain, such as what digitalis causes and how it interacts with other physiological parameters. I then represented abstractly the general problem-solving strategies the program used, such as:
If a drug causes something dangerous that is also caused by a possible patient finding, check for that finding and if it is present, reduce the dose.
I then built an automatic programmer to create the expert system using the general domain knowledge and problem-solving strategies.
Along the way, the automatic programmer recorded the reasoning it did and left behind a trace. At runtime, these “mental breadcrumbs” allowed the explanation routines to access the reasoning underlying the running code, so that when asked to justify why the program was asking about the patient’s serum potassium, this is an actual justification it could produce:
The system is anticipating digitalis toxicity. Decreased serum potassium causes increased automaticity, which may cause a change to ventricular fibrillation. Increased digitalis also causes increased automaticity. Thus, if the system observes decreased serum potassium, it reduces the dose of digitalis due to decreased serum potassium.
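A minimal sketch of the idea follows, with invented data structures: the automatic programmer matches an abstract strategy against causal domain knowledge, and the breadcrumb it leaves behind is exactly what the explanation routines later play back as a justification.

```python
# Causal domain knowledge the automatic programmer reasons with.
CAUSES = {
    "decreased serum potassium": ["increased automaticity"],
    "increased digitalis": ["increased automaticity"],
    "increased automaticity": ["ventricular fibrillation"],
}

# Abstract strategy: if a drug effect and a possible patient finding both
# cause the same danger, check for the finding and reduce the dose if present.
def apply_strategy(drug_effect, finding, trace):
    shared = set(CAUSES[drug_effect]) & set(CAUSES[finding])
    if not shared:
        return None
    danger = shared.pop()
    # The "mental breadcrumb": record why this check was generated.
    trace.append(
        f"Check for {finding}: both {finding} and {drug_effect} cause {danger}, "
        f"which may lead to {CAUSES[danger][0]}."
    )
    return f"if {finding} is observed, reduce the dose"

trace = []
rule = apply_strategy("increased digitalis", "decreased serum potassium", trace)
print(rule)      # the generated piece of the expert system
print(trace[0])  # the justification an explanation routine can later play back
```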
Natural Language Generation at USC Information Sciences Institute
In 1981, I graduated from MIT and decided that six years of Boston-in-winter had been more than enough. I was eager to return to the west coast. I was excited when I received a job offer from the USC Information Sciences Institute, from Bob Balzer and Bill Mark, because it gave me an opportunity to continue to work on automatic programming and explanation.
In the Explainable Expert Systems (EES) project, working with Bob Neches, Cecile Paris, Yolanda Gil and my graduate student, Johanna Moore, we extended the work I had started in my PhD thesis and began to explore how we could improve the automated explanations by providing clarifying explanations if the first one wasn’t understood.
At the time, a lot of the work in natural language generation had focused on better user models. The idea was that if one knew what the user knew (and didn’t know) you could plan out the perfect explanation that addressed their needs without telling them things they already knew.
At ISI, we felt that the user modeling approach was not likely to work out. Part of the problem was that acquiring a good model of the user was very difficult. The other problem was that this approach ignored a valuable source of information, namely feedback from the user as the explanation is being given.
For human-to-human interaction, this feedback is much more important than a model of the other person. People feel reasonably comfortable talking with someone they don’t know well (have no model of), but rapidly become uncomfortable if the person is a “great stone face,” i.e., someone who doesn’t react, verbally or non-verbally, while the other person is talking.
To be able to justify what an expert system was doing, we captured the design of that system. Our idea for supporting feedback and clarifying explanations was to capture the design of the explanation as well, that is, to record what the communicative goals of the explanation were and what strategies were used to convey that information. If a user didn’t understand an explanation, the system looked back in the text plan, saw what communicative goal the misunderstood explanation was trying to achieve, and looked for alternative strategies for conveying the same information.
For example, if an abstract description of some principle wasn’t understood, the explanation system might recover by providing an example of that principle in use.
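A sketch of that recovery loop, with an invented plan structure loosely in the spirit of what we built:

```python
# Each communicative goal knows the alternative strategies that can achieve it.
STRATEGIES = {
    "explain-principle": [
        ("abstract-description", "State the principle in general terms."),
        ("give-example", "Show the principle applied to a concrete case."),
        ("compare-to-known", "Relate the principle to something the user already knows."),
    ],
}

def next_strategy(goal, attempts_so_far):
    """Return the next untried strategy for a communicative goal, or None."""
    options = STRATEGIES[goal]
    if attempts_so_far < len(options):
        return options[attempts_so_far]
    return None

# The first attempt isn't understood, so the planner looks back at the text
# plan, sees which goal the misunderstood text served, and tries an example.
print(next_strategy("explain-principle", attempts_so_far=0))  # abstract description
print(next_strategy("explain-principle", attempts_so_far=1))  # give an example instead
```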
In 1989, I became the Director of the Intelligent Systems Division at ISI, and, although I remained active in research, my major focus during this time became building up the AI capabilities at ISI both in size and stature. As a result, the division grew from about 25 people in 1989 to around 90 when I left a decade later.
Also, while I was running the division, five of our researchers were elected as AAAI Fellows, and an additional five were elected later. I am very proud of what we accomplished during that period of my career.
The UARC Concept
In the late 1990s, computer games started to become quite good. Characters in games were no longer highly pixelated, and began to look a lot more like real people. The terrain also evolved to be more realistic, and character behaviors improved to the point where substantially more sophisticated scenarios could be supported.
The Department of Defense noted that, in some ways, video games were becoming more engaging and realistic than the simulations that the military used, and asked whether something could be gained from a closer connection between the military and the entertainment industry.
In 1996, a workshop was held by the National Research Council to examine that question. Representatives from the military, the entertainment industry and academia all attended. That group wrote a report that affirmed the potential value of a closer collaboration.
The next question was how to make all this happen.
Early efforts to fund Hollywood studios directly to work on problems of interest to the military produced some interesting results, but it was felt that to really see the value, something more enduring was needed.
The DoD Director of Defense Research and Engineering, Anita Jones, and the Army’s Chief Scientist, Mike Andrews, decided to form a University Affiliated Research Center: an institute within a university where people from the very different cultures of the military, entertainment and academia could meet, build trust and, vitally, collaborate on projects.
USC was one of the universities that was considered as a possible site for this new institute. Paul Rosenbloom, one of my colleagues at ISI, was given the task of coordinating USC’s response and writing the proposal. I worked with him on a section of the proposal that outlined how AI-based virtual humans could be used in a simulation to teach soldiers how to confront the kinds of dilemmas they might encounter during deployment. The proposal was submitted, and USC was selected.
A few days later, Paul talked with me about the next steps. He said that rather than joining the new institute, he preferred to stay at ISI (for the time being), but wondered if I might be interested in joining it. I jumped at the chance. In high school, I had been very interested in film-making, and even considered it as a possible career choice for a while, so I saw this new institute as a chance to bring together my interests in film, the creative arts, and artificial intelligence – fields that had once seemed completely separate, but perhaps now were coming together.
On August 18, 1999, USC hosted an event to commemorate the signing of the contract for the new Institute for Creative Technologies. The celebration was notably grand, reflecting that this was the largest contract ever awarded to USC at the time. The president of USC, Stephen Sample; the Secretary of the Army, Louis Caldera; and Jack Valenti, president of the Motion Picture Association of America, all spoke. Even Gray Davis, governor of California, delivered a speech via video link.
But after the celebration was over, a lot of work remained to make ICT a reality, including recruiting staff, leasing a building, installing high-end equipment for computing, displays, and audio, developing research tracks with strong outcomes that would convince the Army that their money was being well spent, and working towards an official launch.
ICT Official Opening
It was decided that Tuesday, September 26, 2000 would be our official opening. On reflection, it was almost too ambitious an undertaking, but the date was set.
We had a small but highly motivated and capable team right at the beginning. Richard Lindheim, formerly EVP of Television at Paramount, was hired to be our first Executive Director. Jim Korris, an experienced Hollywood writer/producer with several major credits to his name, became our Creative Director; Cheryl Birch came from Paramount to be our CFO; and I rounded out the senior leadership team as Director of Technology, responsible for the Institute’s basic and applied research projects.
We began to build out the research staff. Paul Debevec, a rising star in computer graphics, joined us from UC Berkeley along with Tim Hawkins, Christopher Tchou, and Jonathan Cohen, forming the basis of our Graphics Lab. Randy Hill (now ICT’s Executive Director) and Jon Gratch, who emerged as a leader in affective computing, joined us from ISI. Jacki Ford Morie, former Head of Computer Technical and Artistic Training at Disney, came onboard as a consultant before becoming a Senior Research Scientist, and David Traum joined to head our Natural Language group.
Because it takes a while to build out a research staff, we began our research projects by collaborating with researchers who were then at ISI. For artificial intelligence, this meant a number of people from ISI’s Intelligent Systems Division, including Lewis Johnson, Stacy Marsella, Jeff Rickel, Marcus Thiebaux and Richard Whitney.
Mission Rehearsal Exercise
Shortly after the signing ceremony, the Army told us that they wanted us to build the Holodeck from Star Trek, i.e. a highly immersive simulation space that could provide 3D environments, including sights and sounds, where one could interact with virtual humans that were autonomous, AI-enabled, computer-generated characters that looked and behaved like real people as much as possible.
In response, we created our first showcase project, the Mission Rehearsal Exercise (MRE), a leap-forward prototype that showed what could be done using the very latest technology (in the year 2000) with sound, vision, 3D immersion, graphics and virtual humans.
The television writer Larry Tuch (Quincy M.E.) created the scenario for MRE, which was notably different from almost all military simulations in that no weapons are used. Instead, the trainee, a young lieutenant, is confronted with a challenging dilemma. The simulation is set in an urban landscape in Bosnia. A lieutenant and his platoon are heading to reinforce another unit, which is dealing with local unrest.
The simulation opens with the lieutenant arriving at the scene of an accident. One of his Humvees has collided with a civilian car. There is a small child, seriously injured on the ground, and a frantic mother. A crowd starts to form.
What should the lieutenant do? Continue on with the mission, or stop and render aid?
In this scenario, the townspeople, the soldiers, the mother and her child were all virtual humans. From an AI standpoint the most sophisticated virtual human was the platoon sergeant. The trainee lieutenant interacted with the sergeant in English, and the sergeant carried out his commands and also provided advice and recommendations to the lieutenant, much as happens in real life.
To create a highly immersive simulation, we envisioned that it would be presented on a large, curved screen. We had seen such a system at the I/ITSEC conference in November 1999 and wanted to duplicate it at the ICT. As it was realized, the trainee stood in front of a curved screen, 8.75 feet tall and 31.3 feet wide, illuminated by three BARCO projectors that were edge-blended so that the image appeared seamless. An SGI Onyx Reality Monster provided the computational resources to drive the experience. Immersive audio was provided by a 10.2 audio system custom developed by Prof. Chris Kyriakakis of USC’s Electrical and Computer Engineering – Systems department.
All of the systems were either custom-built or very early in their life cycle, and thus not exactly tried and tested. For example, although the projectors were commercial products, the ones delivered to us had very low serial numbers – as I recall, we got serial numbers 2, 3, and 4. To compound all that, the building was being built out at the same time that the equipment was being delivered. When the computer was delivered, we had to begin using it right away since it was critical to developing the software for MRE, but the computer room hadn’t been built yet. We wound up putting it in a temporary room with a temporary air conditioner that had air ducts snaking all around. Everyone developed their software on long tables clustered around the computer.
Our entertainment connection did help us out with a few things, however. Due to his connections with Hollywood, Richard Lindheim was able to engage Herman Zimmermann, the set designer for Star Trek, to design the interior space for the ICT, and to have it fabricated by Paramount set constructors. This significantly sped up the construction process. About a month or so before the big opening, a group from the Army showed up to make sure things were going to plan. When they toured the under-construction facility, they were aghast. There was wallboard here and there, uncovered ducting and bare floors.
“They’ll never be ready in time,” they said.
What they didn’t realize was that the entire interior had already been constructed and was sitting, in modular form, in a warehouse at Paramount. One day, several large trucks pulled up, unloaded the interior modules, and in short order the interior was complete.
In building the software to support the Mission Rehearsal Exercise simulation, we ran into a number of issues. Due to the limited time available, there were a few things we left out of the initial demo and added the next year. This included natural language understanding, for which we relied instead on an operator to recognize what was said.
Additionally, to take advantage of the latest capabilities available, we decided it made sense to integrate several commercial solutions into the AI software that we were developing. We used a package called Vega to render the environment and special effects, while PeopleShop from Boston Dynamics was used to create and animate the virtual human bodies. We used another package, VirtualFriend, from another vendor to give the PeopleShop characters expressive faces. All of this integration came with its own cost, and about a week before the grand opening, we still had not been able to run all the way through our demo without having the system crash at some point.
We eventually traced a major part of the problem to memory leaks, and Marcus Thiebaux ported our code back to ISI where they had some tools for finding memory leaks. Working with the commercial vendors, Marcus pulled an all-nighter getting out the leaks. Finally, on the Saturday before the Tuesday opening, we got the system to run through the scenario in its entirety. I went home and slept very soundly.
However, the next morning I was driving into the Institute and I got a call from Jon Gratch.
“Bill,” he said, “there was a power glitch last night and some of the systems didn’t come back up properly.”
When I got to the office, things got worse. Ben Moore, who was our main integrating programmer, called in from the hospital saying he had a severe sprain that needed to be taken care of. Just when it seemed things couldn’t get much worse, nineteenth-century technology conspired against us and the elevators went down. Ben came in on crutches in the early afternoon. Eventually we got most everything to come back up, but the fiber-optic controller that connected the Reality Monster computer on an upper floor with the displays on the ground floor was asking for a password that we didn’t know. The installers had installed it without telling us what the password was and, thinking their job was done, had flown back to the Midwest! Getting in touch with the installers on a Sunday afternoon in the Midwest proved to be a challenge, but we eventually got the password and the system was operational!
On Tuesday, at the grand opening, people were queued up to see the Mission Rehearsal Exercise. The experience was impressive. The wraparound screen enveloped and immersed the viewers, the 10.2 sound system not only surrounded the viewer with sound but gave a vertical dimension as well, so that when a helicopter flew overhead during the demo, it sounded like it was churning its way through the ceiling tiles. And the virtual humans showed where AI technology could go and support training in innovative ways.
We gave the demo dozens of times, and as I recall there was only one crash. We cut this one very close, but we did bring it off with much acclaim. The MRE system went on to win a first place award for a software prototype at the International Conference on Autonomous Agents in 2001, and ICT won the Defense Modeling and Simulation Office/National Training Systems Association Modeling and Simulation Outstanding Achievement Award, largely due to the success of MRE.
Increasing Sophistication of Virtual Humans
We continued to construct these large scale simulations for several more years, emphasizing increasingly sophisticated interactions. In SASO-ST, a trainee had to negotiate with a humanitarian aid doctor about moving the location of his clinic. The character could use several different negotiation strategies in dealing with the request, such as trying to avoid the negotiation, viewing it as a win-lose situation and trying to win, or looking for a potential win-win approach. The approach taken depended on a number of factors including how the doctor viewed the trainee, the trainee’s actions and how they framed the request, as well as the overall context.
SASO-EN continued the negotiation theme, but expanded the context so that the trainee had to interact with both the doctor and a town elder about the location of the clinic. If the trainee was skillful, they might use the town elder to help convince the doctor; if not, the doctor and the elder might gang up on the trainee. In this system the negotiations could also be influenced by external events, such as an increasing level of unrest in the vicinity.
There were several insights that I gathered from building virtual humans in training scenarios. First, a virtual human is perhaps the ultimate test of an integrated AI capability. That is because, to be realistic and believable, the virtual human must support so many of the aspects we expect in a real person. Unlike the Turing test, where one interacts with an AI over a teletype, interaction with a virtual human, just like interaction with a real human, involves much more.
There is verbal interaction, but equally important there is non-verbal communication: the gestures, facial expressions, posture and tone of voice all convey a lot of information that people are very facile with. Emotions are very much part of being human, but emotion modeling has not been a big concern of traditional AI. Yet, once a virtual human appears on a screen, people will ascribe emotions to it, whether they are intended or not. Thus we were required to build executable models of emotions for our virtual humans. Jon Gratch and Stacy Marsella did groundbreaking research in building emotion models and their resulting EMA emotion modeling framework is a classic.
Another realization was the value of story. Story not only engages and immerses the user, but it also makes building virtual humans feasible. Story creates a strong context for people, and that context limits what they are likely to say. When the lieutenant trainee was in the MRE scenario in Bosnia dealing with the injured child, he was very focused, which not only limited what he would talk about, but also limited what responses the AI needed to support. At the time, if we had tried to put virtual humans into the world in a general sense, it would have been too hard – there would have been too many situations to deal with. By embedding virtual humans in the context of a story, implementation became much more feasible.
Stories can have another impact. A good story can create a rich social environment that raises new research issues. Most work in natural language processing had assumed that the interaction was one-on-one between a computer and a person, and that it was a cooperative conversation. In some of our negotiation scenarios, a trainee would interact with multiple virtual humans simultaneously, which raises issues of how to understand when a conversation with one virtual human has stopped and another starts. A cooperative conversation assumes that people will stay on topic. But one of the negotiation strategies is to avoid the negotiation by changing the topic. A cooperative conversation also assumes shared goals, which may not be the case at all in a negotiation.
Virtual Humans Move to the Museum
Initially, we thought that virtual humans would primarily act as role players in training simulations, but as we gained experience in developing and using them, we realized that they could be used in other ways as well. Virtual humans could act as coaches or mentors, and they could answer questions about themselves or other topics.
Our first question-answering character was Sgt. Blackwell. He was a character with a bit of an attitude who could answer questions about himself, the ICT, and a few other topics. His responses were fixed and recorded in advance. Users would ask questions in English; we used speech recognition to get the words in the question, which were then sent to a classifier that had been trained on a set of question-answer pairs.
The classifier would find the best response to the question or indicate that none was available. That response was then played for the user. If no response was found, the system would give what we called an “off topic” response, which was either a request for the user to rephrase or re-ask their question, or a suggestion to move to a different topic. Sgt. Blackwell was installed in the Cooper-Hewitt National Design Museum in New York from December 2006 to July 2007 as part of the National Design Triennial.
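The pipeline can be sketched simply, though the real classifier was considerably more sophisticated than the toy word-overlap scorer below (the question-answer pairs and threshold here are invented for illustration):

```python
import random

# Toy stand-in for a classifier trained on question-answer pairs.
QA_PAIRS = [
    ("what is your name", "I'm Sergeant Blackwell."),
    ("where are we", "You're at the USC Institute for Creative Technologies."),
    ("how do you work", "Speech recognition, a classifier, and a set of recorded lines."),
]

OFF_TOPIC = [
    "Could you rephrase that?",
    "Let's talk about something else.",
]

def respond(recognized_words, threshold=0.4):
    """Pick the answer whose training question best overlaps the recognized words,
    or fall back to an off-topic response if nothing scores well enough."""
    words = set(recognized_words.lower().split())
    best_score, best_answer = 0.0, None
    for question, answer in QA_PAIRS:
        q_words = set(question.split())
        score = len(words & q_words) / len(q_words)
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else random.choice(OFF_TOPIC)

print(respond("what is your name soldier"))   # matched response
print(respond("tell me about the weather"))   # off-topic fallback
```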
We built several interactive question-answering systems like Blackwell. With NSF funding we created a pair of virtual museum guides for the Boston Museum of Science. Ada and Grace were identical twins (named after computer science pioneers, Ada Lovelace and Admiral Grace Murray Hopper) that inhabited the computer area of the museum.
They had a significantly more extensive range of responses than Blackwell and could answer questions about what visitors could see in the museum, their own background, computers, and the technology that made them work. A companion exhibit dynamically showed visitors the science behind how the characters worked. The exhibit predated Apple’s purchase and roll-out of Siri (originally developed at SRI International), so for visitors, seeing a system that could understand what they said and respond reasonably was quite impressive. We estimated that approximately 250,000 visitors encountered Ada and Grace during their time in the museum.
New Dimensions in Testimony
New Dimensions in Testimony is perhaps one of the best uses of the question-answering technology, and one of the systems I’m proudest of. It arose from a collaboration with the USC Shoah Foundation. The Foundation was started by Steven Spielberg and over the years has recorded many interviews with Holocaust survivors. A little over ten years ago, people from the Foundation came to visit the ICT. They said that for many visitors to a Holocaust museum, it was the opportunity to interact with a real Holocaust survivor that was most meaningful to them: the ability to ask them questions about what happened, how they and their families were affected, and the effect the Holocaust had on their attitudes toward life, religion, and the people who had oppressed them. The problem was that all of the survivors were getting a lot older, which meant we would be the last generation able to have that sort of first-hand experience.
The question on the minds of the Shoah Foundation personnel was whether there could be some way to use technology to preserve the ability of future generations to have an interactive conversation with a Holocaust survivor.
To answer that question, we brought in about a dozen Holocaust survivors and videotaped them with multiple high-resolution cameras as they answered a wide-ranging set of questions about their lives before, during and after the war, and how their experiences had shaped them and affected their attitudes. For some of the survivors we recorded as many as 1,500 responses. We then used our question-answering technology to create a system that could respond to visitors’ questions by playing back the video responses that had been recorded. We called the system New Dimensions in Testimony (NDT).
We found that a couple of things worked together to make the NDT experience especially compelling. First, because we recorded as many responses as we did, as long as a visitor stayed on topic, asking questions about the survivor’s experiences before, during and after the war, and their attitudes, it was very likely that we had an appropriate answer in our response database. That meant that users rarely got an off-topic response, which preserved the feeling of having a conversation with a person. While people might initially have been impressed by the AI driving the experience, after a few interactions the AI essentially disappeared because it worked so well, allowing visitors to focus on the substance of the responses. Second, many of the survivors were excellent storytellers with compelling stories to tell.
These two elements worked together to make NDT the most moving experience I’ve ever worked on and the only one that would regularly move people to tears.
Personal Assistant for Lifelong Learning (PAL3)
In 2015, I returned to one of my earliest interests: using computers in education. We began working on PAL3, a personal assistant for lifelong learning. The long-term goal was to create an agent that could accompany a learner throughout their career. The system would know the learner’s background, such as what courses they had studied, how they did, and what their mastery was of various topics. It would also know what their goals were and what was needed to achieve them, and it would have access to a set of learning resources that it could recommend to the learner adaptively, based on their background, performance, and learning goals.

The initial version of PAL3 ran on a Microsoft Surface and was designed to teach sailors about basic electronics. Although almost all training in the military is mandatory, we wanted to make PAL3 so engaging that sailors would use it voluntarily. To achieve that, we used what we had learned from working with the entertainment industry. PAL3 had an engaging animated character, a high-tech talking drone, that interacted with learners, encouraging them when they reached achievements and acting as a guide to the content. Points were given for using the system, and these could be traded in for various enhancements, such as changing the paint scheme on one’s drone. We adopted an open learner model, so learners could see at any time how they were doing, what they had achieved, and how they compared to the class average.
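A toy sketch of how an open learner model can drive adaptive recommendations (the topics, resources, and data structures are invented for illustration, not PAL3’s actual design):

```python
from dataclasses import dataclass, field

@dataclass
class LearnerModel:
    # An "open" learner model: the learner can inspect these values at any time.
    mastery: dict = field(default_factory=dict)      # topic -> estimated mastery, 0.0..1.0
    goal_topics: list = field(default_factory=list)  # topics needed for the learner's goals
    points: int = 0                                  # engagement points

RESOURCES = {
    "ohms-law": "Ohm's Law refresher video",
    "capacitors": "Capacitor basics lesson",
}

def recommend(learner: LearnerModel) -> str:
    """Suggest the resource for the goal topic with the lowest current mastery."""
    weakest = min(learner.goal_topics, key=lambda t: learner.mastery.get(t, 0.0))
    return RESOURCES[weakest]

sailor = LearnerModel(
    mastery={"ohms-law": 0.8, "capacitors": 0.3},
    goal_topics=["ohms-law", "capacitors"],
)
print(recommend(sailor))   # -> Capacitor basics lesson
sailor.points += 10        # points could be traded in, e.g. for a new drone paint scheme
```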
The Navy was interested in reducing the knowledge decay that occurred when sailors graduated from one school but then had to wait for extended periods before entering the next school. In a controlled study, we showed that PAL3 could eliminate, in aggregate, that knowledge decay, even though it was used on a voluntary basis. Later, we ported PAL3 to run on smartphones, which made it even more available to learners, and we used it in new and widely varied content areas, such as leadership training and even suicide prevention training. We are now using it to teach AI skills as a critical part of the new AI Research Center of Excellence for Education, described next.
AI Research Center of Excellence for Education (AIRCOEE)
The recent revolution in AI has already had a huge impact on society and promises even more. AI systems are much more robust and capable than before. Knowledge of AI and machine learning is now widely seen as key to staying ahead, both for individuals and for the nation relative to its near-peer competitors. Yet the vast majority of people in the United States have at best only a passing knowledge of AI. Furthermore, rapid technological advancements will make many jobs obsolete, and although new jobs will be created, people lack the requisite skills for the jobs of the future. Compounding these problems, AI technologies are advancing at an accelerating rate, outpacing the ability of current educational institutions to keep up.
Perceiving that need, we proposed a new AI Research Center of Excellence for Education. The Center was funded in the summer of 2023, and its major goal is to use AI to teach AI, that is, to develop AI-based educational tools and create AI content that is accessible. Our goal with the Center is to increase AI literacy broadly, so that while people educated by the program might not be ready to become proficient programmers, they will have a solid understanding of AI: what it is, what it can be used for, where its use may not be appropriate, and what ethical considerations need to be taken into account when it is used.
USC Center for Generative AI and Society
OpenAI’s ChatGPT was released on November 30, 2022. Now, about 18 months later, it is clear that generative AI systems like ChatGPT are having a profound disruptive effect on education.
The “homework machine” I’d read about in my youth seemed to be a reality, and students had at their disposal a set of systems that could complete writing assignments, generate artwork, and even solve programming problems.
People became alarmed about the effects generative AI would have on society. In March 2023, USC established the Center for Generative AI and Society with the goal of better understanding the effects of generative AI and how it can be used for good. Alongside my role at ICT, I also serve as Co-Director, Education Branch, for the Center for GenAI, where we are focused on two areas that are heavily impacted by generative AI: education and media/creative industries. The Education Branch is a collaboration involving Ben Nye and Aaron Shiel at the ICT with Gale Sinatra, Stephen Aguilar and Changzhao Wang at the Rossier School of Education.
Working with faculty from the USC Writing Program, we have created a writing tool that uses generative AI. But our tool is entirely different from the vast majority of AI writing tools and supports a fundamentally different approach to writing education. Rather than writing text for a user, our tool uses generative AI to help students become better writers themselves, by helping them brainstorm ideas, improving their critical thinking skills, and critiquing their work.
Also unlike other tools, our framework captures a student’s writing process at a fine-grained level, which will ultimately allow instructors to evaluate student work based on the process they go through rather than the final artifact (essay) they produce. Grading the process rather than the artifact makes cheating much more difficult and will give instructors better insights into how their students approach writing.
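A rough sketch of what capturing the writing process at a fine-grained level might look like (the event names and structure are illustrative, not the tool’s actual schema):

```python
import json
import time

class ProcessLog:
    """Record each step of a student's writing process as a timestamped event."""

    def __init__(self, student_id):
        self.student_id = student_id
        self.events = []

    def record(self, kind, detail=""):
        self.events.append({"time": time.time(), "kind": kind, "detail": detail})

    def summary(self):
        # Counts per event kind give an instructor a quick view of the process,
        # e.g. how much brainstorming and revision preceded the final draft.
        kinds = [e["kind"] for e in self.events]
        return {k: kinds.count(k) for k in sorted(set(kinds))}

log = ProcessLog("student-42")
log.record("brainstorm", "three candidate thesis statements")
log.record("ai_critique_requested", "feedback on argument structure")
log.record("revision", "reworked second paragraph")
print(json.dumps(log.summary(), indent=2))
```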
Final thoughts
It’s very satisfying to me to see that some of the science fiction that inspired me in my youth is now becoming a reality. While I don’t think that the doomsday predictions that some have offered are likely to come to pass, like any technology, AI can be used for good or for bad.
Because AI is so powerful, the imperative is even stronger to carefully consider the effects of how we use AI.
I believe there are three elements to approaching AI intelligently.
The first is education. As I’ve outlined above, as a society we need to become more broadly AI literate so that we understand the implications of this new technology.
The second is regulation. Part of that means having appropriate laws in place so that people who use AI for nefarious purposes, such as to create a malicious deepfake, can be held to account. The other part of regulation is certification. Technologies such as aviation, automobiles, communications and pharmaceuticals are regulated by the government and must meet certain standards to ensure that the public is not put at untoward risk.
The third is technology research. Generally speaking, research makes technologies safer as time goes on by addressing shortcomings of early versions. As an example, airplanes and automobiles are much safer now than they were in their infancy. It is my belief that AI research, if appropriately funded and directed, will help ensure that the AI of the future is safer than the AI of today, and that the best is yet to come.
AI research has been my life’s work, and I am proud to have been a part of its development. In the past 50 years, since my undergraduate days at Stanford, I’ve seen AI go through not one, but two “winters” – and now it feels as if we are emerging into a very interesting “spring.” I am beyond excited to see what’s next, and to know that at ICT, as we celebrate our 25th anniversary, we are continuing to forge ahead into a brilliant AI future.
Further Reading
Swartout, W. Lessons learned from virtual humans. AI Magazine 31, 9–20 (2010).
Williams, J. & Abrashkin, R. Danny Dunn and the Homework Machine. (Scholastic Book Services, 1958).