By Yunzhe Wang, second-year Computer Science Ph.D. student, Viterbi School of Engineering; researcher, Human-Inspired Adaptive Teaming Systems (HATS) Lab, USC ICT
Yunzhe Wang is a second-year Computer Science Ph.D. student at USC in the Human-Inspired Adaptive Teaming Systems (HATS) Lab, advised by Volkan Ustun and William Swartout, focusing on human-AI collaboration in multi-agent systems as well as large-scale multi-agent simulation. In November, he presented “Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations” at the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Recent breakthroughs in large language models (LLMs) have enabled the development of generative agents capable of simulating human-like cognition, memory, communication, and decision-making. Building upon this foundation, research has demonstrated the potential of these agents across diverse domains, from economic modeling and large-scale societal dynamics to crisis simulations, offering a powerful alternative to traditional rule-based systems.
Yet, emerging research has shown that when LLM-based agents are deployed in complex, dynamic environments, their behaviors can diverge significantly from real human responses. Agents often over-cooperate, fail to react appropriately to uncertainty, or produce unrealistic group-level behavior patterns despite appearing reasonable at the individual level.
My work addresses this challenge by extending generative agent simulations to high-stakes, human-interactive environments. We developed the first LLM-driven simulation designed specifically for studying civilian and crowd responses during Active Shooter Incidents (ASI), a uniquely sensitive and consequential setting where realism matters and empirical data is scarce. At the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), I presented our paper “Implicit Behavioral Alignment of Language Agents in High-Stakes Crowd Simulations” (Yunzhe Wang, Gale M. Lucas, Burcin Becerik-Gerber, Volkan Ustun), which introduces PersonaEvolve (PEvo), a framework that improves behavioral realism by iteratively refining agent personas to match expert-validated behavior patterns. This approach offers a path toward more reliable and trustworthy simulations in domains where real-world data is scarce and controlled experimentation is ethically challenging.
Building the Simulation
The simulation, developed in Unity, models 80 civilian agents and a single shooter within a detailed school environment. Civilian agents employ a ReAct-style LLM architecture that allows them to observe, remember, communicate, and reason—all through textual interaction. The system operates under partial observability, meaning agents perceive only their immediate surroundings, overhear only nearby conversations, and interact with what is locally at hand. This design captures a key feature of real human experience: acting under uncertainty.
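To make the loop concrete, here is a minimal Python sketch of a ReAct-style agent acting under partial observability. It is illustrative only: the names (Observation, CivilianAgent, the action list) are my own shorthand rather than the simulation’s actual Unity code, and `llm` stands in for any text-completion callable.

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    """One agent's partial view of the world at a single step."""
    location: str
    visible: list[str]    # entities within line of sight
    overheard: list[str]  # conversation fragments within earshot

@dataclass
class CivilianAgent:
    persona: str
    memory: list[str] = field(default_factory=list)

    def step(self, obs: Observation, llm) -> str:
        """One ReAct-style cycle: observe, reason in text, pick an action."""
        prompt = (
            f"Persona: {self.persona}\n"
            f"Recent memory: {self.memory[-5:]}\n"
            f"You are at {obs.location}. You see {obs.visible}. "
            f"You overhear {obs.overheard}.\n"
            "Thought: reason step by step about the situation.\n"
            "Action: choose one of [hide, flee, warn others, wait]."
        )
        response = llm(prompt)        # any text-completion callable
        self.memory.append(response)  # memory stays local to this agent
        return response
```

The key property the sketch preserves is that the prompt contains only the agent’s own persona, memory, and local observations, so any coordination must emerge through in-world communication rather than shared global state.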
Implicit Behavioral Alignment
The Persona-Environment Behavioral Alignment (PEBA) framework, implemented through PersonaEvolve, refines how agent personas are tuned to produce authentic behavior. Instead of issuing explicit directives, such as instructing an agent to hide or flee, the framework adjusts underlying persona traits like emotional resilience or risk tolerance. These subtle refinements lead to implicit alignment, where complex, realistic behaviors emerge organically from the agents’ interactions.
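The outer refinement loop can be sketched abstractly. The Python below is a hypothetical outline in the spirit of PEvo, not the paper’s released implementation: `simulate`, `revise`, the behavior statistics, and the stopping threshold are all placeholder assumptions.

```python
def persona_evolve(personas, simulate, target_stats, revise,
                   max_iters=5, tol=0.05):
    """Iteratively nudge persona traits until emergent crowd behavior
    approaches an expert-validated target distribution (illustrative)."""
    for _ in range(max_iters):
        # 1. Roll out the full multi-agent simulation with current personas.
        observed = simulate(personas)  # e.g. {"hide": 0.42, "flee": 0.31, ...}

        # 2. Measure the gap between emergent and expert-expected behavior.
        gap = {k: target_stats[k] - observed.get(k, 0.0) for k in target_stats}
        if max(abs(v) for v in gap.values()) < tol:
            break  # aggregate behavior is close enough; stop refining

        # 3. Revise implicit traits (risk tolerance, emotional resilience, ...)
        #    via an LLM, rather than issuing explicit behavioral directives.
        personas = [revise(persona, gap) for persona in personas]
    return personas
```

Note that the alignment signal is applied to persona traits, not to actions: each agent still decides what to do in the moment, which is what makes the resulting alignment implicit.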
Results and Expert Observations
In testing, this method consistently yielded more human-like and diverse crowd dynamics than traditional reinforcement learning (RL) or heuristic-based models. Experts reviewing the system observed spontaneous group formations, coordinated concealment, and collective interventions: behaviors rarely seen in explicitly directed simulations.
A comparative analysis revealed further distinctions: RL-based simulations often achieved situationally appropriate actions but suffered from rigidity, while heuristic approaches produced overly synchronized responses. The PEvo system achieved a stronger balance between contextual realism and interpretability, though continued improvements are needed to reduce occasional “freezing” behaviors and improve environmental awareness.
Research Foundations
Before beginning my Ph.D., I worked as a Software Engineer at Bubble.io, developing LLM-powered agents for no-code web applications. I earned my master’s degree from Columbia University, where I conducted research in the Creative Machines Lab on machine learning and robotics.
Currently, as a second-year Ph.D. student in Computer Science at USC, I study in the Human-Inspired Adaptive Teaming Systems (HATS) Lab, advised by Dr. Volkan Ustun and Dr. William Swartout. My research focuses on human-AI collaboration in multi-agent systems and the scalability of generative simulations for social modeling.
Looking Ahead
Future work will extend the PEvo framework to new domains, including public safety, disaster response, governance theory, and urban planning. By operationalizing generative social science through computational simulation, this research explores how artificial agents can safely replicate the complexity of human societies—opening the door to studies that would otherwise be impossible or unethical to conduct in the real world.
