By Dr. Nik Gurney, (Interim) Research Lead, Social Simulation Lab, USC ICT, and Tyler King, Computer Science undergraduate at Cornell University and former NSF-funded REU intern at ICT (2022).
AI assistance is becoming widespread in military decision-making and operational planning. As it becomes more prevalent, detecting when personnel have used unauthorized AI assistance can be mission-critical.
In the Social Simulation Lab at ICT, we recently demonstrated AI assistance detection accuracy exceeding 80% on a complex decision task. Importantly, this detection required no knowledge of the AI system's internal algorithms: our methods identified when human operators had AI assistance without needing to know how the AI works. We achieve this by focusing on patterns in human search behavior.
The rise of artificial intelligence (AI) has introduced a new challenge: how do we determine when an AI has played a role in completing a task? This question is not just theoretical; it has serious implications for academia, industry, and ethics. AI assistance is becoming more common in generating text, solving complex routing problems, and even making medical diagnoses. While AI can be a powerful collaborator, there are cases where it is crucial to distinguish between human effort and machine intervention. In this article, we explore the methodologies and challenges involved in detecting AI assistance in abstract and complex tasks.
The Growing Importance of AI Detection
The need to differentiate between human and AI efforts has been anticipated since Alan Turing proposed his famous test for machine intelligence. However, recent advances in AI systems, such as OpenAI’s GPT models and AlphaFold for protein folding, have blurred the lines between human and machine-generated work. This is especially problematic in fields where authenticity, authorship, and originality are paramount.
Educational institutions, for example, are grappling with AI-generated essays, while scientific research must ensure that conclusions are derived from genuine human insight rather than algorithmic extrapolation. Similarly, in safety-critical applications like autonomous driving, understanding whether AI played a role in decision-making can determine liability in accidents.
The Challenge of Detecting AI Assistance
Detecting AI involvement is particularly difficult when the data produced is abstract and does not contain obvious markers of machine generation. Unlike AI-generated text, which may contain telltale stylistic patterns, other complex tasks—such as problem-solving in multidimensional spaces—do not provide clear human-discernible clues.
Consider a scenario in which participants adjust digital dials to optimize an unknown variable in a complex system. In its simplest form, this task is akin to exploring a landscape with two dials, one moving the explorer north-south and the other east-west. The topography alone can make an objective such as discovering the tallest peak difficult, and each added dimension (a time dimension, for example) multiplies the complexity. AI suggestions can influence the decision-making process in such a task, yet the final output may reveal no direct traces of AI involvement. Traditional detection methods, which rely on superficial characteristics, often fail in these cases; more sophisticated techniques are needed.
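For intuition, here is a minimal Python sketch of such a task: a toy multi-peak landscape explored with two dials by a naive hill-climbing searcher. The terrain function, the searcher, and all parameters are invented for illustration and are not the study's actual environment:

```python
import numpy as np

def landscape(x, y):
    """Toy multi-peak terrain: one tall peak, one side peak, small ripples.

    Returns the (hidden) outcome value for a given pair of dial settings.
    """
    main_peak = 10.0 * np.exp(-((x - 2.0) ** 2 + (y - 1.0) ** 2))
    side_peak = 6.0 * np.exp(-((x + 3.0) ** 2 + (y + 2.0) ** 2) / 2.0)
    ripples = 0.5 * np.sin(3.0 * x) * np.cos(3.0 * y)
    return main_peak + side_peak + ripples

def random_search(n_steps=50, step=0.5, rng=None):
    """A naive searcher: nudge the two dials, keep moves that go uphill."""
    rng = np.random.default_rng(rng)
    pos = rng.uniform(-5, 5, size=2)        # initial dial settings
    best = landscape(*pos)
    trajectory = [(pos.copy(), best)]
    for _ in range(n_steps):
        candidate = pos + rng.normal(0.0, step, size=2)  # exploratory nudge
        value = landscape(*candidate)
        if value > best:                     # exploit: keep the uphill move
            pos, best = candidate, value
        trajectory.append((pos.copy(), best))
    return trajectory

traj = random_search(rng=0)
print(f"best value found: {traj[-1][1]:.2f}")
```

A searcher like this can get stuck on the side peak, which is exactly what makes the task hard and the search behavior informative.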
A Data-Driven Approach to AI Detection
To detect AI assistance, we recast the problem as a classification task (how likely is it that an observed behavior is entirely human?) and address it with deep learning models. Instead of looking for specific AI-generated artifacts, we analyze user behavior to infer whether an AI played a role in decision-making.
Our research leverages a dataset in which human participants performed optimization tasks with and without AI assistance. The tasks involved tuning dials to explore a simulated environment and identify the highest possible outcome, conceptually similar to searching a landscape for the tallest peak. This task can be relatively simple (e.g., an explorer near Mt. Fuji only needs to progress uphill) or extremely complex (e.g., an explorer in Badlands National Park must ascend multiple peaks and survey different regions). By comparing the behavior of participants who worked alone to those assisted by AI, we identified patterns unique to human search strategies and others that suggest AI influence.
We transformed this data into image representations, allowing deep learning models to classify AI-assisted efforts. Each image encapsulated the participant’s search path through the problem space, with additional channels encoding metadata such as exploration strategies and decision patterns. This approach made it possible to apply convolutional neural networks (CNNs) to distinguish between human and AI-aided efforts.
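As an illustration of the encoding idea (the exact channel layout in our models differs in its details), a trajectory of dial settings and observed outcomes can be rasterized into a multi-channel array that a CNN can consume. The function name, channel assignments, and parameters here are hypothetical:

```python
import numpy as np

def trajectory_to_image(points, values, size=32, bounds=(-5.0, 5.0)):
    """Rasterize a search trajectory into a 3-channel image.

    Channel 0: visit counts (where the participant searched).
    Channel 1: best outcome observed at each visited cell.
    Channel 2: normalized time of the most recent visit (search order).
    """
    lo, hi = bounds
    img = np.zeros((3, size, size), dtype=np.float32)
    for t, ((x, y), v) in enumerate(zip(points, values)):
        # map continuous dial settings to pixel coordinates
        i = int(np.clip((x - lo) / (hi - lo) * (size - 1), 0, size - 1))
        j = int(np.clip((y - lo) / (hi - lo) * (size - 1), 0, size - 1))
        img[0, i, j] += 1.0                           # visit count
        img[1, i, j] = max(img[1, i, j], v)           # best value seen here
        img[2, i, j] = t / max(len(points) - 1, 1)    # recency of visit
    img[0] /= max(img[0].max(), 1.0)                  # normalize counts
    return img

# Reusing the toy searcher from the earlier sketch:
traj = random_search(rng=0)
img = trajectory_to_image([p for p, _ in traj], [v for _, v in traj])
print(img.shape)  # (3, 32, 32)
```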
Key Findings: How AI Assistance Manifests
Our analysis revealed several key insights into how AI assistance can be detected:
- Differences in Exploration and Exploitation – Humans and AI tend to navigate complex tasks differently. Humans often explore cautiously, testing small variations before making significant changes. AI, on the other hand, optimizes decisions more aggressively, exploiting discovered information more efficiently. Encoding these behavioral differences into our models significantly improved classification accuracy (an illustrative step-size statistic appears in the first sketch after this list).
- Predictable Search Patterns – AI-assisted participants exhibited structured and repeatable search patterns, likely due to the AI’s reliance on algorithmic heuristics. Humans, by contrast, showed more variance in their decision-making, often re-exploring previously tested areas.
- Performance in Simple vs. Complex Tasks – AI assistance was easier to detect in more complex problem spaces. In simple environments, human strategies often mimicked AI behavior, making classification more difficult. However, in nonlinear or multi-peak environments, AI assistance became more apparent due to its efficiency in finding optimal solutions.
- Neural Network Performance – Our deep learning models, particularly a combination of CNNs and recurrent neural networks (RNNs), achieved over 86% accuracy in identifying AI-assisted efforts. The best results came from architectures that incorporated temporal exploration-exploitation patterns, reinforcing the idea that decision dynamics are key to detection (a hybrid of this kind is sketched in the second example after this list).
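To make the exploration-exploitation signal concrete, here is a minimal sketch of the kind of trajectory statistics that can separate the two styles. The function name and the specific statistics are illustrative proxies, not the exact feature set used in our models:

```python
import numpy as np

def step_size_profile(points):
    """Summarize exploration vs. exploitation from a dial trajectory.

    Large, variable steps suggest cautious human-style exploration;
    consistently small, directed steps suggest aggressive algorithmic
    exploitation.
    """
    pts = np.asarray(points, dtype=float)
    steps = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # distance per move
    # fraction of moves landing in an already-visited (coarse) cell
    cells = {tuple(np.round(p, 1)) for p in pts}
    return {
        "mean_step": float(steps.mean()),
        "step_variance": float(steps.var()),
        "revisit_rate": 1.0 - len(cells) / len(pts),
    }
```

Features like these can be fed to a classifier directly or encoded as additional channels in the image representation described above.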
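And here is a minimal, hedged sketch of a CNN + RNN hybrid in PyTorch that fuses the rasterized search image with the raw step sequence. The layer sizes, the GRU choice, and the class name SearchClassifier are placeholders for illustration, not our published architecture:

```python
import torch
import torch.nn as nn

class SearchClassifier(nn.Module):
    """Illustrative hybrid: a CNN reads the rasterized search image, a GRU
    reads the raw step sequence, and the fused features feed a binary
    human-vs-AI-assisted classifier."""

    def __init__(self, seq_features=3, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),       # -> (batch, 32)
        )
        self.rnn = nn.GRU(seq_features, hidden, batch_first=True)
        self.head = nn.Linear(32 + hidden, 1)            # logit: AI-assisted?

    def forward(self, image, sequence):
        spatial = self.cnn(image)            # (batch, 32) spatial summary
        _, h = self.rnn(sequence)            # final hidden state of the GRU
        temporal = h[-1]                     # (batch, hidden)
        return self.head(torch.cat([spatial, temporal], dim=1))

model = SearchClassifier()
logit = model(torch.randn(4, 3, 32, 32),   # batch of search images
              torch.randn(4, 50, 3))       # batch of 50-step sequences
print(logit.shape)  # torch.Size([4, 1])
```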
Implications and Future Directions
The ability to detect AI assistance has broad implications. In education, tools based on this research could help assess student originality in AI-enhanced learning environments. In cybersecurity, detecting AI involvement in digital transactions could be crucial for fraud prevention. Similarly, in research and policy-making, ensuring the integrity of human-generated conclusions remains paramount.
Future work should explore even more abstract tasks, where AI influence might be more subtle. Additionally, integrating explainability techniques into AI detection models could provide transparency into why a decision was flagged as AI-assisted, helping users understand the classification process.
Ultimately, as AI continues to integrate into human workflows, developing robust AI detection methods will be essential for maintaining trust, accountability, and ethical standards in an increasingly AI-assisted world, and especially in military contexts, where lives are on the line.