ICT TO PRESENT LATEST RESEARCH AT I/ITSEC

Published: November 21, 2024
Category: News

By Dr. Randall W. Hill Jr, Vice Dean, Viterbi School of Engineering, Omar B. Milligan Professor in Computer Science (Games and Interactive Media); Executive Director, ICT

On the second floor of our Playa Vista (“Silicon Beach”) HQ, a project team led by ICT’s Allison Aptaker is marking out an area 10 feet by 20 feet. Aptaker’s support team, which includes Bayley Camp (Operations Officer, Defense and Intelligence Initiatives) and Cesar Otanez and Dhairya Parmar from our IT department, is hauling tables, monitors, HDMI cables, extension cords, and rolling chairs to set up five “stations” for our researchers. 

We do this every year to prepare for the Interservice/Industry Training, Simulation and Education Conference (I/ITSEC), rehearsing our presentations and demos ahead of time. 

The researchers staffing our I/ITSEC booth this year are listed, along with their projects, in the briefing below. 

Soon, everything will be packed up and shipped to Orlando, Florida, in time for Aptaker’s crew to fly over from Los Angeles, CA and set up Booth 2135 in the South Concourse of the Orange County Convention Center. 

I/ITSEC SIGNIFICANCE

The Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) promotes cooperation among the Armed Services, Industry, Academia, and various Government agencies in pursuit of improved training and education programs, identification of common training issues, and development of multi-service programs.

If you’ve never been to I/ITSEC, it’s hard to convey the sheer scale of the event. According to the I/ITSEC official site, last year there were 517 companies occupying 467 exhibit spaces across 210,000 net square feet, with 9,150 visitors to the exhibition hall and 6,850 personnel (including the team from ICT). No fewer than 60 countries were represented among both exhibitors and visitors. Truly an international event. 

I/ITSEC launched in 1966, when it was known as the Naval Training Device Center/Industry Conference. Over the years, the conference expanded to include the Army, Air Force, Marine Corps, Coast Guard, Industry, and Academia, and it now recognizes the increased importance of Manpower, Personnel, and Training aspects in the systems acquisition process.

As a University Affiliated Research Center (UARC), sponsored by the US Army, we are proud to present our demos within the DEVCOM Soldier Center area at I/ITSEC. This gives us at least a once-a-year chance to see many of our colleagues from Natick in person, and have conversations which often lead to new projects. 

I/ITSEC has been on ICT’s calendar since we launched in 1999 (and if you want to know more about our Origin Story, I wrote about that here ahead of our recent 25th anniversary). I will be attending I/ITSEC this year, as in previous years, alongside other members of our Leadership team including our Chief Science Officer Dr. William Swartout (who recently wrote about his 50 Years in AI), and Ryan McAlinden, Director for Defense and Intelligence Initiatives (his essay: Our Responsibility to the US Army is of particular relevance to our presence at I/ITSEC). 

I also encourage you to head to the NEXT BIG THING: HUMAN & MACHINE TEAMING session, in the Destination Lounge (OCCC Lobby) on Tuesday, Dec 3rd, from 1600 – 1730 hours. This important discussion will be moderated by ICT’s Program Manager Dr. Keith Brawner, who sits within the US Army DEVCOM Soldier Center, where he is also the lead for AI initiatives. Dr. Brawner wrote a very interesting essay for ICT 25 titled: The Unique Nature of ICT, which I encourage you to read. 

Speaking of interesting articles, we welcomed Stew Magnuson, the Editor in Chief of National DEFENSE Magazine, to ICT earlier this fall, and the result was a great three-page feature on ICT. Many of our I/ITSEC demos were previewed for Magnuson, and we are grateful for the exposure. If you are planning on visiting us at I/ITSEC, it’s a useful primer on our research. 

ICT SPECIFICS AT I/ITSEC 2024

In this I/ITSEC Briefing, I thought it might be helpful to provide details of our publications accepted for presentation, and when they are scheduled, as well as the demos which will be on display in Booth 2135. If you’re planning on coming to I/ITSEC, please drop by. We would be very pleased to meet with you and show our work. 

PUBLICATIONS ACCEPTED FOR PRESENTATION 

  1. Enhancing Operational Decision-Making with Adaptive Head-Mounted Display Interfaces | Project Leader: David Nelson, Director, Mixed Reality (MxR) Lab | Paper Presentation: TUES DEC 3rd, ROOM 320E, 1600 – 1730 hours
Background 

Recently, Marine Forces Special Operations Command specialists said that if they could have just one piece of technology to overcome operational challenges, it wouldn’t be a shiny next-gen weapon or vehicle, but hardware or software that would allow them to fuse the deluge of data into a single system and display. Army officials acknowledged they were “struggling” with data integration and display in their Project Convergence experiments designed to support JADC2. 

Objectives 

In this applied research effort, the MxR Lab is developing prototyped interactions that investigate future adaptive user interfaces in head-mounted displays (HMDs), focusing on models to improve user experience and efficacy. Utilizing a Command Center use case, the Adaptive Head-Mounted Display Interfaces (AHMDI) project seeks to address the challenge of displaying data on 3D terrain by finding a balance between providing essential information and minimizing visual clutter, considering factors such as user context and the platform’s visualization capabilities. The goal is to create a conceptual framework for an adaptive interface tailored to warfighter needs, enhancing decision-making and informing the Army’s future modernization priorities for HMDs. 
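To make the idea of context-driven adaptation concrete, here is a minimal sketch of how an adaptive HMD interface might trim terrain-overlay layers as cognitive load rises. It is not the project’s implementation: the UserContext fields, layer names, and the load-to-budget rule are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class UserContext:
    role: str              # e.g., "commander" or "analyst" (illustrative roles)
    cognitive_load: float  # 0.0 (low) to 1.0 (high), however the system estimates it

# Hypothetical priority ordering of terrain-overlay layers per role.
LAYER_PRIORITY = {
    "commander": ["friendly_units", "enemy_units", "objectives", "routes", "sensor_feeds"],
    "analyst":   ["sensor_feeds", "terrain_analysis", "enemy_units", "friendly_units"],
}

def select_layers(ctx: UserContext, max_layers: int = 5) -> list:
    """Show fewer, higher-priority layers as cognitive load rises, to limit visual clutter."""
    ranked = LAYER_PRIORITY.get(ctx.role, [])
    budget = max(1, round(max_layers * (1.0 - 0.5 * ctx.cognitive_load)))
    return ranked[:budget]

# Example: a heavily loaded commander sees only the top few layers.
print(select_layers(UserContext(role="commander", cognitive_load=0.8)))
```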

Results 

Year 1 leveraged results from valuable laboratory tests performed by ARL-W researchers and applied them to a representative real-world use case in order to explore user interface interactions between a performer and an intelligent system. Higher task performance and usability ratings for the adaptive condition would support the hypothesis that a well-designed adaptive interface will outperform a comparable non-adaptive version in complex decision-making scenarios. 

Subject matter expert retired Colonel Jay Miseli, a 1995 graduate of the United States Military Academy and a career Armored Cavalry officer, has joined the team as a consultant, bringing expertise in planning and executing military operations in combat, training, and exercises.

Next Steps

The MxR team is set to begin a Year 2 formal user study investigating the effectiveness of adaptive and adaptable user interface features within a mixed-reality (MR) simulation that includes high-cognitive-load situations. Effectiveness of adaptive and adaptable features will be compared based on significant differences in task measures (e.g., speed of decision-making, quality of decisions, post-tests of situational awareness, and sense-making) and self-reported usability and perceived performance. The team hopes to gain valuable insights into the conditions under which these features best support users in tasks requiring intense cognitive engagement, and into which approach to information flow will enhance sense-making, decision-making, and situational awareness in complex mission command scenarios.

  2. Open-Vocabulary High-Resolution 3D (OVHR3D) Data Segmentation and Annotation Framework | Project Leaders: Dr. Andrew Feng, Dr. Meida Chen | Paper Presentation: TUES DEC 3rd, ROOM 320D, 1630 hours
Background 

The U.S. Army and Navy have exhibited a strong interest in the rapid 3D reconstruction of battlefields, alongside virtual training and simulation tasks. 3D data, particularly when annotated, is essential for these applications. The use of unmanned aerial vehicles (UAVs) for the collection of aerial images, combined with photogrammetry techniques, enables the swift collection and reconstruction of high-fidelity, geo-specific 3D terrain data. However, the manual annotation needed to train 3D segmentation models (e.g., KpConv, RandLA, Mask3D) is both costly and time-intensive, and annotating 3D data consistently presents significant challenges, given the inherent scarcity of manually annotated 3D datasets, particularly for military use cases. 

Objectives 

Recognizing this gap, our previous research leveraged the data repository’s manually annotated databases, as showcased at I/ITSEC 2019 and 2021, to enrich the training dataset for deep learning (DL) models. However, collecting and annotating large-scale 3D data for specific tasks remains costly and inefficient. To this end, the objective of this research is to design and develop a comprehensive and efficient framework for 3D segmentation tasks to assist in 3D data annotation. This framework integrates Grounding DINO (GDINO) and the Segment Anything Model (SAM), augmented by enhanced 2D image rendering from the 3D mesh. 

Furthermore, the authors have also developed a user-friendly interface (UI) that facilitates the 3D annotation process, offering intuitive visualization of rendered images and the 3D point cloud.
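For readers curious how an open-vocabulary 2D-to-3D annotation loop of this kind can be structured, the sketch below shows one plausible shape of the pipeline. It is not ICT’s implementation: the render_view, gdino_detect, sam_segment, and project_points callables are hypothetical stand-ins for the mesh renderer and the GDINO/SAM model wrappers.

```python
# Illustrative open-vocabulary 2D-to-3D annotation loop (all callables are hypothetical).
def annotate_point_cloud(mesh, point_cloud, prompts, camera_poses,
                         render_view, gdino_detect, sam_segment, project_points):
    labels = {}  # point index -> text label
    for pose in camera_poses:
        image = render_view(mesh, pose)                    # 2D rendering of the 3D mesh
        boxes = gdino_detect(image, prompts)               # open-vocabulary boxes, e.g. 'vehicle'
        masks = sam_segment(image, [b["box"] for b in boxes])
        visible = project_points(point_cloud, pose)        # {point index: (u, v) pixel coords}
        for box, mask in zip(boxes, masks):
            for idx, (u, v) in visible.items():
                if mask[v, u]:                             # point projects inside this 2D mask
                    labels[idx] = box["label"]
    return labels
```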

Results

To evaluate the proposed annotation framework, outdoor scenes collected using UAVs and indoor scenes collected using a NavVis VLX scanner and an RGB-D camera in the USC-ICT office building were used to conduct comparative experiments between manual methods and the proposed framework, focusing on 3D segmentation efficiency and accuracy. The results demonstrate that our proposed framework surpasses manual methods in efficiency, enabling faster 3D annotation without compromising on accuracy. This indicates the potential of the framework to streamline the annotation process, thereby facilitating the training of more advanced models capable of understanding complex 3D environments with satisfactory precision. 

Our results demonstrate the framework’s efficiency and potential in creating high-quality 3D annotated datasets across various settings. By optimizing this process, the framework minimizes reliance on manual labor, thereby boosting overall efficiency and productivity. In practical terms, the framework’s ability to annotate 3D data with an open vocabulary not only enhances accuracy but also expands the scope of applications in military simulations.

Next Steps

The capability of the proposed framework to annotate objects accurately in military-related scenes suggests its potential for automating tasks in U.S. Army modeling and simulation (M&S). By reducing manual effort and enhancing annotation efficiency, the framework can significantly contribute to improving human operation and decision-making within military M&S applications. By automating and refining the annotation process, the framework empowers military personnel to focus on higher-level tasks, ultimately enhancing decision-making capabilities and operational readiness.

  3. SAR-AR: Adapting Human Vision to Complex Sensing Technologies with Adaptive Synthetic Aperture Radar Image Recognition Training | Project Leaders: David Nelson, Dr. Benjamin Nye | Collaborators: Dr. Benjamin Files, Dr. Kimberly Pollard, ARL-W | Paper Presentation: WED DEC 4th, ROOM 320E, 1230 – 1500 hours
Background 

Synthetic Aperture Radar (SAR) is a remote sensing technology that produces imagery of the Earth’s surface from high-altitude satellites and other sensor platforms by transmitting microwave radar pulses toward the surface and processing the returns into images of the scanned area. However, SAR output images are challenging for humans to interpret due to geometric distortions, multipath reflections, and other phenomena. 

Computer vision systems can automatically search for relevant or important images, but they rely on vast stores of accurately human-labeled datasets to power machine learning. Existing labeled SAR datasets are insufficient. To use and adapt computer vision with SAR imagery successfully, we need a way for humans to produce an abundance of better-labeled SAR images faster. 

Objectives 

ICT’s MxR Lab and Learning Sciences team have collaborated with ARL-West researchers to construct an alternative approach: an interactive training interface that focuses on augmented, body-based perceptual learning. By approximating the natural experiences that lead humans to build their visual understanding of 3D objects in the regular, optical world, the project hopes to enable users to build an intuitive understanding of SAR imagery. 

In effect, allowing humans to “see like a satellite.” 

Results 

The teams have created an augmented reality (AR) prototype of an interactive system that allows a user to experience different SAR viewing angles while physically moving their body to the angular positions that a SAR satellite might take above a scene. 

Using a mobile phone or tablet, the trainee views a semi-realistic virtual environment scene that features significant objects (e.g., tank, helicopter, rocket launcher) anchored to a flat surface in an intuitive way. The phone or tablet simulates the view of an aerial sensor, and the user can switch between the more realistic Electro-Optical (EO) image and the SAR image, providing a spatially accurate A/B comparison.
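As a rough illustration of how such a spatially aligned A/B comparison could be driven by device pose, the snippet below picks the pre-rendered EO or SAR view closest to the user’s current azimuth and elevation. The view library, filenames, and angular sampling are assumptions for the sketch, not the project’s actual asset structure.

```python
import math

# Hypothetical library of pre-rendered views keyed by (azimuth, elevation) in degrees.
SAR_VIEWS = {(az, el): f"sar_{az}_{el}.png" for az in range(0, 360, 15) for el in (30, 45, 60)}
EO_VIEWS  = {(az, el): f"eo_{az}_{el}.png"  for az in range(0, 360, 15) for el in (30, 45, 60)}

def view_for_pose(device_az_deg, device_el_deg, mode="SAR"):
    """Pick the stored view whose sensor angles best match where the user is standing,
    so toggling between EO and SAR keeps the same viewpoint (a spatially aligned A/B)."""
    library = SAR_VIEWS if mode == "SAR" else EO_VIEWS

    def angular_distance(key):
        az, el = key
        d_az = min(abs(device_az_deg - az), 360 - abs(device_az_deg - az))  # azimuth wrap-around
        return math.hypot(d_az, device_el_deg - el)

    return library[min(library, key=angular_distance)]

print(view_for_pose(93.0, 42.0, mode="SAR"))  # closest stored SAR rendering for that pose
```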

Next Steps

The hypothesis is that body-based interactions with 3D augmented reality generated content will yield even greater improvements in users’ ability to recognize objects in SAR and will be useful to gather critical training data. This data is invaluable, as it will not only record the metrics of image labelers in training, but may also be used to refine machine learning algorithms aimed at boosting automated SAR image interpretation capabilities in the future.

DEMOS

Here are the demos which will take place in the ICT Booth (2135) this year. 

  1. Watercraft and Ship Simulator of the Future (WSSOF)
    PRESENTER: David Nelson
Background 

The Army requires a Multi-Domain Operations (MDO) ready force that can seamlessly transition from one domain to another, as in maritime littoral operations. The WSSOF is a tool that can improve transition efficiency by enabling the warfighter to investigate littoral zone operations in advance, safely and effectively exercising the full range of environmental conditions such as wave heights and winds. 

Objectives 

This collaborative effort between USC’s Institute for Creative Technologies, the Viterbi School of Engineering, and ERDC’s Coastal and Hydraulics Laboratory (CHL) will foster a leap forward in hyper-realistic ship motion, using improvements in numerical simulation of vessel motion, information technology (IT), and computational speed to enable physics-based, real-time simulation of these interactions. Contemporary ship simulators are large, bulky, and stationary, with minimal portability. 

To mitigate these limitations, a network-enabled ship simulator equipped with virtual and mixed-reality technology, coupled with high-fidelity numerical modeling, can provide a means to meet these operational needs and enhance ship survivability in operational deployments. 

Results 

The team made significant progress in Year 1 on the research and development of the WSSOF application. The interdisciplinary team successfully developed a VR-enabled system that allows a single user to pilot a vessel within a geo-specific coastal environment. This was accomplished by combining littoral zone wave physics models with a water rendering system in a 3D game engine. The combined system produces visually appealing, dynamic, realistic, real-time near-shore waves within the simulation environment. Additionally, the team created an enhanced terrain loading pipeline to accurately visualize geo-specific environments like Miami Beach and Duck, North Carolina. This pipeline leverages digital elevation model (DEM) data from various sources and combines it with bathymetry data.
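As a simplified illustration of what fusing elevation and bathymetry rasters can look like, here is a small sketch under the assumption that both grids are already co-registered to the same grid and vertical datum; the actual WSSOF terrain pipeline is considerably more involved.

```python
import numpy as np

def merge_dem_bathymetry(dem, bathymetry, sea_level=0.0):
    """Illustrative fusion of a land DEM and a bathymetry grid into one heightfield
    for a littoral terrain tile. Elevations are assumed to be in meters, and NaN in
    the bathymetry grid marks cells with no seafloor coverage."""
    dem = np.asarray(dem, dtype=float)
    bathymetry = np.asarray(bathymetry, dtype=float)

    # Above sea level, trust the DEM; below it, prefer bathymetry where available.
    heightfield = np.where(dem > sea_level, dem,
                           np.where(np.isnan(bathymetry), dem, bathymetry))
    return heightfield
```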

Next Steps

The team is currently seeking sponsorship to support continued research and development of the WSSOF application. Future development plans for the project will enhance the application with the integration of a modular user-interface, additional terrain and vessel models, and network protocols for multi-user simulations, promising safer, cost-effective, and immersive preparation for maritime operations and training.

  2. ARC (AI-Assisted Revisions for Curricula) – AIRCOEE
    PRESENTER: Dr. Benjamin Nye
Background 

AIRCOEE (the AI Research Center of Excellence for Education) is a two-year, $4.5 million collaboration between the University of Southern California and Army University to address two fundamental questions: “How do we use AI to improve education?” and “How do we upskill our population in AI and prepare them for the jobs of the 21st century?” 

Funded through AIRCOEE, the AI-Assisted Revisions for Curricula (ARC) project aims to support Army developers who maintain courses by recognizing changes in doctrine, policy, or manuals that impact curriculum content such as lesson plans or slides. It does so by identifying individual slides, sections, or references that may need updating and, in some cases, suggesting changes. 

Objectives 

When new doctrine and manuals are introduced, extensive time is spent identifying and updating relevant training materials. To speed up what is currently a manual process, the ARC tool will process a variety of document types (e.g., PDF doctrine, PowerPoint slides, Word lesson plans), make connections between documents (e.g., matching sections of old and new doctrine, linking references to doctrine), recognize when referenced material has changed, and potentially suggest changes (e.g., update terminology). This work focuses on three problems:

Indexing Training Materials: Text analysis pipelines for training materials to extract and tag meaningful passages (e.g., “ADP 6.0 Section 1-5”) and metadata. Collecting a corpus which includes both current and prior versions of doctrine.

Change Analysis: For any passage in a training document, analyze if the relevant doctrine passage(s) were substantially changed in new doctrine versions.

Ranking Document Updates: Applying change analysis at the document level, search and rank which documents should be reviewed for updates.

A hybrid approach leveraging classical search algorithms as well as modern transformer-based models detects connections and identifies changes. Color coding allows course developers to easily inspect and confirm results (e.g., green indicates a close match between old and new doctrine, yellow indicates substantial changes, red indicates the new doctrine does not cover the material) and take action as needed (e.g., ARC may suggest terminology updates based on changes to glossary terms). ARC offers the potential to greatly accelerate content updates, keeping content aligned with the latest doctrine and best practices. 
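To illustrate the kind of logic behind such color coding (not ARC’s actual code), the sketch below maps an embedding-based similarity score between an old passage and its best match in the new doctrine onto the green/yellow/red flags. The embed() helper and the thresholds are hypothetical, and a real hybrid system would also fold in classical search signals such as keyword or reference matches.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def change_status(old_passage, new_passage, embed, close=0.9, partial=0.6):
    """Map passage similarity onto the review flags described above:
    green = close match, yellow = substantial change, red = not covered."""
    if new_passage is None:          # no corresponding passage found in the new doctrine
        return "red"
    score = cosine(embed(old_passage), embed(new_passage))
    if score >= close:
        return "green"
    return "yellow" if score >= partial else "red"
```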

Results

Data collection is critical to ARC, and LTC Fortuna of Army University is leading the effort to build an archive of Army doctrine documents as well as associated slides and lesson plans. Using the initial version of this corpus, the first version of the ARC tool was developed with the ability to process PDF doctrine and PowerPoint slides, allowing comparison of old and new doctrine (e.g., all the paragraphs in the old doctrine are colored green for a good match, yellow for a partial match, and red for no match in the new doctrine) as well as comparison with PowerPoint slides (e.g., find the match in old doctrine for a slide, then look for a match in new doctrine). Based on the ability to analyze changes for an individual document, a scoring algorithm is being developed to search for and rank the training resources that are most likely to need updates due to a change in doctrine.

Next Steps

Ongoing work is exploring Army-specific AI tools (large language models trained on doctrine, such as TracLM) and support for a wider variety of lesson plans and resources. User testing will guide this process, as will exploration of different Army focus areas (e.g., sustainment, recruitment, medical). ARC is also prototyping integrations with Microsoft SharePoint and Army-specific tools (e.g., Central Army Registry) that would allow access to a larger collection of doctrine and training materials. This work would pave the way for Army users to test and leverage the system in combination with tools that they already use for lessons (e.g., MS Word).

  3. Geospatial Terrain
    PRESENTER: Dr. Andrew Feng, Associate Director, Geospatial Terrain Research
Background

The Geospatial Terrain team focuses on constructing high-resolution 3D geospatial databases for use in next-generation simulations and virtual environments. Utilizing both a commercial photogrammetric solution and our in-house R&D processing pipeline, the team procedurally recreates 3D terrain from data captured by drone platforms and EO sensors.

Objectives

The Geospatial Terrain team conducts research and development to create high-resolution terrain reconstructions and corresponding run-time terrain simulation capabilities for military users. The end-to-end pipeline streamlines data capture, 3D processing, and semantic segmentation to produce simulation-ready, high-resolution terrain. Our current research includes neural 3D terrain reconstruction for large-scale environments and semantic terrain segmentation that utilizes zero-shot AI models to extract object information for building apertures.

Results

To date, ICT has produced 3D terrain through UAV collections at over a hundred different sites to support the terrain research efforts of US Army STTC. Our technology is also an integral part of the Army’s One World Server (OWS) system, which utilizes our semantic terrain processing algorithms to create and process simulation-ready, high-resolution 3D terrain from low-altitude UAV collections. We have developed terrain simulation applications including line-of-sight analysis, pathfinding, real-time terrain effects, and concealment analysis. 

We are also actively developing new technologies to improve the terrain generation process. Our current research investigates neural-based methods for terrain reconstruction and zero-shot models for 3D segmentation.

Next Steps

For future research, we aim to address the current limitation of the photogrammetry process for creating 3D terrains. Specifically, we plan to develop the capability for photorealistic reconstruction and visualization of the geospatial terrain using neural rendering techniques. 

  4. 3D Terrain Completion Tool
    PRESENTER: Dr. Yajie Zhao
Background 

Our method extends a family of methods characterized by learning priors over discrete latent codes obtained from a vector-quantized autoencoder. Past research in this direction has focused only on image synthesis, from directly predicting pixels as word tokens to predicting tokens that encode visual features of larger receptive fields. While the pioneering works infer latent codes auto-regressively, MaskGIT finds it beneficial to synthesize an image in a scattered manner with a bidirectional transformer: in every iteration, several new codes are predicted in parallel and inserted into scattered locations of the code map until the entire grid is filled. While MaskGIT has partially adapted its bidirectional framework to the image inpainting setting, our method addresses several unanswered aspects of this adaptation: how partial images can be robustly encoded into latent codes, and how the latent codes should be decoded into synthesized pixels that respect the observable area. 
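For readers unfamiliar with MaskGIT-style decoding, the sketch below illustrates the general idea of iterative, confidence-ranked parallel code prediction that never overwrites observed codes. It is a generic illustration under assumed tensor shapes and a simplified schedule, not the method described in the paper.

```python
import torch

def iterative_parallel_decode(transformer, codes, mask, steps=8):
    """Generic MaskGIT-style decoding loop (illustrative shapes and schedule).

    codes: 1D tensor of VQ token ids for a flattened code grid.
    mask:  1D boolean tensor, True where the code is unknown and must be synthesized.
    transformer(codes) is assumed to return per-position logits of shape (seq_len, vocab).
    Known codes (from the observable part of a partial image) are never overwritten.
    """
    mask = mask.clone()
    num_unknown = int(mask.sum())
    for step in range(steps):
        logits = transformer(codes)
        probs = torch.softmax(logits, dim=-1)
        confidence, candidates = probs.max(dim=-1)        # best token and its confidence per position
        confidence = torch.where(mask, confidence, torch.full_like(confidence, -1.0))

        # Cosine schedule: commit few codes early, more as the grid fills in.
        frac = 1.0 - torch.cos(torch.tensor((step + 1) / steps * torch.pi / 2))
        k = max(1, int(frac * num_unknown))

        keep = torch.topk(confidence, k).indices          # most confident still-unknown positions
        newly_filled = torch.zeros_like(mask)
        newly_filled[keep] = True
        newly_filled &= mask                              # guard: never touch known positions

        codes = torch.where(newly_filled, candidates, codes)
        mask &= ~newly_filled
        if not mask.any():
            break
    return codes
```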

Objectives 

Traditional large-scale 3D terrain reconstruction pipelines often produce artifacts and holes in the processed textures and geometries. This typically occurs due to missing views or occlusions, which prevent the Structure from Motion (SfM) pipeline from producing complete point clouds. 

Highly skilled artists are then required to manually define and fix these regions before converting the 3D reconstructed terrain into a high-quality simulation environment for soldier training or visualization in path planning. We propose using AI-based methods to post-process raw 3D terrain models obtained from commercial structure-from-motion pipelines. Our aim is to automatically detect, fix, and complete large 3D models in both texture and geometry. This approach will significantly improve efficiency, reduce costs, and simplify the process of providing high-quality 3D environments for downstream use.

Results

ICT’s Vision and Graphics Lab (VGL) achieved this goal by proposing a novel 2D inpainting neural network and integrating a metric depth estimation algorithm into the inpainting process. We begin with automatic hole detection in 2D and apply an inpainting model trained on one million indoor and outdoor images. Next, we predict monocular depth from the inpainted 2D image. Finally, we project the single-view depth onto the real-world metric scale from the SfM pipeline. Using the inpainted depth and texture, our tool produces a completed mesh and texture that integrates smoothly with the original 3D reconstructed models. The proposed inpainting method was accepted to CVPR 2024 as “Don’t Look into the Dark: Latent Codes for Pluralistic Image Inpainting,” which achieved the best inpainting results to date.
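A highly simplified, single-view version of this texture-and-depth completion flow might look like the following. The inpaint_model and depth_model callables are hypothetical placeholders, and the least-squares scale-and-shift alignment to SfM depth is one common choice for step 4, not necessarily the team’s.

```python
import numpy as np

def complete_terrain_view(rgb, depth_sfm, inpaint_model, depth_model, hole_mask=None):
    """Illustrative single-view completion step (model wrappers are hypothetical).

    rgb:       H x W x 3 rendered view of the reconstructed terrain
    depth_sfm: H x W metric depth from the SfM reconstruction (NaN where missing)
    """
    # 1. Detect holes: pixels with no reconstructed geometry/texture.
    if hole_mask is None:
        hole_mask = np.isnan(depth_sfm)

    # 2. Inpaint the texture over the holes with the learned 2D model.
    rgb_filled = inpaint_model(rgb, hole_mask)

    # 3. Predict monocular (relative) depth for the inpainted image.
    depth_rel = depth_model(rgb_filled)

    # 4. Align predicted depth to the SfM metric scale on the valid pixels
    #    (least-squares fit of a scale and shift).
    valid = ~hole_mask
    A = np.stack([depth_rel[valid], np.ones(valid.sum())], axis=1)
    scale, shift = np.linalg.lstsq(A, depth_sfm[valid], rcond=None)[0]
    depth_metric = scale * depth_rel + shift

    # 5. Keep original geometry where it exists; fill only the holes.
    depth_out = np.where(hole_mask, depth_metric, depth_sfm)
    return rgb_filled, depth_out
```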

  5. Autonomy-Mediated Trust
    PRESENTER: Dr. Jonathan Gratch
Background

This is a collaboration between ICT and the US Air Force Academy (USAFA), and part of it was performed as a capstone exercise by USAFA cadets. The overall project explores trust in the advice-giving of AI systems in situations involving ethical ambiguity. This is part of a larger effort to research trust in autonomy and human-machine teaming, but the output also has relevance as a classroom exercise for discussing issues around the Law of War and the use of AI. It was used in several classrooms at USAFA this year.

Objectives 

The specific demo involves a hypothetical conflict in which junior officers must make a strike/no-strike decision involving potential harm to civilians and civilian infrastructure. Users are presented with a scenario and have a conversation with an AI advisor recommending one of the options. Visually, the advice is delivered via the Furhat robot. In the original study, students saw a pre-recorded interaction, but we will use GenAI and speech understanding to make the scenario more interactive and to simplify details of the exercise to ease user understanding.

See you in Orlando! We’ll be posting regular updates on our site and socials, including LinkedIn, so do follow us there and DM us for more information or to arrange demos and media interviews.