by David Nelson, Director, Mixed Reality (MxR) Research and Development, USC-ICT
With the release of the Apple Vision Pro, more and more people are hearing the term "Spatial Computing." As Director of the Mixed Reality (MxR) Lab at the USC Institute for Creative Technologies, I am invariably asked by friends and family, and the occasional tech journalist, what the term means. For someone unfamiliar with Mixed Reality (MR) technologies, a simple way to describe Spatial Computing is that it's like the Internet in three dimensions. Less than a year ago, some might have called it the "Metaverse," but even then I was prone to say that the "metaverse" is a "metaphor": a word attempting to encompass all of our speculative ideas about how people will interact with digital content from this point forward.
When asked for examples, I currently point to enterprise applications: Augmented and Mixed Reality displays like the HoloLens or Magic Leap used for manufacturing and maintenance tasks; familiar mobile apps like Pokémon Go and Snapchat filters that augment the physical world with virtual content; and virtual-reality simulations used for training in the US Army and elsewhere. What of future interactions? Here we're in uncharted territory; everything from advanced telepresence and superhuman situational awareness to whole new forms of media and entertainment is on the horizon.
Is this a real groundswell, or another ethereal bubble blown our way by tech bros and investment-fund managers? The truth is that Spatial Computing has the potential to be revolutionary in the same way that personal computers, mobile computing, and the World Wide Web changed how we do practically everything. That is, if the hardware and software evolve as we imagine, and as they must, to become part of our everyday lives. Design, weight, resolution, frame rate, battery life, and networking all need significant cutting-edge development for Spatial Computing to radically transform our interactions with not just the digital world, but the world at large.
So, when will that happen, this radical transformation? Five years? Ten? Well, in five years I'll certainly have pre-ordered my Apple Vision Pro 14 contact lenses, and I'll be waiting to hear whether my health insurance will cover the surgery for my Meta Quest neural implant. All kidding aside, as far as tech advances go, I think we'll likely move away from video pass-through toward optical Mixed Reality, and with Apple and other tech giants scooping up AI companies as if they were collecting Pokémon, I think we'll see advanced capabilities leveraging computer vision, sensor fusion, large language models, and generative AI integrated with Spatial Computing.
However, I believe the most transformative developments will be human-centered. In five years, far more people will be conversant in a language that is barely formed right now. Wondering where this might lead would be like asking Thomas Edison in 1894 for the formula for a blockbuster opening weekend. Today, we may not see exactly how we'll interface with a digitally data-rich world, but in five to ten years we may not be able to imagine a world where we don't.
So, in the near term we ask: what will it take for the public to embrace current hardware and interact with Spatial Computing? The form factor is a big question. Currently, the majority of our interactions with Spatial Computing happen through Mixed Reality devices, which are inelegant, heavy, expensive, and not exactly socially acceptable. These obstacles may be temporary; however, for things to progress, the present trade-offs have to be worth it. We are still waiting for the canonical "Oakley glasses": the Mixed Reality device that is as customary as the glasses we already wear. We laughed at the Yuppie in the early '80s trying to close business deals on a mobile phone the size of a shoebox, and in the early '90s we tolerated slow 56k dial-up connections (not to mention that horrific sound of the modem "handshake") while we considered the uses of the Internet and wondered if they could ever be monetized.
Today we use our phones for practically everything, and there is effectively no business that doesn't use the Internet in one way or another. To progress, the value of engaging with spatial digital content has to be proven in two fundamental ways: economically and emotionally. In essence, if it can make us earn something, or make us feel something, Spatial Computing stands a chance.
People used to say that the tipping point depended on the "killer app," or that "Content is King"; but in this case, if Content is King, then Use-Case is Queen. Apple released its Mixed Reality head-mounted display, the Vision Pro, less than a month ago, and is endorsing what it considers the immediate uses: primarily virtual home theaters and the ability to add limitless screens to our augmented workspaces.
These uses certainly help to promote the Apple ecosystem (Apple TV, Apple Music, Safari search, and other productivity and gaming apps); still, is there something better about these interactions in Mixed Reality than in the physical world, something that makes strapping a heavy screen onto our faces worth it? As more people exercise Apple's 14-day return policy, the jury is still out.
However, it is plausible that the Vision Pro was never meant to be a device for the average consumer, but rather a test case, an attempt to introduce an aspirational device into the market, a toe dipped into the rising tide of immersive technology. To that end, the Apple Vision Pro offers some great features: crisp resolution, a more elegant (albeit incredibly heavy) design, and the beginnings of a more intuitive user interface. It has a few novel features as well: the depth and LiDAR sensors built into the HMD enable the capture of compelling 3D stills and videos. Commoditizing this capability opens up the possibility for real experimentation, with users exploring new modes of content creation and distribution. Artists are already using the AVP to virtually project their work onto walls and canvases, enabling them to physically "trace" it onto other surfaces.
This is what excites me: the new new. The types of things we'll do with Spatial Computing that are not obvious to us now, and may even seem magical or like something out of science fiction. These are the things that can only be discovered and developed by experimenting with the tools we have today. This is where art and academic research join hands. I am fortunate to work at an institute that embraces this marriage of science and art. At the MxR Lab at the USC Institute for Creative Technologies, we get to imagine and evaluate the uses of current and future technologies. We focus on training and learning from a human-centered design perspective, always cognizant of the power of storytelling to make things engaging, to make them stick.
A current MxR project in Spatial Computing has us imagining a user who can interact with their environment much the same way we work on a shared Google document: leaving geo-specific annotations anchored within our surroundings, crowd-sourcing information, and delivering it to the user at their specific point of need. Where things get interesting is the technology's ability not only to be aware of the environment, but to learn and adapt to the user.
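To make the annotation idea concrete, here is a minimal sketch of how such geo-anchored, crowd-sourced notes might be modeled. Everything in it (the SpatialAnnotation and AnnotationStore names, the three-meter delivery radius) is a hypothetical illustration for this article, not the project's actual implementation or any Apple API:

```swift
import Foundation
import simd

// Hypothetical model for a shared, world-anchored note.
struct SpatialAnnotation {
    let id = UUID()
    let position: SIMD3<Float>   // anchor point in world space, in meters
    let author: String
    let text: String
    let created = Date()
}

// A crowd-sourced store: anyone can pin a note; nearby users receive it.
struct AnnotationStore {
    private(set) var annotations: [SpatialAnnotation] = []

    mutating func pin(_ annotation: SpatialAnnotation) {
        annotations.append(annotation)
    }

    // Deliver only the notes within `radius` meters of the user:
    // the "point of need" filter described above.
    func near(_ user: SIMD3<Float>, radius: Float = 3.0) -> [SpatialAnnotation] {
        annotations.filter { simd_distance($0.position, user) <= radius }
    }
}

// Usage: pin a note to a spot in the room, then query it from nearby.
var store = AnnotationStore()
store.pin(SpatialAnnotation(position: [2.0, 0.0, -1.0],
                            author: "dn",
                            text: "Breaker panel is behind this wall"))
let notesForUser = store.near([2.1, 0.0, -1.2])
```

In a deployed system the anchor positions would come from the headset's world tracking and the store would live on a shared server, but the core design is just this: annotations keyed to positions in the world, filtered by where the user is standing.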
On another project at the MxR Lab, we are investigating the value of adaptive Head-Mounted Displays. It's one thing to imagine that I can look at something and my smart glasses will display information about it; but if all of that information is delivered to me visually, not only will I run out of screen real estate, the method will carry cognitive and physiological costs as well. Maybe the optimal way to alert me at a given moment is through an audio signal or message, or perhaps some kind of haptic feedback. This is where research is desperately needed: looking at ways an intelligent system might adapt in response to my preferences, my cognitive and emotional state, what I'm trying to do, and what's happening in the environment around me at the time.
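As a thought experiment, that modality trade-off might look something like the sketch below. The inputs and thresholds are invented placeholders for this article, not findings from the MxR research; a real system would have to learn them from the user, the task, and the environment:

```swift
// Hypothetical sketch of choosing the least costly alert channel.
enum Modality { case visual, audio, haptic }

struct UserContext {
    var cognitiveLoad: Double   // 0...1, e.g. estimated from task tempo
    var visualClutter: Double   // 0...1, how full the display already is
    var ambientNoise: Double    // 0...1, e.g. from headset microphones
}

func chooseModality(for context: UserContext) -> Modality {
    // Screen real estate is the scarcest resource, so fall back from it first.
    if context.visualClutter > 0.7 || context.cognitiveLoad > 0.8 {
        // Audio competes with a loud environment; haptics do not.
        return context.ambientNoise > 0.6 ? .haptic : .audio
    }
    return .visual
}

// Usage: a cluttered display on a loud factory floor yields a haptic cue.
let busy = UserContext(cognitiveLoad: 0.9, visualClutter: 0.8, ambientNoise: 0.8)
print(chooseModality(for: busy))   // haptic
```

The point of the sketch is the fallback order: vision is treated as the scarcest channel, and audio yields to haptics when the environment is noisy, exactly the kind of policy this research needs to validate and refine.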
Toward the end of the 19th century, Eadweard Muybridge used multiple cameras to study the mechanics of a galloping horse; a hundred years later, we waited in line to see Star Wars. The Lumière brothers pointed a motion picture camera at "actualities," everyday occurrences like trains pulling into stations and people sneezing; today we have ground-breaking documentaries that can revise laws and change culture.
This is what new mediums require. Rigor and vigor. Scientists and artists.
We are in the early days of Spatial Computing, so stand down, evangelists and naysayers, and bring on the pioneers and early adopters. Empower the experimenters and researchers who embrace play and aren't afraid to break things, the futurists and realists willing to explore new mediums and seek out new technologies and use cases, to boldly go where no one has gone before.
//