ICT is running a series of articles to highlight the work of our Postdoctoral Researchers. In this essay we hear from Dr. Haiwei Chen, who was awarded a PhD in Computer Science from the USC Viterbi School of Engineering in 2024, and is now continuing his research on generative AI and 3D computer vision at ICT’s Vision and Graphics Lab (VGL), under the supervision of Dr. Yajie Zhao, Director of VGL.
BYLINE: Dr. Haiwei Chen, Postdoctoral Researcher, Vision and Graphics Lab, USC Institute for Creative Technologies
In 2018, I took a class on digital geometry analysis and the professor said: “In the near future, there will be no difference between computer vision and computer graphics, as both will rely on deep learning.”
Back then it sounded like a bold prediction, but it stuck with me. It captured a shift in thinking: computer vision and graphics were becoming part of the same problem space. One focuses on understanding the real world from images (vision), and the other on creating realistic images from models (graphics). Deep learning was starting to connect the two.
I was fascinated by what this meant. If we could unify vision and graphics, then data captured from the real world — a photo of my bedroom, a street in Los Angeles, or a famous landmark — could become part of a virtual simulation. These simulations wouldn’t just be realistic; they would be based on reality itself. That was the moment I knew what I wanted to work on during my PhD.
Coming to ICT
I did my undergraduate studies at the University of North Carolina at Chapel Hill, where I earned a double major in Computer Science and Sociology. During college, I had the chance to work in the telepresence group led by Dr. Henry Fuchs, and he was one of the first people who inspired me to pursue research. He said, “Find something that looks fun. As long as you stay curious, you’ll keep learning.”
Those words stayed with me, and I spent two years in his lab working on VR and AR technologies.
Later, he introduced me to Dr. Evan Suma, who at that time was Associate Director of the Mixed Reality (MxR) Lab at USC ICT. I visited USC, toured some of the labs, saw the famous Light Stage, and tried a few VR demos. Dr. Suma told me, “I think our interests match well.” I agreed, and soon after, I joined ICT as a PhD student.
At the MxR Lab, my early research focused on VR interaction, especially a technique called “redirected walking.” It allows users to explore large virtual spaces while staying inside a small physical tracking area. More details on my take on the technique can be found here.
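For readers curious about the mechanics, here is a minimal sketch of one ingredient of redirected walking: rotation gain, where the virtual camera turns slightly faster than the user’s tracked head so that their real-world heading can be steered. The gain value and function names here are illustrative, not taken from my published implementation.

```python
import numpy as np

# Illustrative rotation gain for redirected walking: the virtual camera
# rotates a bit more than the tracked head, so the user unknowingly
# corrects their real-world heading. Gains near 1.0 stay unnoticeable.
ROTATION_GAIN = 1.2  # hypothetical value for this sketch

def apply_rotation_gain(virtual_yaw, head_yaw_delta):
    """Scale the user's real head rotation before applying it virtually."""
    return virtual_yaw + ROTATION_GAIN * head_yaw_delta

virtual_yaw = 0.0
real_total = 0.0
for delta in np.deg2rad([2.0, 1.5, -1.0]):   # per-frame tracked head turns
    virtual_yaw = apply_rotation_gain(virtual_yaw, delta)
    real_total += delta

# The virtual heading drifts 20% beyond the real one (3.0 vs. 2.5 degrees).
print(np.rad2deg(virtual_yaw), np.rad2deg(real_total))
```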
This work was fun and technically challenging, but over time, my interests shifted. I became more curious about how deep learning could be used to understand and generate 3D data.
This brought me to the Vision and Graphics Lab.
ICT Vision and Graphics Lab (VGL)
After joining VGL, my focus shifted toward applying deep learning to computer vision problems. I was drawn in particular to representations: how images and 3D objects are represented inside a neural network. This naturally led to a second focus on neural operators: how neural networks should be designed to process visual data in a given representation. Within this framework I have published papers on several topics, including SE(3) equivariance for point clouds, generative networks for implicit fields, and autoregressive inpainting.
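As a toy illustration of the symmetry at stake (my own sketch, not code from those papers): features built from pairwise distances do not change when a point cloud is rotated and translated. Equivariant networks generalize this idea, so that intermediate features transform predictably under SE(3) instead of merely staying fixed.

```python
import numpy as np

def random_rotation():
    # QR decomposition of a Gaussian matrix gives a random orthogonal
    # matrix; flip one column if needed so the determinant is +1.
    q, _ = np.linalg.qr(np.random.randn(3, 3))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def pairwise_distance_features(points):
    # Pairwise distances depend only on the shape of the point cloud,
    # so they are invariant to any rigid motion (rotation + translation).
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

points = np.random.randn(64, 3)
R, t = random_rotation(), np.random.randn(3)
moved = points @ R.T + t  # apply a random SE(3) transform

f1 = pairwise_distance_features(points)
f2 = pairwise_distance_features(moved)
print(np.allclose(f1, f2, atol=1e-6))  # True: features are SE(3)-invariant
```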
Looking back, my research path has always followed my curiosity. From telepresence to VR interaction, and now to 3D deep learning, I’ve been driven by the excitement of solving new problems and learning how things work.
Some of the breakthroughs in AI that happened between 2016 and 2018 also played a big role in shaping my research direction. During that time, neural networks were suddenly able to do things that had seemed impossible just a few years earlier: paint realistic 2D images, reconstruct 3D shapes from a single image, or find dense correspondences between images without hand-designed features. These advances made me ask a simple question: what else can neural networks do?
That question has been my guide ever since. I started reading papers on 3D deep learning and got deeper into the field. I was especially interested in how to design architectures that respect the structure of 3D data, like point clouds or meshes, and how to train models that can generalize across different scenes and object types.
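A classic example of respecting that structure, in the spirit of the well-known PointNet architecture, is to make a network indifferent to the order in which points arrive. The sketch below is a deliberately tiny, hypothetical version of that idea, not any specific published model.

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Shared per-point MLP followed by max pooling.

    Every point passes through the same MLP, and max() ignores order,
    so the global feature is invariant to point permutations -- one
    structural property of point clouds an architecture should respect.
    """
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

    def forward(self, points):           # points: (batch, n_points, 3)
        per_point = self.mlp(points)     # (batch, n_points, feat_dim)
        return per_point.max(dim=1).values  # order-independent global feature

net = TinyPointNet()
cloud = torch.randn(2, 1024, 3)
shuffled = cloud[:, torch.randperm(1024)]
print(torch.allclose(net(cloud), net(shuffled), atol=1e-6))  # True
```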
During my PhD, I had the chance to present my work at several top conferences, including IEEE VR 2018 in Reutlingen, Germany; CVPR 2021, 2022, and 2024 in the United States; and SIGGRAPH Asia 2022 in Daegu, Korea. I hope one day I can present at SIGGRAPH in the U.S. too. To me, SIGGRAPH is where visually impactful research gets the spotlight, and it would mean a lot to contribute to that legacy.
Behind the Scenes at ICT
There are also many lighter moments from my time at ICT that I remember fondly. One day, I came back to the lab and saw NBA star Klay Thompson standing near the door. I recognized him right away. He told me the office looked nice and tried to chat, but I was too nervous to say much beyond, “Yeah, I like this place.” I still wish I had asked him a few questions.
There was also a mysterious room on the third floor of ICT that my labmate and I used to think was a recreation room. No one ever used it, so we started hanging out there during deadline crunch times, playing board games late into the night.
One day it was locked, and I never got to go back in (in fact, I think one of my games might still be in there).
Postdoctoral Research Focus
Now, as a postdoctoral researcher, I find my perspective changing again. During the PhD, it’s normal to have tunnel vision — to spend months or even years focused on a very specific problem. Maybe you’re trying to understand why a state-of-the-art method fails in edge cases, or how to improve it with a new loss function or architecture tweak. That kind of focus is necessary, and it teaches you a lot.
But over time, I started asking bigger questions:
Why does this research matter?
What impact could it have on the field as a whole?
In my postdoc, I want to take a broader view. I want to work on problems that not only improve performance, but also deepen our understanding of how vision and graphics connect. I believe that as these two areas continue to merge, new forms of simulation, content creation, and understanding will emerge. And I want to be part of shaping that future.
Success in research doesn’t usually come from one defining moment. There are many milestones: publishing a paper, finishing a difficult experiment, getting a grant. But if I had to choose a turning point for myself, it would be when I finished my first CVPR paper.
Before that, I often felt like I didn’t belong in the deep learning community. Like many PhD students, I dealt with imposter syndrome. I had spent years learning the basics, but I still doubted whether I had what it took to contribute.
That paper, on SE(3)-equivariance for point cloud processing, changed everything for me. The work was hard. It took multiple rounds of ablation studies, debugging, and rewriting to get it right. But once it was done, I felt a shift in my confidence. I had been involved in every step: designing the model, running experiments, analyzing results, writing and revising the paper.
After that, I felt I had truly learned how to do deep learning research.
These days, the work is still challenging. But I approach it differently. I know that with enough time and effort, I can solve hard problems. That mindset is one of the most valuable things I gained from my PhD.
What’s Next?
As I look ahead, I remain excited about the future of vision and graphics. Generative AI is opening up new possibilities for creating and interacting with digital content. Neural fields, implicit representations, and differentiable rendering are all active areas of research that continue to push boundaries.
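To give a flavor of what a neural field is, here is a minimal, hypothetical coordinate-based MLP: a continuous function from 3D positions to a scalar value such as signed distance or occupancy, which is the basic building block behind many implicit-representation methods. The layer sizes are arbitrary choices for this sketch.

```python
import torch
import torch.nn as nn

# A minimal coordinate-based MLP ("neural field"): it maps a 3D position
# directly to a scalar such as a signed distance or occupancy value.
field = nn.Sequential(
    nn.Linear(3, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

xyz = torch.rand(4096, 3) * 2 - 1   # query points in [-1, 1]^3
values = field(xyz)                 # one predicted value per point
# Training would regress these values against a known shape; because the
# field is a continuous function, it can be sampled at any resolution.
print(values.shape)                 # torch.Size([4096, 1])
```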
My goal is to help develop and document methods that bring us closer to seamless, high-fidelity simulations of the world around us — not just for entertainment or virtual reality, but for science, education, and communication. I believe that when we can render the world as we see it, and understand the world as we simulate it, we unlock entirely new ways of knowing and creating.
That idea, first mentioned by a professor in a lecture hall years ago, has become the foundation of my research life. And I’m still just as curious as I was back then.
//