Multi-View Stereo on Consistent Face Topology

2017 - Present
Project Leader: Graham Fyffe

Download a PDF overview.

We present a multi-view stereo reconstruction technique that directly produces a complete, high-fidelity head model with consistent facial mesh topology. Whereas existing techniques decouple shape estimation from facial tracking, our framework jointly optimizes for stereo constraints and consistent mesh parameterization. Our method is therefore free from drift and fully parallelizable for dynamic facial performance capture. We produce highly detailed facial geometry with artist-quality UV parameterization, including secondary elements such as eyeballs, mouth pockets, nostrils, and the back of the head. Our approach deforms a common template model to match multi-view input images of the subject while satisfying cross-view, cross-subject, and cross-pose consistency constraints, using a combination of 2D landmark detection, optical flow, and surface and volumetric Laplacian regularization. Because optical flow is never computed between frames, each frame can be processed independently, which makes the method trivially parallelizable. Accurate rigid head pose is extracted using a PCA-based dimensionality reduction and denoising scheme. We demonstrate high-fidelity performance capture results with challenging head motion and complex facial expressions around the eye and mouth regions. While the quality of our results is on par with the current state of the art, our approach can be fully parallelized, does not suffer from drift, and produces face models with production-quality mesh topology.
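
To make the role of Laplacian regularization concrete, the following is a minimal sketch of deforming a template under sparse position constraints. It is an illustration only, not the project's implementation: it assumes a uniform graph Laplacian, a dense least-squares solve, and hypothetical names (`uniform_laplacian`, `deform_template`, `data_weight`). The position constraints stand in for whatever targets landmarks and optical flow would provide.

```python
import numpy as np

def uniform_laplacian(num_vertices, edges):
    """Dense uniform graph Laplacian L = D - A (suitable for small meshes only)."""
    L = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        L[i, j] -= 1.0
        L[j, i] -= 1.0
        L[i, i] += 1.0
        L[j, j] += 1.0
    return L

def deform_template(template, edges, constrained_ids, targets, data_weight=10.0):
    """Minimize ||L X - L X0||^2 + data_weight^2 * ||X[c] - targets||^2.

    template:        (N, 3) rest-pose vertex positions X0
    edges:           iterable of (i, j) vertex index pairs
    constrained_ids: indices c of vertices that have a target position
    targets:         (len(c), 3) target positions (assumed given)
    """
    template = np.asarray(template, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n = template.shape[0]
    L = uniform_laplacian(n, edges)

    # Selection matrix that picks out the constrained vertices.
    S = np.zeros((len(constrained_ids), n))
    S[np.arange(len(constrained_ids)), constrained_ids] = 1.0

    # Stack the regularization rows and the weighted data rows, then
    # solve all three coordinates at once in the least-squares sense.
    A = np.vstack([L, data_weight * S])
    b = np.vstack([L @ template, data_weight * targets])
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X

# Toy example: a 5-vertex chain with both end points pulled to new positions.
chain = np.array([[float(i), 0.0, 0.0] for i in range(5)])
chain_edges = [(i, i + 1) for i in range(4)]
deformed = deform_template(chain, chain_edges, [0, 4],
                           [[0.0, 0.0, 0.0], [4.0, 1.0, 0.0]])
```

The regularization term preserves the template's local differential coordinates, so unconstrained vertices follow the constrained ones smoothly. A real mesh would call for a sparse solver and likely cotangent weights; both choices are left out here for brevity.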

Our objective is to warp a common template model to different subjects in arbitrary poses and expressions while ensuring consistent anatomical correspondence between subjects and accurate tracking across frames. The key challenge is handling the large variation in facial appearance and geometry, as well as the complexity of facial expressions and the large deformations they involve. We propose an appearance-driven mesh deformation approach that produces intermediate warped photographs for reliable and accurate optical flow computation. Our approach effectively avoids the image discontinuities and artifacts often caused by methods based on synthetic renderings or texture reprojection.
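
As a rough illustration of how a dense optical flow field can drive a mesh update, the sketch below samples a flow field at projected vertex positions with bilinear interpolation and moves each 2D point by the sampled displacement. The flow layout (`(H, W, 2)` holding per-pixel `(dx, dy)`), the function names, and the idea of using the advected points as 2D targets are assumptions made for this example; the page does not specify these details.

```python
import numpy as np

def sample_flow_bilinear(flow, points):
    """Bilinearly sample a dense flow field at sub-pixel locations.

    flow:   (H, W, 2) array of per-pixel (dx, dy), with H >= 2 and W >= 2
    points: (N, 2) array of (x, y) pixel coordinates
    Returns an (N, 2) array of interpolated displacements.
    """
    points = np.asarray(points, dtype=float)
    h, w = flow.shape[:2]
    # Clamp to the image so points just outside still get a value.
    x = np.clip(points[:, 0], 0.0, w - 1.0)
    y = np.clip(points[:, 1], 0.0, h - 1.0)
    # Top-left corner of the interpolation cell, kept inside the grid.
    x0 = np.minimum(np.floor(x).astype(int), w - 2)
    y0 = np.minimum(np.floor(y).astype(int), h - 2)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    top = (1.0 - fx) * flow[y0, x0] + fx * flow[y0, x0 + 1]
    bottom = (1.0 - fx) * flow[y0 + 1, x0] + fx * flow[y0 + 1, x0 + 1]
    return (1.0 - fy) * top + fy * bottom

def advect_points(points, flow):
    """Move 2D points by the flow sampled at their current locations."""
    points = np.asarray(points, dtype=float)
    return points + sample_flow_bilinear(flow, points)

# Toy example: a uniform flow of (+2, -1) pixels shifts every point equally.
uniform_flow = np.zeros((4, 5, 2))
uniform_flow[..., 0] = 2.0
uniform_flow[..., 1] = -1.0
shifted = advect_points([[1.5, 2.25], [0.0, 0.0]], uniform_flow)
```

In a pipeline like the one described, the flow would be computed between a warped photograph and a target view, and the advected 2D positions could then serve as per-view constraints on the mesh.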