HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars

Abstract

TL;DR: Plug-and-play high-dimensional extension of 3DGS for dynamic scenes.

We introduce HyperGaussians, a novel extension of 3D Gaussian Splatting for high-quality animatable face avatars. Creating such detailed face avatars from videos is a challenging problem with numerous applications in augmented and virtual reality. While tremendous successes have been achieved for static faces, animatable avatars from monocular videos still fall into the uncanny valley. The de facto standard, 3D Gaussian Splatting (3DGS), represents a face through a collection of 3D Gaussian primitives. 3DGS excels at rendering static faces, but the state of the art still struggles with nonlinear deformations, complex lighting effects, and fine details. While most related works focus on predicting better Gaussian parameters from expression codes, we rethink the 3D Gaussian representation itself and how to make it more expressive. Our insights lead to a novel extension of 3D Gaussians to high-dimensional multivariate Gaussians, dubbed 'HyperGaussians'. The higher dimensionality increases expressivity through conditioning on a learnable local embedding. However, splatting HyperGaussians is computationally expensive because it requires inverting a high-dimensional covariance matrix. We solve this by re-parameterizing the covariance matrix, a technique we dub the 'inverse covariance trick'. This trick boosts efficiency so that HyperGaussians can be seamlessly integrated into existing models. To demonstrate this, we plug HyperGaussians into the state of the art in fast monocular face avatars: FlashAvatar. Our evaluation on 19 subjects from 4 face datasets shows that HyperGaussians outperform 3DGS numerically and visually, particularly for high-frequency details like eyeglass frames, teeth, complex facial movements, and specular reflections.
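
As a hedged illustration of why a re-parameterization can help, consider the textbook identities for conditioning a joint Gaussian. The split of each primitive into an attribute part a (dimension m) and a latent part b (dimension n, with n much larger than m), and the symbols Σ and Λ = Σ⁻¹, are our notation for this sketch. Whether this matches the authors' exact parameterization is an assumption.

```latex
\begin{aligned}
\mu_{a\mid b} &= \mu_a + \Sigma_{ab}\,\Sigma_{bb}^{-1}\,(b-\mu_b)
              \;=\; \mu_a - \Lambda_{aa}^{-1}\,\Lambda_{ab}\,(b-\mu_b),\\
\Sigma_{a\mid b} &= \Sigma_{aa} - \Sigma_{ab}\,\Sigma_{bb}^{-1}\,\Sigma_{ba}
              \;=\; \Lambda_{aa}^{-1}.
\end{aligned}
```

In the covariance form, the matrix to invert is the n-by-n block Σ_bb, which grows with the latent dimension. In the precision form, only the m-by-m block Λ_aa is inverted, so the cost of that inversion does not depend on the latent dimension.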


Method

We propose an extension to 3D Gaussians, dubbed HyperGaussians, and plug it into an existing method for face avatars, FlashAvatar. FlashAvatar modulates 3D Gaussian primitives with expression-dependent offsets. We make a single modification to the pipeline: we insert HyperGaussians between the MLP output and the rasterization, so that the offsets are modeled in a higher-dimensional space. Instead of directly predicting offsets, the MLP predicts a latent that conditions the HyperGaussians. Without any other modifications or hyperparameter tuning, this simple change enhances details in the final avatar and leads to a performance boost. The pipeline figure has been adapted from FlashAvatar.
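
The conditioning step can be sketched numerically. The following is a minimal NumPy illustration, not the authors' implementation: it assumes each primitive is a joint Gaussian over an attribute part `a` and a latent part `b`, stored through blocks of its precision (inverse covariance) matrix. The function name `condition_on_latent` and all argument names are hypothetical.

```python
import numpy as np


def condition_on_latent(mu_a, mu_b, prec_aa, prec_ab, z):
    """Condition N joint Gaussians over (a, b) on the latent value b = z.

    Shapes (N primitives, m attribute dims, n latent dims):
      mu_a:    (N, m)     mean of the attribute part
      mu_b:    (N, n)     mean of the latent part
      prec_aa: (N, m, m)  attribute block of the precision matrix (SPD)
      prec_ab: (N, m, n)  attribute/latent cross block of the precision matrix
      z:       (N, n)     latent predicted per primitive (e.g. by an MLP)

    Returns the conditional mean (N, m) and conditional covariance (N, m, m).
    """
    # Right-hand side: Lambda_ab @ (z - mu_b), one vector per primitive.
    rhs = np.einsum("nij,nj->ni", prec_ab, z - mu_b)
    # Only small m x m systems are solved; no n x n inversion is needed.
    shift = np.linalg.solve(prec_aa, rhs[..., None])[..., 0]
    cond_mean = mu_a - shift
    # Conditional covariance is the inverse of the attribute precision block.
    cond_cov = np.linalg.inv(prec_aa)
    return cond_mean, cond_cov
```

In a pipeline like the one described above, `z` would be the latent predicted from the expression code, and `cond_mean` would play the role of the per-Gaussian attributes passed on to rasterization. How the attributes are laid out inside `a` is not specified on this page and is left open here.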

Self-Reenactment

HyperGaussians render high-quality details for thin structures (eyeglass frames and teeth in the first row) and specular reflections (eyes in the second row), and they gracefully handle complex deformations (mouth in the third row).

Video panels: Ground Truth, Ours, FlashAvatar.

Cross-Reenactment

HyperGaussians preserve fine details in the teeth and the overall shape of the subject.

Video panels: Source, Ours, FlashAvatar.

Uncertainty

HyperGaussians exhibit an emergent property that arises naturally during training. Their conditional covariance matrices indicate the variance of each Gaussian across the different expressions of the training subject and can be intuitively interpreted as uncertainty. The uncertainty estimates agree with the regions one would intuitively consider difficult. This agreement stems from the inductive bias of our formulation and does not require explicit supervision.
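
One way to turn conditional covariances into a per-Gaussian scalar is sketched below. This page does not state how the uncertainty map is computed, so the choices here are assumptions made for illustration: the trace as the summary statistic, min-max normalisation, and the hypothetical name `uncertainty_scores`. The input is the attribute block of each primitive's precision matrix, whose inverse is the conditional covariance of a Gaussian.

```python
import numpy as np


def uncertainty_scores(prec_aa):
    """Map per-primitive precision blocks (N, m, m) to scores in [0, 1].

    The conditional covariance of each primitive is the inverse of its
    attribute precision block; its trace (total variance) serves as the score.
    """
    cond_cov = np.linalg.inv(prec_aa)
    score = np.trace(cond_cov, axis1=-2, axis2=-1)
    lo, hi = score.min(), score.max()
    if hi <= lo:
        # All primitives equally uncertain: return a flat map.
        return np.zeros_like(score)
    return (score - lo) / (hi - lo)
```

Rendering these scores as a colour per Gaussian, in place of RGB, would produce an image comparable in spirit to the uncertainty map, under the assumptions stated above.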

Panels: Original RGB, Uncertainty Map.

If you find this work useful, please consider citing:

@article{Serifi2025HyperGaussians,
  author={Gent Serifi and Marcel C. Bühler},
  title={HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars},
  journal={arXiv},
  volume={2507.02803},
  year={2025}
}