Gaussian Head Avatar: High Quality Head Avatar Generator

Written By: kinza.sabir
Last Updated On: February 3, 2024

Discover the art of self-expression in pixels with avatar generation wizardry! With Gaussian Head Avatar, you can design a visual identity that mirrors your expressions and essence, sparking curiosity and connection in the online world. The researchers from NNKosmos presented this model.

Gaussian Head Avatar is used to create a head avatar model. In this process, neutral 3D Gaussians are fine-tuned together with a deformation field based on a multilayer perceptron (MLP), so that the model accurately captures intricate facial expressions and details.

How Gaussian Head Avatar Works

Gaussian Head Avatar takes a video as input and generates high-quality multi-view videos with controllable motion and expressions.

We all love to connect and interact with the world through social media such as Facebook and Instagram. This model can help your content attract more comments and reactions. Wondering how? It lets you generate a creative 3D avatar that reflects your personality, interests, facial expressions, or even your mood.

Deep neural networks enable current techniques to achieve rapid reconstruction with higher geometric precision. 3D head avatars have been generated through various methods, such as Neural Radiance Fields (NeRF), generative models, and reconstruction from monocular videos. These methods showed impressive results even without precise geometry.

Recent methodologies bypass the reconstruction and tracking stages, opting to directly train top-tier NeRF-based head avatars. These studies have confirmed the adaptability of NeRF for both dense and sparse views, significantly simplifying the process of reconstructing head avatars.

Sneak Peek of Gaussian Head Avatar

Gaussian Head Avatar is a novel representation for head avatars. The approach uses controllable, dynamic 3D Gaussians to model expressive human heads. Its primary outcome is ultra-high-fidelity synthesized images at 2K resolution, that is, highly detailed and realistic renderings of a person's head avatar.
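To make "3D Gaussians" a little more concrete, here is a minimal sketch of what a single Gaussian might store. The article does not list the per-Gaussian parameters, so the fields below are an assumption borrowed from the standard 3D Gaussian splatting parameterization (center, rotation quaternion, per-axis scale, opacity, color), where the covariance is built as R · diag(s²) · Rᵀ. The names `Gaussian3D`, `quat_to_matrix`, and `covariance` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Gaussian3D:
    # Assumed fields, following the usual 3D Gaussian splatting layout.
    position: tuple  # (x, y, z) center of the Gaussian
    rotation: tuple  # quaternion (w, x, y, z), normalized before use
    scale: tuple     # per-axis standard deviations (sx, sy, sz)
    opacity: float   # blending weight in [0, 1]
    color: tuple     # RGB in [0, 1]


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance(g):
    """Return the 3x3 covariance Sigma = R * diag(scale^2) * R^T."""
    R = quat_to_matrix(g.rotation)
    s2 = [s * s for s in g.scale]
    return [
        [sum(R[i][k] * s2[k] * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
```

Storing a rotation and a scale instead of a raw covariance matrix keeps the covariance symmetric and positive semi-definite however the parameters are updated during training, which is why this layout is common in Gaussian-based renderers.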

To capture intricate, dynamic high-frequency details, the model applies a fully learned deformation field to the 3D head Gaussians. This approach accurately represents highly intricate and exaggerated facial expressions, including their dynamic, high-frequency details.
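The idea of a deformation field applied to neutral Gaussians can be sketched in a few lines. This is a simplified illustration, not the authors' architecture: it assumes a tiny two-layer MLP with random, untrained weights that takes a Gaussian's neutral position plus an expression code and returns a position offset. The layer sizes, the function names (`make_mlp`, `mlp_forward`, `deform`), and the choice to deform only positions are all assumptions made for the example.

```python
import random


def make_mlp(in_dim, hidden_dim, out_dim, seed=0):
    """Build a two-layer MLP with small random weights (untrained, for illustration)."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        return weights, biases

    return [layer(in_dim, hidden_dim), layer(hidden_dim, out_dim)]


def mlp_forward(mlp, x):
    """Run x through linear -> ReLU -> linear."""
    (w1, b1), (w2, b2) = mlp
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(w2, b2)]


def deform(neutral_positions, expression, mlp):
    """Move each neutral Gaussian center by an expression-dependent offset."""
    deformed = []
    for p in neutral_positions:
        offset = mlp_forward(mlp, list(p) + list(expression))
        deformed.append(tuple(pi + oi for pi, oi in zip(p, offset)))
    return deformed


# Two neutral Gaussian centers and a 2-dimensional expression code (both made up).
neutral = [(0.0, 0.1, 0.2), (0.3, -0.1, 0.0)]
expression = (0.8, 0.2)
mlp = make_mlp(in_dim=3 + 2, hidden_dim=8, out_dim=3)
print(deform(neutral, expression, mlp))
```

In a trained system the MLP weights would be optimized so that the deformed Gaussians reproduce the captured video frames; here they are random, so the offsets only demonstrate the data flow.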

An effective initialization strategy was designed that utilizes implicit representations to set the initial states for geometry and deformation. This results in a training process for the Gaussian Head Avatar that is both efficient and robust, ensuring smooth convergence.

Dataset and Evaluation

Twelve sets of data were used for the experiments: 10 from NeRSemble and 2 multi-view video sets from HAvatar. Each of the 10 NeRSemble sets contains 2500 to 3000 frames, captured at 2K resolution by 16 cameras distributed across about 120 degrees in front of the subject. For each sample, the sequences marked "FREE" serve as the evaluation data and the rest as the training data. Each of the 2 HAvatar sets contains 3000 frames, captured simultaneously at 4K resolution by 8 cameras distributed across about 120 degrees in front of the subject. The face area was cropped and resized to 2K resolution.
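The train/evaluation split described above is easy to express in code. The sketch below holds out any sequence whose name contains "FREE" and also works out the rough number of images per subject. The sequence names are hypothetical placeholders, and the image counts assume the stated frame counts are per camera, which the description does not say outright.

```python
def split_sequences(sequence_names, eval_marker="FREE"):
    """Hold out sequences whose name contains eval_marker; train on the rest."""
    train, evaluation = [], []
    for name in sequence_names:
        if eval_marker in name:
            evaluation.append(name)
        else:
            train.append(name)
    return train, evaluation


def total_images(frames, cameras):
    """Number of images if every camera records every frame (an assumption)."""
    return frames * cameras


# Hypothetical sequence names for one subject.
names = ["EXP-1", "EXP-2", "EMO-1", "FREE"]
train, evaluation = split_sequences(names)
print(train)       # ['EXP-1', 'EXP-2', 'EMO-1']
print(evaluation)  # ['FREE']

# NeRSemble: 2500 to 3000 frames from 16 cameras.
print(total_images(2500, 16), total_images(3000, 16))  # 40000 48000
# HAvatar: 3000 frames from 8 cameras.
print(total_images(3000, 8))  # 24000
```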

Gaussian Head Avatar makes it possible to generate synthetic portrait videos, which could be misused to spread misinformation, sway public opinion, and undermine trust in media sources.

Conclusion

Gaussian Head Avatar is a pioneering method for reconstructing head avatars. The technique uses dynamic 3D Gaussians controlled by a fully learned expression deformation field. The experiments show that this model can generate ultra-high-fidelity images, capturing intricate and exaggerated expressions.