Researchers have announced two new results that use artificial intelligence to generate three-dimensional avatars, including from text descriptions, enabling applications such as virtual try-on and avatar shape editing. The work comes from researchers at Germany's Max Planck Institute and other institutions and has been published on arXiv.
The first study proposes a method called DELTA, which creates three-dimensional avatars with separate body and clothing/hair layers. The researchers model the body and the clothing/hair with different 3D representations and learn the avatar from a monocular RGB video. This disentanglement enables applications such as virtual try-on and shape editing, where clothing and hair can easily be transferred to different body shapes.
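To make the layered-avatar idea concrete, here is a minimal conceptual sketch, not the authors' code: the class and method names are hypothetical, the body is reduced to a simple parametric placeholder, and the garment layers stand in for the separately learned clothing/hair representations described above. The point is only that, once body and garments are stored independently, try-on amounts to pairing the same garment layers with a different body shape.

```python
from dataclasses import dataclass, field

@dataclass
class BodyLayer:
    """Placeholder for an explicit parametric body model (shape/pose coefficients)."""
    shape_params: list[float]
    pose_params: list[float] = field(default_factory=list)

@dataclass
class GarmentLayer:
    """Placeholder for a clothing or hair layer learned separately from the body."""
    name: str
    appearance: dict

@dataclass
class LayeredAvatar:
    body: BodyLayer
    garments: list[GarmentLayer] = field(default_factory=list)

    def transfer_garments_to(self, new_body: BodyLayer) -> "LayeredAvatar":
        """Reattach the existing clothing/hair layers to a different body shape."""
        return LayeredAvatar(body=new_body, garments=list(self.garments))

# Example: move a captured jacket and hairstyle onto a different body shape.
source = LayeredAvatar(
    body=BodyLayer(shape_params=[0.2, -0.5, 1.1]),
    garments=[GarmentLayer("jacket", {}), GarmentLayer("hair", {})],
)
target_body = BodyLayer(shape_params=[-0.8, 0.3, 0.0])
retargeted = source.transfer_garments_to(target_body)
print([g.name for g in retargeted.garments])  # ['jacket', 'hair']
```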
The second study builds on Stable Diffusion and DELTA's hybrid 3D representation to propose a text-to-avatar method called TECA. The method generates high-quality avatars from text descriptions alone and supports fine-grained attribute editing. The system first uses Stable Diffusion to generate a face image as a reference, and then adds hair, clothing, and other components in sequence. According to the researchers, the avatars synthesized this way are of significantly higher quality than those of existing methods, and the compositional design makes it possible to transfer and edit individual attributes.
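The sequential generation process described above can be sketched in a few lines of Python. This is not the TECA implementation: every helper below is a hypothetical stub that only illustrates the order of the stages (text-to-image reference, explicit face/body fitting, then component-by-component additions).

```python
# Minimal, runnable sketch of a sequential text-to-avatar pipeline.
# All helpers are stand-ins, not a real API.

def generate_face_reference(prompt: str) -> dict:
    """Stand-in for a text-to-image model (e.g. Stable Diffusion) producing a reference face."""
    return {"kind": "face_image", "prompt": prompt}

def fit_body_model(face_image: dict) -> dict:
    """Stand-in for fitting an explicit 3D face/body model to the reference image."""
    return {"body": face_image, "components": []}

def add_component(avatar: dict, component: str, prompt: str) -> dict:
    """Stand-in for adding one text-guided component (hair, clothing, ...) as its own layer."""
    avatar["components"].append({"name": component, "prompt": prompt})
    return avatar

def text_to_avatar(prompt: str) -> dict:
    face_image = generate_face_reference(prompt)   # 1. reference face from text
    avatar = fit_body_model(face_image)            # 2. explicit 3D face/body
    for component in ["hair", "clothing"]:         # 3. add layers in sequence
        avatar = add_component(avatar, component, prompt)
    return avatar

print(text_to_avatar("a woman with short red hair wearing a denim jacket"))
```

Because each attribute lives in its own layer, editing or swapping a hairstyle or garment does not require regenerating the rest of the avatar, which is the editing capability highlighted by the researchers.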
These two studies offer new approaches to digital human generation. Using artificial intelligence to decompose a digital human into its components not only produces realistic three-dimensional avatars but also supports applications such as online virtual try-on, which could have a significant impact on fashion e-commerce, social platforms, and the metaverse.