# Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis | ICLR 2024 Spotlight

[![arXiv](https://img.shields.io/badge/arXiv-Paper-%3CCOLOR%3E.svg)](https://arxiv.org/abs/2401.08503) | [![GitHub Stars](https://img.shields.io/github/stars/yerfor/Real3DPortrait)](https://github.com/yerfor/Real3DPortrait) | [中文文档](./README-zh.md)

This is the official repo of Real3D-Portrait, with a PyTorch implementation, for one-shot, realistic talking portrait video synthesis. You can visit our [Demo Page](https://real3dportrait.github.io/) to watch demo videos, and read our [Paper](https://arxiv.org/pdf/2401.08503.pdf) for technical details.
# Quick Start!

## Environment Installation

Please refer to the [Installation Guide](docs/prepare_env/install_guide.md) to prepare a Conda environment named `real3dportrait`.
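If you just want a sense of what the guide boils down to, a minimal sketch follows; the Python version here is an assumption, and the Installation Guide remains the authoritative reference:

```bash
# Create and activate the conda environment expected by the inference scripts.
# (Python version is an assumption; see docs/prepare_env/install_guide.md
#  for the authoritative steps and dependency installation.)
conda create -n real3dportrait python=3.9
conda activate real3dportrait
```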
## Download Pre-trained & Third-Party Models

### 3DMM BFM Model

Download the 3DMM BFM Model from [Google Drive](https://drive.google.com/drive/folders/1o4t5YIw7w4cMUN4bgU9nPf6IyWVG1bEk?usp=sharing) or [BaiduYun Disk](https://pan.baidu.com/s/1aqv1z_qZ23Vp2VP4uxxblQ?pwd=m9q5) (password: m9q5). Put all the files in `deep_3drecon/BFM`; the file structure should look like this:

```
deep_3drecon/BFM/
├── 01_MorphableModel.mat
├── BFM_exp_idx.mat
├── BFM_front_idx.mat
├── BFM_model_front.mat
├── Exp_Pca.bin
├── facemodel_info.mat
├── index_mp468_from_mesh35709.npy
├── mediapipe_in_bfm53201.npy
└── std_exp.txt
```
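Optionally, you can sanity-check the download with a small shell loop (a convenience sketch, not part of the repo) that reports any missing file:

```bash
# Report any BFM asset that is missing from deep_3drecon/BFM/
for f in 01_MorphableModel.mat BFM_exp_idx.mat BFM_front_idx.mat \
         BFM_model_front.mat Exp_Pca.bin facemodel_info.mat \
         index_mp468_from_mesh35709.npy mediapipe_in_bfm53201.npy std_exp.txt; do
  [ -f "deep_3drecon/BFM/$f" ] || echo "missing: deep_3drecon/BFM/$f"
done
```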
### Pre-trained Real3D-Portrait

Download the pre-trained Real3D-Portrait checkpoints from [Google Drive](https://drive.google.com/drive/folders/1MAveJf7RvJ-Opg1f5qhLdoRoC_Gc6nD9?usp=sharing) or [BaiduYun Disk](https://pan.baidu.com/s/1Mjmbn0UtA1Zm9owZ7zWNgQ?pwd=6x4f) (password: 6x4f). Put the zip files in `checkpoints` and unzip them; the file structure should look like this:

```
checkpoints/
├── 240126_real3dportrait_orig
│   ├── audio2secc_vae
│   │   ├── config.yaml
│   │   └── model_ckpt_steps_400000.ckpt
│   └── secc2plane_torso_orig
│       ├── config.yaml
│       └── model_ckpt_steps_100000.ckpt
└── pretrained_ckpts
    └── mit_b0.pth
```
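For example, assuming the downloaded archives are named after the top-level folders above (the archive names are an assumption; adjust them to whatever you actually downloaded):

```bash
# Unzip the pre-trained checkpoints in place
cd checkpoints
unzip 240126_real3dportrait_orig.zip
unzip pretrained_ckpts.zip
cd ..
```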
## Inference

Currently, we provide a **CLI** and a **Gradio WebUI** for inference; a Google Colab notebook will be provided in the future. We support both audio-driven and video-driven modes:

- For audio-driven synthesis, prepare at least a `source image` and a `driving audio`
- For video-driven synthesis, prepare at least a `source image` and a `driving expression video`

### Gradio WebUI

Run the Gradio WebUI demo, upload your resources in the web page, and click the `Generate` button to run inference:

```bash
python inference/app_real3dportrait.py
```

### CLI Inference

First, switch to the project folder and activate the conda environment:

```bash
cd <Real3DPortrait_root>
conda activate real3dportrait
export PYTHONPATH=./
```

For audio-driven synthesis, provide the source image and the driving audio:

```bash
python inference/real3d_infer.py \
    --src_img <src_img_path> \
    --drv_aud <drv_aud_path> \
    --drv_pose <drv_pose_path> \
    --bg_img <bg_img_path> \
    --out_name <out_name>
```

For video-driven synthesis, provide the source image and the driving expression video (passed via the `--drv_aud` parameter):

```bash
python inference/real3d_infer.py \
    --src_img <src_img_path> \
    --drv_aud <drv_video_path> \
    --drv_pose <drv_pose_path> \
    --bg_img <bg_img_path> \
    --out_name <out_name>
```

Some optional parameters:

- `--drv_pose` provides pose-motion information; defaults to a static pose
- `--bg_img` provides background information; defaults to an image extracted from the source
- `--mouth_amp` sets the mouth amplitude; a higher value leads to a wider mouth opening
- `--map_to_init_pose` when set to `True`, the initial pose is mapped to the source pose, and all other poses are transformed accordingly
- `--temperature` is the sampling temperature of audio2motion; higher values give more diverse results at the expense of lower accuracy
- `--out_name` when not assigned, the results are stored at `infer_out/tmp/`
- `--out_mode` when `final`, outputs only the final result; when `concat_debug`, also outputs visualizations of several intermediate steps
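As a hypothetical combination of the optional flags above (the `--mouth_amp` and `--temperature` values are illustrative assumptions, not recommended defaults), a run with exaggerated mouth motion and more diverse audio2motion sampling might look like:

```bash
# Hypothetical flag values; tune --mouth_amp / --temperature for your material
python inference/real3d_infer.py \
    --src_img data/raw/examples/Macron.png \
    --drv_aud data/raw/examples/Obama_5s.wav \
    --drv_pose data/raw/examples/May_5s.mp4 \
    --mouth_amp 0.45 \
    --temperature 0.3 \
    --out_name infer_out/macron_demo.mp4 \
    --out_mode final
```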
A full commandline example with the bundled assets:

```bash
python inference/real3d_infer.py \
    --src_img data/raw/examples/Macron.png \
    --drv_aud data/raw/examples/Obama_5s.wav \
    --drv_pose data/raw/examples/May_5s.mp4 \
    --bg_img data/raw/examples/bg.png \
    --out_name output.mp4 \
    --out_mode concat_debug
```

# ToDo

- [x] **Release pre-trained weights of Real3D-Portrait.**
- [x] **Release inference code of Real3D-Portrait.**
- [x] **Release Gradio demo of Real3D-Portrait.**
- [ ] **Release Google Colab of Real3D-Portrait.**
- [ ] **Release training code of Real3D-Portrait.**

# Citation

If you find this repo helpful to your work, please consider citing us:

```
@article{ye2024real3d,
  title={Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis},
  author={Ye, Zhenhui and Zhong, Tianyun and Ren, Yi and Yang, Jiaqi and Li, Weichuang and Huang, Jiawei and Jiang, Ziyue and He, Jinzheng and Huang, Rongjie and Liu, Jinglin and others},
  journal={arXiv preprint arXiv:2401.08503},
  year={2024}
}
@article{ye2023geneface++,
  title={GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation},
  author={Ye, Zhenhui and He, Jinzheng and Jiang, Ziyue and Huang, Rongjie and Huang, Jiawei and Liu, Jinglin and Ren, Yi and Yin, Xiang and Ma, Zejun and Zhao, Zhou},
  journal={arXiv preprint arXiv:2305.00787},
  year={2023}
}
@article{ye2023geneface,
  title={GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis},
  author={Ye, Zhenhui and Jiang, Ziyue and Ren, Yi and Liu, Jinglin and He, Jinzheng and Zhao, Zhou},
  journal={arXiv preprint arXiv:2301.13430},
  year={2023}
}
```