pipeline_tag: text-to-video
license: other
license_name: tencent-hunyuan-community
license_link: LICENSE
<p align="center">
<img src="assets/logo.jpg" height=30>
</p>
# FastHunyuan Model Card
## Model Details
FastHunyuan is an accelerated [HunyuanVideo](https://huggingface.co/tencent/HunyuanVideo) model. It samples high-quality videos in 6 diffusion steps, roughly an 8X speedup over the original HunyuanVideo, which uses 50 steps.
- **Developed by**: [Hao AI Lab](https://hao-ai-lab.github.io/)
- **License**: tencent-hunyuan-community
- **Distilled from**: [HunyuanVideo](https://huggingface.co/tencent/HunyuanVideo)
- **GitHub Repository**: https://github.com/hao-ai-lab/FastVideo
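
The roughly 8X figure in Model Details follows from the step counts alone. The sketch below is a back-of-the-envelope estimate, assuming sampling cost scales linearly with the number of diffusion steps and ignoring fixed overheads such as text encoding and VAE decoding; `step_speedup` is a hypothetical helper, not part of FastVideo.

```python
def step_speedup(baseline_steps: int, distilled_steps: int) -> float:
    """Estimate speedup assuming cost is proportional to the step count."""
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / distilled_steps


# 50 steps for the original model versus 6 for FastHunyuan.
print(round(step_speedup(50, 6), 2))  # 8.33
```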
## Usage
- Clone the [FastVideo](https://github.com/hao-ai-lab/FastVideo) repository and follow the inference instructions in its README.
- Alternatively, run inference with FastHunyuan using the official [HunyuanVideo repository](https://github.com/Tencent/HunyuanVideo) by **setting the shift to 17 and steps to 6**.
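
To give a feel for what a shift of 17 with 6 steps implies, here is a minimal sketch of a shifted noise schedule. It assumes the shift is applied with the SD3-style flow-matching formula `s' = shift * s / (1 + (shift - 1) * s)` on a uniform grid; whether the HunyuanVideo sampler uses exactly this form is an assumption, and `shifted_sigmas` is a hypothetical helper written for illustration only.

```python
def shifted_sigmas(num_steps: int, shift: float) -> list[float]:
    """Return num_steps + 1 noise levels from 1.0 down to 0.0 after shifting."""
    # Uniform grid from 1 (pure noise) to 0 (clean sample).
    uniform = [1.0 - i / num_steps for i in range(num_steps + 1)]
    # Assumed SD3-style time shift; larger shift keeps levels closer to 1.
    return [shift * s / (1.0 + (shift - 1.0) * s) for s in uniform]


sigmas = shifted_sigmas(num_steps=6, shift=17.0)
print([round(s, 3) for s in sigmas])
```

Under this assumption, the first six of the seven noise levels stay above 0.77, so most of the sampling budget goes to the high-noise part of the trajectory and the final step covers the rest.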
## Training details
FastHunyuan is trained with consistency distillation on the [MixKit](https://huggingface.co/datasets/LanguageBind/Open-Sora-Plan-v1.1.0/tree/main) dataset with the following hyperparameters:
- Batch size: 16
- Resolution: 720x1280
- Number of frames: 125
- Train steps: 320
- GPUs: 32
- LR: 1e-6
- Loss: Huber
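
For readers unfamiliar with the loss listed above, the following is a minimal sketch of the standard Huber loss in plain Python. The card does not state the threshold or the exact variant used in training, so `delta=1.0` and the mean reduction are assumptions, and `huber_loss` is a hypothetical helper rather than the FastVideo implementation.

```python
def huber_loss(pred: list[float], target: list[float], delta: float = 1.0) -> float:
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal in length")
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err  # quadratic region
        else:
            total += delta * (err - 0.5 * delta)  # linear region
    return total / len(pred)


print(huber_loss([0.0, 0.0], [0.5, 3.0]))
```

Compared with a squared error, the linear region limits the influence of large residuals, which is the usual motivation for choosing this loss.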
## Evaluation
We provide a qualitative comparison between FastHunyuan and the original HunyuanVideo, both run with 6-step inference:
| FastHunyuan 6 step | Hunyuan 6 step |
| --- | --- |
| ![FastHunyuan 6 step](assets/distilled/1.gif) | ![Hunyuan 6 step](assets/undistilled/1.gif) |
| ![FastHunyuan 6 step](assets/distilled/2.gif) | ![Hunyuan 6 step](assets/undistilled/2.gif) |
| ![FastHunyuan 6 step](assets/distilled/3.gif) | ![Hunyuan 6 step](assets/undistilled/3.gif) |
| ![FastHunyuan 6 step](assets/distilled/4.gif) | ![Hunyuan 6 step](assets/undistilled/4.gif) |