Quantized GGUF version of Bernini-R for ComfyUI.

Original model link: https://huggingface.co/ByteDance/Bernini-R

Watch us on YouTube: @VantageWithAI

Bernini

Latent Semantic Planning for Video Diffusion

Chenchen Liu*, Junyi Chen*, Lei Li*, Lu Chi*,§, Mingzhen Sun*, Zhuoying Li*, Yi Fu, Ruoyu Guo, Yiheng Wu, Ge Bai, Zehuan Yuan✉

* Equal contribution  ✉ Corresponding author  § Project lead


🎉 News

✨ Highlights

Bernini is a unified framework for video generation and editing that combines an MLLM-based semantic planner with a DiT-based renderer.

On video editing, Bernini ranks in the top tier alongside leading closed-source commercial models. The leaderboard below comes from our self-built arena platform, where human annotators vote blind on paired edits; the votes are aggregated into Bradley-Terry scores and a pairwise win-rate matrix.

Video editing arena: Bradley-Terry leaderboard and pairwise win-rate matrix
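For readers unfamiliar with this kind of aggregation, the following is a minimal sketch of how pairwise votes can be turned into a win-rate matrix and Bradley-Terry strengths. It is not the arena's actual code: the source does not say how the scores are fitted, so a standard minorization-maximization (MM) iteration is assumed here, and the function names and vote counts are purely illustrative.

```python
from typing import List, Optional


def win_rate_matrix(wins: List[List[int]]) -> List[List[Optional[float]]]:
    """wins[i][j] = number of votes where model i beat model j.

    Returns rates[i][j] = share of i-vs-j votes won by i,
    or None on the diagonal and for pairs that were never compared.
    """
    n = len(wins)
    rates: List[List[Optional[float]]] = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                rates[i][j] = wins[i][j] / total
    return rates


def bradley_terry(wins: List[List[int]], iters: int = 500, tol: float = 1e-12) -> List[float]:
    """Fit Bradley-Terry strengths p (normalized to sum to 1) with MM updates.

    Under the model, P(i beats j) = p[i] / (p[i] + p[j]).
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        scale = sum(new_p)
        new_p = [x / scale for x in new_p]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


# Made-up vote counts for three hypothetical models (not real arena data).
example_wins = [
    [0, 8, 9],
    [2, 0, 6],
    [1, 4, 0],
]
example_scores = bradley_terry(example_wins)
example_rates = win_rate_matrix(example_wins)
```

Leaderboards often report a rescaled logarithm of these strengths rather than the raw values; the scale used by the arena is not stated in the source.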

📑 Citation

If you use Bernini in your research, please cite:

@article{bernini,
  title   = {Bernini: Latent Semantic Planning for Video Diffusion},
  author  = {Chenchen Liu and Junyi Chen and Lei Li and Lu Chi and Mingzhen Sun and Zhuoying Li and Yi Fu and Ruoyu Guo and Yiheng Wu and Ge Bai and Zehuan Yuan},
  journal = {arXiv preprint arXiv:2605.22344},
  year    = {2026}
}

🙏 Acknowledgements

Bernini builds on several outstanding open-source projects:

We thank the authors and communities of these projects for their contributions.

📄 License

Apache License 2.0. See LICENSE.

Format: GGUF
Model size: 14B params
Architecture: wan

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
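To fetch one of the quantization levels listed above, a sketch along these lines may help. It assumes the `huggingface_hub` package is installed and that the repository's file names contain a quantization tag such as `Q4` or `Q8`; the actual file names are not given here, so `pick_gguf` and `download_quant` are hypothetical helpers that list the repository and match on a substring rather than hard-coding a name.

```python
from typing import List, Optional

# Repository name as it appears on this page.
REPO_ID = "vantagewithai/Bernini-R-GGUF-ComfyUI"


def pick_gguf(files: List[str], quant_tag: str) -> Optional[str]:
    """Return the first .gguf filename (sorted order) containing quant_tag.

    The match is case-insensitive. Returns None when nothing matches.
    """
    tag = quant_tag.lower()
    for name in sorted(files):
        lower = name.lower()
        if lower.endswith(".gguf") and tag in lower:
            return name
    return None


def download_quant(quant_tag: str, local_dir: str) -> str:
    """List the repo, pick a matching .gguf file, and download it to local_dir."""
    # Imported lazily so pick_gguf works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    files = list_repo_files(REPO_ID)
    filename = pick_gguf(files, quant_tag)
    if filename is None:
        raise FileNotFoundError(f"No .gguf file matching {quant_tag!r} in {REPO_ID}")
    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)
```

A call such as `download_quant("Q4", "path/to/ComfyUI/models/unet")` would then place the file locally. The target directory is a placeholder: the right folder depends on which GGUF loader node your ComfyUI setup uses.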
