BAAI/nova-d48w1024-osp480

video-generation

PhyscalX committed 5 days ago

Commit 1e66007 • 1 Parent(s): 9487909

Update model type

Files changed (1)

README.md +2 -2

@@ -10,11 +10,11 @@ tags:
 ## Model Details
 - **Developed by:** BAAI
-- **Model type:** Masked Autoregressive Text-to-Video Generation Model
+- **Model type:** Non-quantized Autoregressive Text-to-Video Generation Model
 - **Model size:** 645M
 - **Model precision:** torch.float16 (FP16)
 - **Model resolution:** 768x480
-- **Model Description:** This is a model that can be used to generate and modify videos based on text prompts. It is a [Masked Autoregressive (MAR)](https://arxiv.org/abs/2406.11838) diffusion model that uses a pretrained text encoder ([Phi-2](https://huggingface.co/microsoft/phi-2)) and one VAE video tokenizer ([OpenSoraPlanV1.2-VAE](https://huggingface.co/LanguageBind/Open-Sora-Plan-v1.2.0)).
+- **Model Description:** This is a model that can be used to generate and modify videos based on text prompts. It is a [Non-quantized Video Autoregressive (NOVA)](https://arxiv.org/abs/2412.14169) diffusion model that uses a pretrained text encoder ([Phi-2](https://huggingface.co/microsoft/phi-2)) and one VAE video tokenizer ([OpenSoraPlanV1.2-VAE](https://huggingface.co/LanguageBind/Open-Sora-Plan-v1.2.0)).
 - **Model License:** [Apache 2.0 License](LICENSE)
 - **Resources for more information:** [GitHub Repository](https://github.com/baaivision/NOVA).
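
As a rough sizing aid, the model size and precision fields in the card imply a weight footprint that can be estimated directly. The sketch below is a back-of-the-envelope calculation, not something taken from the NOVA repository: it assumes "645M" means 645 × 10^6 parameters, counts weights only (no activations), and does not assume the figure includes the Phi-2 text encoder or the VAE tokenizer. The helper name `weight_footprint_bytes` is hypothetical.

```python
# Bytes per element for common floating-point dtypes (standard sizes).
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_footprint_bytes(num_params: int, dtype: str = "float16") -> int:
    """Estimate the memory needed to hold the weights alone (hypothetical helper)."""
    return num_params * BYTES_PER_ELEMENT[dtype]


# Assumption: "Model size: 645M" is read as 645 million parameters.
nova_params = 645_000_000

fp16_bytes = weight_footprint_bytes(nova_params, "float16")
print(f"FP16 weights: {fp16_bytes / 1e9:.2f} GB")  # FP16 weights: 1.29 GB
```

Actual memory use at inference time will be higher, since this estimate leaves out activations and the separately loaded text encoder and VAE.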