hpcai-tech/OpenSora-STDiT-v1-HQ-16x256x256

Open-Sora: Democratizing Efficient Video Production for All

We present Open-Sora, an initiative dedicated to efficiently producing high-quality video and making the model, tools, and content accessible to all. By embracing open-source principles, Open-Sora not only democratizes access to advanced video generation techniques, but also offers a streamlined, user-friendly platform that simplifies the complexities of video production. With Open-Sora, we aim to inspire innovation, creativity, and inclusivity in the realm of content creation.

Open-Sora is still at an early stage and under active development.

More details can be found on the Open-Sora GitHub repository.

📰 News

[2024.03.18] 🔥 We release Open-Sora 1.0, a fully open-source project for video generation. Open-Sora 1.0 supports a full pipeline of video data preprocessing, training with ColossalAI acceleration, inference, and more. The provided checkpoints can produce 2s 512x512 videos after only 3 days of training. [blog]
[2024.03.04] Open-Sora provides training with 46% cost reduction. [blog]

🛠 Usage

You can try video generation with this model in a Gradio application.

# git clone Open-Sora
git clone https://github.com/hpcaitech/Open-Sora.git
cd Open-Sora

# launch gradio
python scripts/demo.py --model-type v1-HQ-16x256x256

To use this STDiT model in code, load it with transformers:

from transformers import AutoModel

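# STDiT is not a built-in transformers architecture, so loading it likely
# requires trust_remote_code=True, which executes code from the model repo.
stdit = AutoModel.from_pretrained(
    "hpcai-tech/OpenSora-STDiT-v1-HQ-16x256x256", trust_remote_code=True
)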

Note that this model alone cannot generate video; it must be used together with a VAE model and a text encoder model, as in the demo.
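
When wiring the model to a VAE and text encoder, it can help to read the clip size from the checkpoint name instead of hard-coding it. The sketch below assumes the trailing 16x256x256 in the name means frames x height x width; that reading is inferred from the naming pattern and is not stated on this page. The helper parse_clip_spec and the ClipSpec type are hypothetical names used only for illustration and are not part of Open-Sora or transformers.

```python
import re
from typing import NamedTuple


class ClipSpec(NamedTuple):
    """Clip dimensions encoded in a checkpoint name (assumed order: frames, height, width)."""
    frames: int
    height: int
    width: int


def parse_clip_spec(name: str) -> ClipSpec:
    """Extract a trailing '<frames>x<height>x<width>' triple from a checkpoint name.

    Hypothetical helper: the FxHxW interpretation is an assumption based on
    the naming pattern, not a documented Open-Sora convention.
    """
    match = re.search(r"(\d+)x(\d+)x(\d+)$", name)
    if match is None:
        raise ValueError(f"no FxHxW suffix found in {name!r}")
    frames, height, width = (int(group) for group in match.groups())
    return ClipSpec(frames, height, width)


# Works for both the Hub repo id and the --model-type value used by the demo.
spec = parse_clip_spec("hpcai-tech/OpenSora-STDiT-v1-HQ-16x256x256")
print(spec.frames, spec.height, spec.width)
```

The same call accepts the shorter v1-HQ-16x256x256 string passed to the demo script, so a single helper can serve both entry points.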