UPDATE

Added "cond-image-leakage" (CIL) versions of the 1024 and 512 models from https://huggingface.co/GraceZhao/DynamiCrafter-CIL-1024

https://github.com/thu-ml/cond-image-leakage

BF16 safetensors versions of the DynamiCrafter models by Doubiiu: https://huggingface.co/Doubiiu
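To confirm that a downloaded checkpoint really stores its weights in BF16, the safetensors header can be read without loading any tensors. The following is a minimal sketch using only the standard library; it relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names `read_safetensors_header` and `dtype_counts` are illustrative helpers, not part of any library.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensors."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def dtype_counts(header):
    """Count tensors per dtype string, for example {"BF16": 1234}."""
    return Counter(entry["dtype"] for entry in header.values())
```

Calling `dtype_counts(read_safetensors_header(path))` on a local checkpoint path should report mostly or only `BF16` entries for the files in this repository, assuming they were converted as described above.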

This is a video diffusion model that takes one or two still images as conditioning,
together with a text prompt describing the dynamics, and generates a looping video or a frame interpolation from them.

Model Details

Model Description

DynamiCrafter, a (Text-)Image-to-Video/Image Animation approach, aims to generate
short video clips (~2 seconds) from a conditioning image and text prompt.

This model was trained to generate 16 video frames at a resolution of 320x512
given a context frame of the same resolution.
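Because the context frame must match the training resolution, an arbitrary input image has to be resized and cropped first. The sketch below only computes the geometry for a "scale to cover, then center-crop" strategy. It assumes 320x512 means height x width, and it is not necessarily the preprocessing used in the official repository; `resize_and_center_crop_box` is a hypothetical helper.

```python
def resize_and_center_crop_box(src_w, src_h, dst_w=512, dst_h=320):
    """Return ((resized_w, resized_h), (left, top, right, bottom)).

    Assumption: the target is 512 wide by 320 high.
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")
    # Scale so the resized image fully covers the target in both dimensions.
    scale = max(dst_w / src_w, dst_h / src_h)
    new_w = max(dst_w, round(src_w * scale))
    new_h = max(dst_h, round(src_h * scale))
    # Center the crop window inside the resized image.
    left = (new_w - dst_w) // 2
    top = (new_h - dst_h) // 2
    return (new_w, new_h), (left, top, left + dst_w, top + dst_h)
```

The returned size and box can be passed to any image library's resize and crop functions; a square 1024x1024 input, for instance, is scaled to 512x512 and loses 96 rows at the top and bottom.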

Developed by: CUHK & Tencent AI Lab
Funded by: CUHK & Tencent AI Lab
Model type: Generative frame interpolation and looping video generation
Finetuned from model: VideoCrafter1 (320x512)

Model Sources

For research purposes, we recommend our GitHub repository (https://github.com/Doubiiu/DynamiCrafter),
which includes the detailed implementation.

Repository: https://github.com/Doubiiu/DynamiCrafter
Paper: https://arxiv.org/abs/2310.12190

Uses

Direct Use

We develop this repository for RESEARCH purposes, so it can only be used for personal/research/non-commercial purposes.

Limitations

The generated videos are relatively short (about 2 seconds: 16 frames at 8 FPS).
The model cannot render legible text.
Faces and people in general may not be generated properly.
The autoencoding part of the model is lossy, resulting in slight flickering artifacts.
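The clip length in the first limitation follows directly from the frame count and the playback rate. As a small worked example, using the 16 frames and 8 FPS stated in this card (the helper name `clip_duration_seconds` is illustrative):

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback duration of a clip with num_frames frames shown at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# 16 generated frames played back at 8 frames per second.
print(clip_duration_seconds(16, 8))  # 2.0
```

Playing the same 16 frames at a higher frame rate shortens the clip proportionally; it does not add motion content.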

How to Get Started with the Model

Check out https://github.com/Doubiiu/DynamiCrafter