# AnimateDiff: training and inference setup

## Setups for Inference

### Prepare Environment

***We updated our inference code with xformers and a sequential decoding trick. AnimateDiff now takes only ~12GB VRAM for inference and runs on a single RTX 3090!***

```
git clone https://github.com/guoyww/AnimateDiff.git
cd AnimateDiff

conda env create -f environment.yaml
conda activate animatediff
```

### Download Base T2I & Motion Module Checkpoints

We provide two versions of our Motion Module: one trained on stable-diffusion-v1-4 and one finetuned on stable-diffusion-v1-5. We recommend trying both of them for the best results.

```
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/

bash download_bashscripts/0-MotionModule.sh
```

You may also directly download the motion module checkpoints from [Google Drive](https://drive.google.com/drive/folders/1EqLC65eR1-W-sGD0Im7fkED6c8GkiNFI?usp=sharing) / [HuggingFace](https://huggingface.co/guoyww/animatediff) / [CivitAI](https://civitai.com/models/108836/animatediff-motion-modules), then put them in the `models/Motion_Module/` folder.

### Prepare Personalized T2I

Here we provide inference configs for 8 demo personalized T2I models from CivitAI. You may run the following bash scripts to download these checkpoints.

```
bash download_bashscripts/1-ToonYou.sh
bash download_bashscripts/2-Lyriel.sh
bash download_bashscripts/3-RcnzCartoon.sh
bash download_bashscripts/4-MajicMix.sh
bash download_bashscripts/5-RealisticVision.sh
bash download_bashscripts/6-Tusun.sh
bash download_bashscripts/7-FilmVelvia.sh
bash download_bashscripts/8-GhibliBackground.sh
```

### Inference

After downloading the personalized T2I checkpoints above, run the following commands to generate animations. The results are saved automatically to the `samples/` folder.

```
python -m scripts.animate --config configs/prompts/1-ToonYou.yaml
python -m scripts.animate --config configs/prompts/2-Lyriel.yaml
python -m scripts.animate --config configs/prompts/3-RcnzCartoon.yaml
python -m scripts.animate --config configs/prompts/4-MajicMix.yaml
python -m scripts.animate --config configs/prompts/5-RealisticVision.yaml
python -m scripts.animate --config configs/prompts/6-Tusun.yaml
python -m scripts.animate --config configs/prompts/7-FilmVelvia.yaml
python -m scripts.animate --config configs/prompts/8-GhibliBackground.yaml
```

To generate animations with a new DreamBooth/LoRA model, create a new config `.yaml` file in the following format:

```
NewModel:
  inference_config: "[path to motion module config file]"

  motion_module:
    - "models/Motion_Module/mm_sd_v14.ckpt"
    - "models/Motion_Module/mm_sd_v15.ckpt"

  motion_module_lora_configs:
    - path:  "[path to MotionLoRA model]"
      alpha: 1.0
    - ...

  dreambooth_path: "[path to your DreamBooth model .safetensors file]"
  lora_model_path: "[path to your LoRA model .safetensors file, leave it empty string if not needed]"

  steps:          25
  guidance_scale: 7.5

  prompt:
    - "[positive prompt]"

  n_prompt:
    - "[negative prompt]"
```

Then run the following command:

```
python -m scripts.animate --config [path to the config file]
```

## Steps for Training

### Dataset

Before training, download the video files and the `.csv` annotations of [WebVid10M](https://maxbain.com/webvid-dataset/) to the local machine. Note that our example training script requires all the videos to be saved in a single folder. You may change this by modifying `animatediff/data/dataset.py`.
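For reference, below is a minimal sketch of what a single-folder video dataset could look like. It is not the actual `animatediff/data/dataset.py`; the CSV column names (`videoid`, `name`) and the use of `decord` for decoding are assumptions, so adapt them to your copy of the annotations.

```python
# Sketch of a single-folder WebVid-style dataset (hypothetical column names).
import csv
import os
import random

import torch
from decord import VideoReader  # assumed video-decoding dependency
from torch.utils.data import Dataset


class SingleFolderVideoDataset(Dataset):
    def __init__(self, csv_path, video_folder, sample_size=256, sample_n_frames=16):
        with open(csv_path, "r") as f:
            self.rows = list(csv.DictReader(f))
        self.video_folder = video_folder
        self.sample_size = sample_size
        self.sample_n_frames = sample_n_frames

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        # Assumes every clip lives directly in `video_folder` as <videoid>.mp4
        # and is at least `sample_n_frames` frames long.
        video_path = os.path.join(self.video_folder, f"{row['videoid']}.mp4")
        reader = VideoReader(video_path, width=self.sample_size, height=self.sample_size)

        # Sample a random contiguous clip of `sample_n_frames` frames.
        start = random.randint(0, max(0, len(reader) - self.sample_n_frames))
        frames = reader.get_batch(range(start, start + self.sample_n_frames)).asnumpy()

        # (T, H, W, C) uint8 -> (T, C, H, W) float in [-1, 1]
        pixel_values = torch.from_numpy(frames).permute(0, 3, 1, 2).float() / 127.5 - 1.0
        return {"pixel_values": pixel_values, "text": row["name"]}
```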
### Configuration

After dataset preparation, update the data paths below in the config `.yaml` files in the `configs/training/` folder:

```
train_data:
  csv_path:     [Replace with .csv Annotation File Path]
  video_folder: [Replace with Video Folder Path]
  sample_size:  256
```

Other training parameters (learning rate, epochs, validation settings, etc.) are also included in the config files.

### Training

To train motion modules:

```
torchrun --nnodes=1 --nproc_per_node=1 train.py --config configs/training/training.yaml
```

To finetune the UNet's image layers:

```
torchrun --nnodes=1 --nproc_per_node=1 train.py --config configs/training/image_finetune.yaml
```
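Before launching either `torchrun` command, it can save a failed run to sanity-check the paths you filled in during the Configuration step. A minimal sketch, assuming the training config is plain YAML with a top-level `train_data` block as shown above and that OmegaConf is available in the environment:

```python
# Sketch: verify dataset paths in the training config before starting training.
import os

from omegaconf import OmegaConf

config = OmegaConf.load("configs/training/training.yaml")
csv_path = config.train_data.csv_path
video_folder = config.train_data.video_folder

assert os.path.isfile(csv_path), f"annotation CSV not found: {csv_path}"
assert os.path.isdir(video_folder), f"video folder not found: {video_folder}"
print(f"OK: csv={csv_path}, videos={video_folder}, sample_size={config.train_data.sample_size}")
```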