martintomov committed on
Commit cbe5ff1 · verified · 1 Parent(s): df3f1e5

yaml config included

Files changed (1)
  1. README.md +63 -2
README.md CHANGED
@@ -25,8 +25,69 @@ This project demonstrates the fine-tuning of the **Mochi Text-to-Video** model u
 
  - **Model Base**: [genmo/mochi-1-preview](https://huggingface.co/genmo/mochi-1-preview)
  - **Fine-Tuning Dataset**: 23 short video clips of infinite zoom art style, and .txt descriptions
- - **Training Settings**: 37 frames
  - **Training Hardware**: H100 GPU
  - **Training Duration**: 2h
 
- <Gallery />
+ <Gallery />
+
+ ## lora.yaml:
+ ```
+ init_checkpoint_path: /weights/dit.safetensors
+ checkpoint_dir: /finetunes/my_mochi_lora
+ train_data_dir: /videos_prepared
+ attention_mode: sdpa
+ single_video_mode: false # Useful for debugging whether your model can learn a single video
+
+ # You only need this if you're using wandb
+ wandb:
+   # project: mochi_1_lora
+   # name: ${checkpoint_dir}
+   # group: null
+
+ optimizer:
+   lr: 2e-4
+   weight_decay: 0.01
+
+ model:
+   type: lora
+   kwargs:
+     # Apply LoRA to the QKV projection and the output projection of the attention block.
+     qkv_proj_lora_rank: 16
+     qkv_proj_lora_alpha: 16
+     qkv_proj_lora_dropout: 0.
+     out_proj_lora_rank: 16
+     out_proj_lora_alpha: 16
+     out_proj_lora_dropout: 0.
+
+ training:
+   model_dtype: bf16
+   warmup_steps: 200
+   num_qkv_checkpoint: 48
+   num_ff_checkpoint: 48
+   num_post_attn_checkpoint: 48
+   num_steps: 2000
+   save_interval: 200
+   caption_dropout: 0.1
+   grad_clip: 0.0
+   save_safetensors: true
+
+ # Used for generating samples during training to monitor progress ...
+ sample:
+   interval: 200
+   output_dir: ${checkpoint_dir}/samples
+   decoder_path: /weights/decoder.safetensors
+   prompts:
+     - Human fingers pinching to zoom on an infinite zoom canvas, a vast desert landscape stretches into the horizon. At the center, a giant hourglass sits, its glass exterior glinting in the sunlight. The zoom begins within the hourglass, revealing cascading grains of sand, each grain transitioning into a crystalline snowflake, leading to a frozen tundra as the scene deepens further.
+     - Human fingers pinching to zoom on an infinite zoom canvas, a colossal tree rises from a lush forest, its bark covered with intricate carvings of stories. The zoom focuses on one carving, which transforms into a vibrant painting of a village. Zooming further, the village reveals bustling streets, where a single doorway becomes the entry to a glowing cosmos.
+     - Human fingers pinching to zoom on an infinite zoom canvas, a tranquil ocean surface reflects the twilight sky. The zoom begins within a whirlpool, diving into vibrant coral reefs teeming with marine life. A single pearl on the ocean floor becomes the focus, transitioning into a marble palace with intricate golden inlays as the zoom continues seamlessly.
+     - Human fingers pinching to zoom on an infinite zoom canvas, a glowing campfire crackles in a dense, dark forest. The zoom begins in the heart of the fire, revealing swirling embers that transition into galaxies of stars. The zoom then centers on a lone star, which transforms into a lantern hanging in a cozy mountain cabin, seamlessly revealing new layers.
+     - Human fingers pinching to zoom on an infinite zoom canvas, a detailed cityscape at night, illuminated by neon lights and bustling with activity. The zoom focuses on a lit billboard advertising a soda can, transitioning into the sparkling surface of the liquid. As the zoom deepens, microscopic bubbles transform into entire ecosystems of floating islands within the soda.
+   seed: 12345
+   kwargs:
+     height: 480
+     width: 848
+     num_frames: 37
+     num_inference_steps: 64
+     sigma_schedule_python_code: "linear_quadratic_schedule(64, 0.025)"
+     cfg_schedule_python_code: "[6.0] * 64"
+ ```
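
The `model.kwargs` block above sets rank and alpha to 16 for both the QKV and output projections. In the standard LoRA formulation the low-rank update is scaled by `alpha / rank`, so these settings give a scale of 1.0. The following is a minimal pure-Python sketch of that formulation; the `matvec` and `lora_linear` helpers and the toy matrices are hypothetical, and the Mochi trainer may apply the scale differently.

```python
# Illustrative LoRA forward pass: y = W x + (alpha / rank) * B (A x).
# `matvec`, `lora_linear`, and all matrices are hypothetical, for illustration only.

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_linear(x, weight, lora_a, lora_b, rank, alpha):
    """Frozen base projection plus a scaled low-rank update."""
    scale = alpha / rank
    base = matvec(weight, x)
    update = matvec(lora_b, matvec(lora_a, x))
    return [b + scale * u for b, u in zip(base, update)]


# Values from lora.yaml: rank 16 and alpha 16.
scale = 16 / 16

# Toy example with rank 1 so the shapes stay readable.
x = [1.0, 2.0]
weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight
lora_a = [[0.5, 0.5]]              # 1x2 down-projection
lora_b = [[0.0], [0.0]]            # 2x1 up-projection, zero at initialisation
y = lora_linear(x, weight, lora_a, lora_b, rank=1, alpha=1)
```

With a zero up-projection the adapter contributes nothing, so `y` equals the base projection until training updates `lora_b`.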
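
The `training` block above implies a simple timeline: 2000 steps, 200 warmup steps, and a checkpoint every 200 steps. The sketch below computes that timeline under two assumptions that this commit does not confirm: that warmup is a linear ramp up to the configured `lr`, and that a checkpoint is written whenever the 1-based step count is a multiple of `save_interval`. The helper names are hypothetical.

```python
# Hypothetical helpers: the warmup shape and the save rule are assumptions.
LR = 2e-4
WARMUP_STEPS = 200
NUM_STEPS = 2000
SAVE_INTERVAL = 200


def lr_at(step, base_lr=LR, warmup_steps=WARMUP_STEPS):
    """Linear warmup from 0 to base_lr, then constant (assumed shape)."""
    if warmup_steps <= 0:
        return base_lr
    return base_lr * min(1.0, step / warmup_steps)


def save_steps(num_steps=NUM_STEPS, interval=SAVE_INTERVAL):
    """Steps at which a checkpoint would be written (assumed rule)."""
    return [s for s in range(1, num_steps + 1) if s % interval == 0]


checkpoints = save_steps()
print(len(checkpoints), checkpoints[0], checkpoints[-1])
print(lr_at(100), lr_at(200), lr_at(1500))
```

Under these assumptions warmup ends at the same step as the first checkpoint, which is also the first sampling step because `sample.interval` is 200 as well.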
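
The `sample.kwargs` entries `sigma_schedule_python_code` and `cfg_schedule_python_code` appear, judging by their `_python_code` suffix, to be Python expressions stored as strings. The sketch below shows what they plausibly evaluate to. The `linear_quadratic_schedule` here is a reimplementation written for this note, assuming a linear ramp up to the threshold over the first half of the steps and a quadratic ramp to 1 over the second half, flipped so that sigma runs from 1 to 0; it is not copied from the Mochi source and the real helper may differ.

```python
# Sketch of a linear-then-quadratic sigma schedule (an assumption about the
# behaviour of linear_quadratic_schedule, not the library's actual code).

def linear_quadratic_schedule(num_steps, threshold_noise, linear_steps=None):
    """Return num_steps + 1 sigmas that decrease from 1.0 to 0.0."""
    if linear_steps is None:
        linear_steps = num_steps // 2
    quadratic_steps = num_steps - linear_steps

    # Linear segment: noise level rises from 0 towards threshold_noise.
    linear = [i * threshold_noise / linear_steps for i in range(linear_steps)]

    # Quadratic segment: continues from threshold_noise with the same slope
    # and reaches 1.0 at i == num_steps.
    step_diff = linear_steps - threshold_noise * num_steps
    quad_coef = step_diff / (linear_steps * quadratic_steps ** 2)
    lin_coef = threshold_noise / linear_steps - 2 * step_diff / quadratic_steps ** 2
    const = quad_coef * linear_steps ** 2
    quadratic = [
        quad_coef * i ** 2 + lin_coef * i + const
        for i in range(linear_steps, num_steps)
    ]

    noise_levels = linear + quadratic + [1.0]
    return [1.0 - n for n in noise_levels]


# The two expressions from lora.yaml, written out directly.
sigma_schedule = linear_quadratic_schedule(64, 0.025)
cfg_schedule = [6.0] * 64  # constant guidance scale, one value per step
```

In this sketch the schedule has one more sigma than there are inference steps, because each step moves between two consecutive sigma values, while the guidance schedule has exactly one value per step.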