kelseye committed
Commit 1d7b7c0 · verified · 1 Parent(s): 9165669

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ _cover_images_/cover_video.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/-0.5.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/0.7.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/0.mp4 filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,73 @@
+ ---
+ license: apache-2.0
+ ---
+ # Wan2.1-1.3B-LoRA-Speed-Control-v1
+
+ ## Model Introduction
+
+ This LoRA model was trained on the [Wan2.1-T2V-1.3B](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) base model with the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Adjusting the LoRA alpha parameter controls the speed of motion in the generated video.
+
+ * **LoRA alpha > 0**: Use the trigger word "low speed" to slow down the video and enhance visual quality.
+ * **LoRA alpha < 0**: Use the trigger word "high speed" to speed up the video and reduce visual quality.
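Reviewer note: the two bullets above describe a signed LoRA strength. As a minimal sketch, assuming the usual LoRA merge rule `W' = W + alpha * (up @ down)` (the README does not state how DiffSynth-Studio applies `lora_alpha`, so this is an assumption), a negative alpha simply pushes the weights in the opposite direction of the learned update. The helper names and the toy values below are hypothetical and exist only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_up, lora_down, alpha):
    """Return weight + alpha * (lora_up @ lora_down) without mutating the inputs."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (made-up values, not real model weights).
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape (2, 1)
down = [[0.5, -0.5]]   # shape (1, 2)

slow = merge_lora(base, up, down, alpha=0.7)     # positive alpha: follow the learned update
neutral = merge_lora(base, up, down, alpha=0.0)  # alpha = 0: base weights unchanged
fast = merge_lora(base, up, down, alpha=-0.5)    # negative alpha: move against the update

print(slow, neutral, fast, sep="\n")
```

Under this assumption, alpha = 0 reproduces the base model, which matches the README's use of alpha = 0 as the reference video.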
+
+ ## Model Performance
+
+ Prompt: Documentary photography style, a lively white puppy rapidly running on a lush green lawn. The puppy has pure white fur, erect ears, and an expression of focused joy. Sunlight shines on its body, making the fur appear exceptionally soft and shiny. The background features an open grassland dotted with occasional wildflowers, with a faint view of blue sky and scattered clouds in the distance. Strong perspective emphasizes the dynamic motion of the running puppy and the vitality of the surrounding grass. Medium shot with a side-moving viewpoint.
+
+ Negative prompt: Vivid colors, overexposure, static, blurry details, subtitles, style, artwork, painting, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, defective, extra fingers, poorly drawn hands, poorly drawn face, deformed limbs, fused fingers, motionless frames, cluttered background, three legs, crowded background, walking backward.
+
+ LoRA alpha = 0.7
+
+ <div align="center"><video width="80%" controls><source src="assets/0.7.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+ LoRA alpha = 0
+
+ <div align="center"><video width="80%" controls><source src="assets/0.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+ LoRA alpha = -0.5
+
+ <div align="center"><video width="80%" controls><source src="assets/-0.5.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+ ## Usage Instructions
+
+ This model is built on the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Please install it first:
+
+ ```
+ pip install diffsynth
+ ```
+
+
+ ```python
+ import torch
+ from diffsynth import ModelManager, WanVideoPipeline, save_video
+ from modelscope import snapshot_download
+ ```
+
+ ```python
+ # Download the LoRA weights from ModelScope.
+ snapshot_download(
+     model_id="DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+     local_dir="models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+     allow_file_pattern="*.safetensors"
+ )
+ # Load the base model on the CPU first. The snippet only downloads the LoRA,
+ # so the Wan2.1-T2V-1.3B files are assumed to already exist at these paths.
+ model_manager = ModelManager(device="cpu")
+ model_manager.load_models(
+     [
+         "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
+         "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
+         "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
+     ],
+     torch_dtype=torch.bfloat16,
+ )
+ # A positive lora_alpha pairs with the "low speed" trigger word in the prompt below.
+ model_manager.load_lora("models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1/model.safetensors", lora_alpha=0.7)
+ pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
+ pipe.enable_vram_management(num_persistent_param_in_dit=None)
+
+ video = pipe(
+     prompt="low speed, documentary photography style, a lively white puppy rapidly running on a lush green grassy field. The puppy has snow-white fur, upright ears, and an expression of focus and joy. Sunlight shines on its body, making the fur appear exceptionally soft and shiny. The background features an open grassland, occasionally dotted with wildflowers, with a faint view of blue sky and scattered clouds in the distance. Strong sense of perspective captures the dynamic motion of the puppy and the vitality of the surrounding grass. Mid-shot side-moving perspective.",
+     negative_prompt="vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, deformed, extra fingers, poorly drawn hands, poorly drawn face, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, crowded background, walking backwards",
+     num_inference_steps=50,
+     seed=0, tiled=True,
+     num_frames=33, height=1024, width=1024, sigma_shift=10,
+ )
+ save_video(video, "video.mp4", fps=15, quality=5)
+ ```
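Reviewer note: the README ties the sign of LoRA alpha to a trigger word, and the snippet above hard-codes one combination (alpha = 0.7 with "low speed"). The sketch below shows one way to keep the two in sync when trying other alpha values, and works out the clip length implied by `num_frames=33` and `fps=15`. Both `build_prompt` and `clip_seconds` are hypothetical helpers written for this note; they are not part of DiffSynth-Studio.

```python
def build_prompt(base_prompt: str, lora_alpha: float) -> str:
    """Prefix the prompt with the trigger word that matches the sign of lora_alpha."""
    if lora_alpha > 0:
        return f"low speed, {base_prompt}"
    if lora_alpha < 0:
        return f"high speed, {base_prompt}"
    # alpha == 0 means the LoRA has no effect, so no trigger word is added.
    return base_prompt


def clip_seconds(num_frames: int, fps: int) -> float:
    """Playback duration of a clip saved with the given frame count and frame rate."""
    return num_frames / fps


for alpha in (0.7, 0.0, -0.5):
    print(alpha, "->", build_prompt("a white puppy running on a lawn", alpha))

print(f"{clip_seconds(33, 15):.1f} s")
```

The resulting string would be passed as `prompt=` and the same alpha as `lora_alpha=` in the calls shown in the README.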
README_from_modelscope.md ADDED
@@ -0,0 +1,88 @@
+ ---
+ base_model: MusePublic/wan2.1-1.3b@v1
+ cover_images:
+ - _cover_images_/cover_video.mp4
+ frameworks:
+ - Pytorch
+ license: Apache License 2.0
+ tags:
+ - LoRA
+ - text2video generation
+ tasks:
+ - text-to-video-synthesis
+
+ trigger_words:
+ - "low speed"
+
+ vision_foundation: WAN_VIDEO_2_1_T2V_1_3_B
+ ---
+
+ # Tongyi Wanxiang 2.1-1.3B-LoRA-Speed-Control-v1
+
+ ## Model Introduction
+
+ This LoRA model was trained on the [Tongyi Wanxiang 2.1-1.3B](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) model with the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Adjusting the LoRA alpha parameter controls the speed of the video.
+
+ * **LoRA alpha > 0**: Use the trigger word "low speed"; the video slows down and visual quality improves.
+ * **LoRA alpha < 0**: Use the trigger word "high speed"; the video speeds up and visual quality drops.
+
+ ## Model Results
+
+ Prompt: Documentary photography style, a lively white puppy running fast across a lush green lawn. The puppy has snow-white fur, two upright ears, and a focused, joyful expression. Sunlight falls on its body, making its fur look exceptionally soft and shiny. The background is an open lawn dotted with a few wildflowers, with blue sky and a few white clouds faintly visible in the distance. Strong perspective captures the motion of the running puppy and the vitality of the surrounding grass. Medium shot, side-on moving viewpoint.
+
+ Negative prompt: Vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, many people in the background, walking backwards
+
+ LoRA alpha = 0.7
+
+ <div align="center"><video width="80%" controls><source src="assets/0.7.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+ LoRA alpha = 0
+
+ <div align="center"><video width="80%" controls><source src="assets/0.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+ LoRA alpha = -0.5
+
+ <div align="center"><video width="80%" controls><source src="assets/-0.5.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+ ## Usage Instructions
+
+ This model was trained with the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Please install it first:
+
+ ```
+ pip install diffsynth
+ ```
+
+
+ ```python
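+ import torch
+ from diffsynth import ModelManager, WanVideoPipeline, save_video
+ from modelscope import snapshot_download
+
+
+ # Download the LoRA weights from ModelScope.
+ snapshot_download(
+     model_id="DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+     local_dir="models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+     allow_file_pattern="*.safetensors"
+ )
+ # Load the base model on the CPU first. The snippet only downloads the LoRA,
+ # so the Wan2.1-T2V-1.3B files are assumed to already exist at these paths.
+ model_manager = ModelManager(device="cpu")
+ model_manager.load_models(
+     [
+         "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
+         "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
+         "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
+     ],
+     torch_dtype=torch.bfloat16,
+ )
+ # A positive lora_alpha pairs with the "low speed" trigger word in the prompt below.
+ model_manager.load_lora("models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1/model.safetensors", lora_alpha=0.7)
+ pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
+ pipe.enable_vram_management(num_persistent_param_in_dit=None)
+
+ video = pipe(
+     prompt="low speed, documentary photography style, a lively white puppy running fast across a lush green lawn. The puppy has snow-white fur, two upright ears, and a focused, joyful expression. Sunlight falls on its body, making its fur look exceptionally soft and shiny. The background is an open lawn dotted with a few wildflowers, with blue sky and a few white clouds faintly visible in the distance. Strong perspective captures the motion of the running puppy and the vitality of the surrounding grass. Medium shot, side-on moving viewpoint.",
+     negative_prompt="vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, many people in the background, walking backwards",
+     num_inference_steps=50,
+     seed=0, tiled=True,
+     num_frames=33, height=1024, width=1024, sigma_shift=10,
+ )
+ save_video(video, "video.mp4", fps=15, quality=5)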
+ ```
_cover_images_/cover_video.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:83a11038f4c35396d192d01fd107d8f5a67ece23c837030648f915888de5d862
+ size 1673633
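Reviewer note: the `.mp4` and `.safetensors` entries in this commit are Git LFS pointer files, not the binaries themselves. Each pointer is three `key value` lines: the spec version, the object ID (hash algorithm and digest), and the size in bytes. A minimal sketch of reading one, using the pointer above as input; `parse_lfs_pointer` is a hypothetical helper, not a Git LFS or huggingface_hub API.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:83a11038f4c35396d192d01fd107d8f5a67ece23c837030648f915888de5d862
size 1673633
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

The `size` field is what lets a reader see, without downloading anything, that the cover video is about 1.6 MiB while `model.safetensors` further down is about 83.5 MiB.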
assets/-0.5.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8b84b687781eb150301ab9dccf84f9dbaef30b18f0e14e7fd61cfe0f19f4b48f
+ size 598126
assets/0.7.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4cc75702c68b1f3df1d0c0c53fa23851543bef102a450bf565fba9a2a10f3e83
+ size 577802
assets/0.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a6eb0330699cae3de4c43934f8591e4f021b3f37ddc18dad7257054e19a5ad3d
+ size 697874
configuration.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "aigc_model": true,
+   "model_file_location": "model.safetensors",
+   "framework": "Pytorch",
+   "task": "other"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5035805659a2b6180f95cec0521a860ad3fae01d9887f151db6167765b999806
+ size 87558728