Upload folder using huggingface_hub
- .gitattributes +4 -0
- README.md +73 -0
- README_from_modelscope.md +88 -0
- _cover_images_/cover_video.mp4 +3 -0
- assets/-0.5.mp4 +3 -0
- assets/0.7.mp4 +3 -0
- assets/0.mp4 +3 -0
- configuration.json +6 -0
- model.safetensors +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+_cover_images_/cover_video.mp4 filter=lfs diff=lfs merge=lfs -text
+assets/-0.5.mp4 filter=lfs diff=lfs merge=lfs -text
+assets/0.7.mp4 filter=lfs diff=lfs merge=lfs -text
+assets/0.mp4 filter=lfs diff=lfs merge=lfs -text
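The hunk above adds four explicit paths to the set of LFS-tracked patterns. As a rough illustration of how those entries can be read back, here is a minimal sketch that lists every pattern carrying `filter=lfs`. The helper name `lfs_patterns` is hypothetical, and the parser assumes simple whitespace-separated lines; it does not handle quoted patterns or the full gitattributes matching rules.

```python
def lfs_patterns(gitattributes_text: str) -> list[str]:
    """Return the path patterns whose attribute list includes filter=lfs."""
    patterns = []
    for raw in gitattributes_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # First token is the pattern, the rest are attributes.
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


# The context and added lines from the hunk above.
SAMPLE = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
_cover_images_/cover_video.mp4 filter=lfs diff=lfs merge=lfs -text
assets/-0.5.mp4 filter=lfs diff=lfs merge=lfs -text
assets/0.7.mp4 filter=lfs diff=lfs merge=lfs -text
assets/0.mp4 filter=lfs diff=lfs merge=lfs -text
"""

for pattern in lfs_patterns(SAMPLE):
    print(pattern)
```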
README.md
ADDED
@@ -0,0 +1,73 @@
+---
+license: apache-2.0
+---
+# Wan2.1-1.3B-LoRA-Speed-Control-v1
+
+## Model Introduction
+
+This LoRA was trained from the [Wan2.1-1.3B](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) base model with the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Adjusting the LoRA alpha parameter controls the speed of motion in the generated video.
+
+* **LoRA alpha > 0**: Use the trigger word "low speed"; motion slows down and visual quality improves.
+* **LoRA alpha < 0**: Use the trigger word "high speed"; motion speeds up and visual quality degrades.
+
+## Model Performance
+
+Prompt: Documentary photography style, a lively white puppy rapidly running on a lush green lawn. The puppy has pure white fur, erect ears, and an expression of focused joy. Sunlight shines on its body, making the fur appear exceptionally soft and shiny. The background features an open grassland dotted with occasional wildflowers, with a faint view of blue sky and scattered clouds in the distance. Strong perspective emphasizes the dynamic motion of the running puppy and the vitality of the surrounding grass. Medium shot with a side-moving viewpoint.
+
+Negative prompt: Vivid colors, overexposure, static, blurry details, subtitles, style, artwork, painting, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, defective, extra fingers, poorly drawn hands, poorly drawn face, deformed limbs, fused fingers, motionless frames, cluttered background, three legs, crowded background, walking backward.
+
+LoRA alpha = 0.7
+
+<div align="center"><video width="80%" controls><source src="assets/0.7.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+LoRA alpha = 0
+
+<div align="center"><video width="80%" controls><source src="assets/0.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+LoRA alpha = -0.5
+
+<div align="center"><video width="80%" controls><source src="assets/-0.5.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+## Usage Instructions
+
+This model is built on the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Please install it first:
+
+```
+pip install diffsynth
+```
+
+
+```python
+import torch
+from diffsynth import ModelManager, WanVideoPipeline, save_video
+from modelscope import snapshot_download
+```
+
+```python
+snapshot_download(
+    model_id="DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+    local_dir="models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+    allow_file_pattern="*.safetensors"
+)
+model_manager = ModelManager(device="cpu")
+model_manager.load_models(
+    [
+        "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
+        "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
+        "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
+    ],
+    torch_dtype=torch.bfloat16,
+)
+model_manager.load_lora("models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1/model.safetensors", lora_alpha=0.7)
+pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
+pipe.enable_vram_management(num_persistent_param_in_dit=None)
+
+video = pipe(
+    prompt="low speed, documentary photography style, a lively white puppy rapidly running on a lush green grassy field. The puppy has snow-white fur, upright ears, and an expression of focus and joy. Sunlight shines on its body, making the fur appear exceptionally soft and shiny. The background features an open grassland, occasionally dotted with wildflowers, with a faint view of blue sky and scattered clouds in the distance. Strong sense of perspective captures the dynamic motion of the puppy and the vitality of the surrounding grass. Mid-shot side-moving perspective.",
+    negative_prompt="vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, deformed, extra fingers, poorly drawn hands, poorly drawn face, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, crowded background, walking backwards",
+    num_inference_steps=50,
+    seed=0, tiled=True,
+    num_frames=33, height=1024, width=1024, sigma_shift=10,
+)
+save_video(video, "video.mp4", fps=15, quality=5)
+```
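The README states that speed is controlled through the LoRA alpha value, including negative values, but does not say how alpha enters the weights. The following is a minimal sketch of the conventional LoRA merge rule, W' = W + alpha * (B A). It assumes that `lora_alpha` in DiffSynth-Studio acts as this kind of linear scale on the low-rank update, which the source does not confirm. The helper names `matmul` and `merge_lora` and all matrix values are made up for illustration.

```python
def matmul(x, y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_up, lora_down, alpha):
    """Return weight + alpha * (lora_up @ lora_down) without mutating the inputs."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 update (illustrative values only).
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape 2x1
down = [[0.5, -0.5]]   # shape 1x2

# The three alpha values used for the demo videos in the README.
for alpha in (0.7, 0.0, -0.5):
    print(alpha, merge_lora(base, up, down, alpha))
```

Under this assumption, alpha = 0 leaves the base weights unchanged, and a negative alpha moves them in the direction opposite to the one learned for "low speed", which is consistent with the README pairing negative values with the "high speed" trigger word.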
README_from_modelscope.md
ADDED
@@ -0,0 +1,88 @@
+---
+base_model: MusePublic/wan2.1-1.3b@v1
+cover_images:
+- _cover_images_/cover_video.mp4
+frameworks:
+- Pytorch
+license: Apache License 2.0
+tags:
+- LoRA
+- text2video generation
+tasks:
+- text-to-video-synthesis
+
+trigger_words:
+- "low speed"
+
+vision_foundation: WAN_VIDEO_2_1_T2V_1_3_B
+---
+
+# Wan2.1-1.3B-LoRA-Speed-Control-v1
+
+## Model Introduction
+
+This LoRA was trained from the [Wan2.1-1.3B](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) base model with the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Adjusting the LoRA alpha parameter controls the speed of motion in the generated video.
+
+* **LoRA alpha > 0**: Use the trigger word "low speed"; motion slows down and visual quality improves.
+* **LoRA alpha < 0**: Use the trigger word "high speed"; motion speeds up and visual quality degrades.
+
+## Model Performance
+
+Prompt: Documentary photography style, a lively white puppy runs swiftly across a lush green lawn. The puppy has snow-white fur, upright ears, and a focused, joyful expression. Sunlight falls on its body, making its fur look exceptionally soft and shiny. The background is an open grassland dotted with occasional wildflowers, with blue sky and a few white clouds faintly visible in the distance. Strong perspective captures the motion of the running puppy and the vitality of the surrounding grass. Medium shot, side-tracking viewpoint.
+
+Negative prompt: vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, many people in the background, walking backwards
+
+LoRA alpha = 0.7
+
+<div align="center"><video width="80%" controls><source src="assets/0.7.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+LoRA alpha = 0
+
+<div align="center"><video width="80%" controls><source src="assets/0.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+LoRA alpha = -0.5
+
+<div align="center"><video width="80%" controls><source src="assets/-0.5.mp4" type="video/mp4">Your browser does not support the video tag.</video></div>
+
+## Usage Instructions
+
+This model was trained with the [DiffSynth-Studio](https://github.com/modelscope/DiffSynth-Studio) framework. Please install it first:
+
+```
+pip install diffsynth
+```
+
+
+```python
+import torch
+from diffsynth import ModelManager, WanVideoPipeline, save_video
+from modelscope import snapshot_download
+
+
+snapshot_download(
+    model_id="DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+    local_dir="models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1",
+    allow_file_pattern="*.safetensors"
+)
+model_manager = ModelManager(device="cpu")
+model_manager.load_models(
+    [
+        "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
+        "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
+        "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
+    ],
+    torch_dtype=torch.bfloat16,
+)
+model_manager.load_lora("models/DiffSynth-Studio/Wan2.1-1.3b-lora-speedcontrol-v1/model.safetensors", lora_alpha=0.7)
+pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
+pipe.enable_vram_management(num_persistent_param_in_dit=None)
+
+video = pipe(
+    prompt="low speed, documentary photography style, a lively white puppy runs swiftly across a lush green lawn. The puppy has snow-white fur, upright ears, and a focused, joyful expression. Sunlight falls on its body, making its fur look exceptionally soft and shiny. The background is an open grassland dotted with occasional wildflowers, with blue sky and a few white clouds faintly visible in the distance. Strong perspective captures the motion of the running puppy and the vitality of the surrounding grass. Medium shot, side-tracking viewpoint.",
+    negative_prompt="vivid colors, overexposed, static, blurry details, subtitles, style, artwork, painting, frame, stillness, overall grayish tone, worst quality, low quality, JPEG compression artifacts, ugly, mutilated, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, many people in the background, walking backwards",
+    num_inference_steps=50,
+    seed=0, tiled=True,
+    num_frames=33, height=1024, width=1024, sigma_shift=10,
+)
+save_video(video, "video.mp4", fps=15, quality=5)
+```
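The usage snippet in both READMEs downloads only the LoRA weights, yet `load_models` reads three base-model files from `models/Wan-AI/Wan2.1-T2V-1.3B/`. If those files are not already on disk, loading will likely fail. Below is a minimal sketch of a pre-flight check, assuming the base weights are published under the ModelScope ID `Wan-AI/Wan2.1-T2V-1.3B` (taken from the README link). The helper names `missing_base_weights` and `ensure_base_weights` are hypothetical.

```python
from pathlib import Path

# Directory and file names copied from the load_models call in the README.
BASE_DIR = Path("models/Wan-AI/Wan2.1-T2V-1.3B")
BASE_FILES = [
    "diffusion_pytorch_model.safetensors",
    "models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1_VAE.pth",
]


def missing_base_weights(base_dir: Path = BASE_DIR) -> list[Path]:
    """Return the expected base-model files that are not present on disk."""
    return [base_dir / name for name in BASE_FILES if not (base_dir / name).is_file()]


def ensure_base_weights(base_dir: Path = BASE_DIR) -> None:
    """Download the base model only when at least one expected file is missing."""
    if not missing_base_weights(base_dir):
        return
    # Imported lazily so the check itself works without modelscope installed.
    from modelscope import snapshot_download

    # Assumption: the model ID matches the ModelScope URL linked in the README.
    snapshot_download(model_id="Wan-AI/Wan2.1-T2V-1.3B", local_dir=str(base_dir))


if __name__ == "__main__":
    for path in missing_base_weights():
        print("missing:", path)
```

Calling `ensure_base_weights()` before `model_manager.load_models(...)` would fetch the base model on first use and do nothing on later runs.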
_cover_images_/cover_video.mp4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:83a11038f4c35396d192d01fd107d8f5a67ece23c837030648f915888de5d862
+size 1673633
assets/-0.5.mp4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8b84b687781eb150301ab9dccf84f9dbaef30b18f0e14e7fd61cfe0f19f4b48f
+size 598126
assets/0.7.mp4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4cc75702c68b1f3df1d0c0c53fa23851543bef102a450bf565fba9a2a10f3e83
+size 577802
assets/0.mp4
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a6eb0330699cae3de4c43934f8591e4f021b3f37ddc18dad7257054e19a5ad3d
+size 697874
configuration.json
ADDED
@@ -0,0 +1,6 @@
+{
+    "aigc_model": true,
+    "model_file_location": "model.safetensors",
+    "framework": "Pytorch",
+    "task": "other"
+}
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5035805659a2b6180f95cec0521a860ad3fae01d9887f151db6167765b999806
+size 87558728
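The binary files in this commit appear in the diff only as Git LFS pointers, each recording a SHA-256 digest and a byte size. As a rough sketch, a downloaded file such as `model.safetensors` could be checked against its pointer as follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the check assumes the `oid` uses `sha256`, as every pointer in this commit does.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Compare a file's size and SHA-256 digest with a parsed pointer."""
    # Cheap size check first, so a truncated download fails fast.
    if path.stat().st_size != pointer["size"]:
        return False
    sha = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in 1 MiB chunks to avoid loading large weights into memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest() == pointer["digest"]


# Pointer contents for model.safetensors, copied from the diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:5035805659a2b6180f95cec0521a860ad3fae01d9887f151db6167765b999806
size 87558728
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With the LoRA weights downloaded, `matches_pointer(Path("model.safetensors"), pointer)` would return `True` only if the local file is complete and unmodified.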