Commit 4bbfb1d · zR committed
1 Parent(s): 0d6a353

update GPU memory to 24GB

- README.md +14 -13
- README_zh.md +14 -13
README.md
CHANGED
@@ -88,18 +88,17 @@ inference: false
CogVideoX is an open-source video generation model that shares the same origins as [清影](https://chatglm.cn/video).
The table below provides a list of the video generation models we currently offer, along with their basic information.

-| Model Name
-| GPU
-| Quantized Inference
-| Multi-card Inference | Not Supported |
+| Model Name                                 | CogVideoX-2B                         |
+|--------------------------------------------|--------------------------------------|
+| Prompt Language                            | English                              |
+| Single GPU Inference (FP16)                | 23.9GB                               |
+| Multi GPUs Inference (FP16)                | 20GB minimum per GPU using diffusers |
+| GPU Memory Required for Fine-tuning (bs=1) | 40GB                                 |
+| Prompt Max Length                          | 226 Tokens                           |
+| Video Length                               | 6 seconds                            |
+| Frames Per Second                          | 8 frames                             |
+| Resolution                                 | 720 * 480                            |
+| Quantized Inference                        | Not Supported                        |

**Note** Using the [SAT](https://github.com/THUDM/SwissArmyTransformer) model costs 18GB of GPU memory for inference. Check our GitHub for details.

@@ -128,7 +127,9 @@ prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wood
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
-)
+)
+
+pipe.enable_model_cpu_offload()

prompt_embeds, _ = pipe.encode_prompt(
    prompt=prompt,
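The hunk above shows only the head of the updated snippet. For context, here is a minimal sketch of single-GPU inference with the `pipe.enable_model_cpu_offload()` call this commit adds; the prompt text, sampling settings, and output path are illustrative assumptions, not part of the diff, and the README's own snippet additionally goes through `pipe.encode_prompt` (truncated in the hunk header) rather than passing the prompt straight to the pipeline.

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Load the 2B checkpoint in FP16, as in the README snippet.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
)

# The line this commit adds: idle sub-models are offloaded to CPU so peak
# GPU memory stays near the 23.9GB figure quoted in the updated table.
pipe.enable_model_cpu_offload()

# Illustrative stand-in prompt; the README's full prompt is truncated above.
prompt = "A panda playing a guitar in a bamboo forest"

video = pipe(prompt=prompt, num_inference_steps=50, guidance_scale=6).frames[0]
export_to_video(video, "output.mp4", fps=8)  # 8 fps, per the updated table
```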
README_zh.md
CHANGED
@@ -73,18 +73,17 @@

CogVideoX is the open-source video generation model that shares the same origins as [清影](https://chatglm.cn/video). The table below lists the video generation models we currently offer, along with their basic information.

-| Prompt Language
-| Multi-card Inference | Not Supported |
+| Model Name                               | CogVideoX-2B                         |
+|------------------------------------------|--------------------------------------|
+| Prompt Language                          | English                              |
+| Single GPU Inference (FP16) Memory Usage | 23.9GB                               |
+| Multi-GPU Inference (FP16) Memory Usage  | 20GB minimum per GPU using diffusers |
+| Fine-tuning Memory Usage (bs=1)          | 42GB                                 |
+| Prompt Max Length                        | 226 Tokens                           |
+| Video Length                             | 6 seconds                            |
+| Frame Rate (per second)                  | 8 frames                             |
+| Resolution                               | 720 * 480                            |
+| Quantized Inference                      | Not Supported                        |

**Note** Inference with the [SAT](https://github.com/THUDM/SwissArmyTransformer) version of the model requires only 18GB of GPU memory. Feel free to check our GitHub.

@@ -112,7 +111,9 @@ prompt = "A panda, dressed in a small, red jacket and a tiny hat, sits on a wood
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
-)
+)
+
+pipe.enable_model_cpu_offload()

prompt_embeds, _ = pipe.encode_prompt(
    prompt=prompt,
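Both updated tables quote multi-GPU FP16 inference at "20GB minimum per GPU using diffusers", but the diff does not show how that is configured. One way diffusers can spread a pipeline's sub-models across GPUs is the generic `device_map="balanced"` placement (requires accelerate); whether this is the exact setup the README authors mean is an assumption.

```python
import torch
from diffusers import CogVideoXPipeline

# Assumption: the "balanced" device map places the text encoder, transformer,
# and VAE on different available GPUs. A sketch of one multi-GPU setup, not
# the commit's own recipe.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16,
    device_map="balanced",
)

video = pipe(
    prompt="A panda playing a guitar in a bamboo forest",  # illustrative prompt
    num_inference_steps=50,
    guidance_scale=6,
).frames[0]
```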