log.txt
python train_ddp_spawn.py --base configs/train-v01.yaml --no-test True --train True --logdir outputs/logs/train-v01
[2024-09-29 13:19:46,460] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
2024-09-29 13:19:55.519392: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-09-29 13:19:55.698222: I external/local_tsl/tsl/cuda/cudart_stub.cc:32] Could not find cuda drivers on your machine, GPU will not be used.
2024-09-29 13:19:56.301373: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-09-29 13:19:58.557393: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Global seed set to 2300
[09/29 13:20:06 VTDM]: Running on GPUs 7,
[09/29 13:20:06 VTDM]: Use the strategy of deepspeed_stage_2
[09/29 13:20:06 VTDM]: Pytorch lightning trainer config:
{'gpus': '7,', 'logger_refresh_rate': 5, 'check_val_every_n_epoch': 1, 'max_epochs': 50, 'accelerator': 'cuda', 'strategy': 'deepspeed_stage_2', 'precision': 16}
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Initialized embedder #0: FrozenOpenCLIPImagePredictionEmbedder with 683800065 params. Trainable: False
Initialized embedder #1: AesEmbedder with 343490018 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #3: VideoPredictionEmbedderWithEncoder with 83653863 params. Trainable: False
Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False
Restored from /mnt/afs_intern/yanghaibo/datas/download_checkpoints/svd_checkpoints/stable-video-diffusion-img2vid-xt/svd_xt_image_decoder.safetensors with 312 missing and 0 unexpected keys
Missing Keys: ['conditioner.embedders.1.aesthetic_model.positional_embedding', 'conditioner.embedders.1.aesthetic_model.text_projection', 'conditioner.embedders.1.aesthetic_model.logit_scale', 'conditioner.embedders.1.aesthetic_model.visual.class_embedding', 'conditioner.embedders.1.aesthetic_model.visual.positional_embedding', 'conditioner.embedders.1.aesthetic_model.visual.proj', 'conditioner.embedders.1.aesthetic_model.visual.conv1.weight', 'conditioner.embedders.1.aesthetic_model.visual.ln_pre.weight', 'conditioner.embedders.1.aesthetic_model.visual.ln_pre.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.0.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.1.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.mlp.c_proj.weight', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.2.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.3.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.4.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.5.ln_2.bias', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.6.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.7.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.8.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.attn.out_proj.weight', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.9.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.10.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.11.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.ln_1.bias', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.12.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.13.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.14.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.mlp.c_proj.weight', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.15.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.16.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.17.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.18.ln_2.bias', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.19.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.20.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.21.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.attn.out_proj.weight', 
'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.22.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.attn.in_proj_weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.attn.in_proj_bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.attn.out_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.attn.out_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.ln_1.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.ln_1.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.mlp.c_fc.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.mlp.c_fc.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.mlp.c_proj.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.mlp.c_proj.bias', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.ln_2.weight', 'conditioner.embedders.1.aesthetic_model.visual.transformer.resblocks.23.ln_2.bias', 'conditioner.embedders.1.aesthetic_model.visual.ln_post.weight', 'conditioner.embedders.1.aesthetic_model.visual.ln_post.bias', 'conditioner.embedders.1.aesthetic_model.token_embedding.weight', 'conditioner.embedders.1.aesthetic_model.ln_final.weight', 'conditioner.embedders.1.aesthetic_model.ln_final.bias', 'conditioner.embedders.1.aesthetic_mlp.layers.0.weight', 'conditioner.embedders.1.aesthetic_mlp.layers.0.bias', 'conditioner.embedders.1.aesthetic_mlp.layers.2.weight', 'conditioner.embedders.1.aesthetic_mlp.layers.2.bias', 'conditioner.embedders.1.aesthetic_mlp.layers.4.weight', 'conditioner.embedders.1.aesthetic_mlp.layers.4.bias', 'conditioner.embedders.1.aesthetic_mlp.layers.6.weight', 'conditioner.embedders.1.aesthetic_mlp.layers.6.bias', 'conditioner.embedders.1.aesthetic_mlp.layers.7.weight', 'conditioner.embedders.1.aesthetic_mlp.layers.7.bias']
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/loggers/test_tube.py:104: LightningDeprecationWarning: The TestTubeLogger is deprecated since v1.5 and will be removed in v1.7. We recommend switching to the `pytorch_lightning.loggers.TensorBoardLogger` as an alternative.
  rank_zero_deprecation(
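The deprecation above has a drop-in fix named in the message itself. A minimal sketch of the switch (the save_dir mirrors this run's --logdir; the name argument is an assumption):

    from pytorch_lightning.loggers import TensorBoardLogger

    # Replacement for the deprecated TestTubeLogger, as the warning recommends.
    logger = TensorBoardLogger(save_dir="outputs/logs/train-v01", name="tensorboard")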
[09/29 13:21:06 VTDM]: Merged modelckpt-cfg:
{'target': 'pytorch_lightning.callbacks.ModelCheckpoint', 'params': {'dirpath': 'outputs/logs/train-v01/2024-09-29T13-20-04_train-v01_00/checkpoints', 'filename': '{epoch:06}', 'verbose': True, 'save_weights_only': True}}
[09/29 13:21:06 VTDM]: Caution: Saving checkpoints every n train steps without deleting. This might require some free space.
[09/29 13:21:06 VTDM]: Merged trainsteps-cfg:
{'target': 'pytorch_lightning.callbacks.ModelCheckpoint', 'params': {'dirpath': 'outputs/logs/train-v01/2024-09-29T13-20-04_train-v01_00/checkpoints/trainstep_checkpoints', 'filename': '{epoch:06}-{step:09}', 'verbose': True, 'save_top_k': -1, 'every_n_train_steps': 3000, 'save_weights_only': False}}
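Both merged dicts above are plain kwargs for pytorch_lightning.callbacks.ModelCheckpoint. A sketch of the step-checkpoint callback built from the trainsteps-cfg as dumped:

    from pytorch_lightning.callbacks import ModelCheckpoint

    # save_top_k=-1 keeps every checkpoint, one every 3000 train steps,
    # which is why the log cautions about free space.
    ckpt = ModelCheckpoint(
        dirpath="outputs/logs/train-v01/2024-09-29T13-20-04_train-v01_00/checkpoints/trainstep_checkpoints",
        filename="{epoch:06}-{step:09}",
        verbose=True,
        save_top_k=-1,
        every_n_train_steps=3000,
        save_weights_only=False,
    )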
[09/29 13:21:06 VTDM]: Done in building trainer kwargs.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
============= length of dataset 1 =============
[09/29 13:21:07 VTDM]: Set up dataset.
[09/29 13:21:07 VTDM]: accumulate_grad_batches = 1
[09/29 13:21:07 VTDM]: Setting learning rate to 1.00e-05 = 1 (accumulate_grad_batches) * 1 (num_gpus) * 1 (batchsize) * 1.00e-05 (base_lr)
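The effective learning rate above is the base rate under linear scaling; a minimal sketch of the printed formula, with this run's factors:

    # Linear LR scaling, as printed by the trainer; every factor is 1 here,
    # so the effective LR equals the base LR.
    accumulate_grad_batches, num_gpus, batch_size, base_lr = 1, 1, 1, 1.0e-05
    learning_rate = accumulate_grad_batches * num_gpus * batch_size * base_lr  # 1.0e-05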
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/trainer/configuration_validator.py:116: UserWarning: You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.
  rank_zero_warn("You passed in a `val_dataloader` but have no `validation_step`. Skipping val loop.")
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/trainer/configuration_validator.py:271: LightningDeprecationWarning: The `on_keyboard_interrupt` callback hook was deprecated in v1.5 and will be removed in v1.7. Please use the `on_exception` callback hook instead.
  rank_zero_deprecation(
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/trainer/configuration_validator.py:287: LightningDeprecationWarning: Base `Callback.on_train_batch_end` hook signature has changed in v1.5. The `dataloader_idx` argument will be removed in v1.7.
  rank_zero_deprecation(
Global seed set to 2300
initializing deepspeed distributed: GLOBAL_RANK: 0, MEMBER: 1/1
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/plugins/training_type/deepspeed.py:625: UserWarning: Inferring the batch size for internal deepspeed logging from the `train_dataloader()`. If you require skipping this, please pass `Trainer(strategy=DeepSpeedPlugin(logging_batch_size_per_gpu=batch_size))`
  rank_zero_warn(
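The warning above names its own fix. A sketch, assuming stage 2 to match the deepspeed_stage_2 strategy string and the batch size of 1 from the data config:

    import pytorch_lightning as pl
    from pytorch_lightning.plugins import DeepSpeedPlugin

    # Passing the batch size explicitly skips DeepSpeed's dataloader inference.
    trainer = pl.Trainer(
        accelerator="cuda",
        precision=16,
        strategy=DeepSpeedPlugin(stage=2, logging_batch_size_per_gpu=1),
    )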
Enabling DeepSpeed FP16.
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/core/datamodule.py:469: LightningDeprecationWarning: DataModule.setup has already been called, so it will not be called again. In v1.6 this behavior will change to always call DataModule.setup.
  rank_zero_deprecation(
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2,3,4,5,6,7]
You have not specified an optimizer or scheduler within the DeepSpeed config. Using `configure_optimizers` to define optimizer and scheduler.
Project config
data:
  target: sgm.data.video_dataset.VideoDataset
  params:
    base_folder: datas/OBJAVERSE-LVIS-example/images
    eval_folder: validation_set_example
    width: 512
    height: 512
    sample_frames: 16
    batch_size: 1
    num_workers: 1
model:
  target: vtdm.vtdm_gen_v01.VideoLDM
  base_learning_rate: 1.0e-05
  params:
    input_key: video
    scale_factor: 0.18215
    log_keys: caption
    num_samples: 16
    trained_param_keys:
    - all
    en_and_decode_n_samples_a_time: 16
    disable_first_stage_autocast: true
    ckpt_path: /mnt/afs_intern/yanghaibo/datas/download_checkpoints/svd_checkpoints/stable-video-diffusion-img2vid-xt/svd_xt_image_decoder.safetensors
    denoiser_config:
      target: sgm.modules.diffusionmodules.denoiser.Denoiser
      params:
        scaling_config:
          target: sgm.modules.diffusionmodules.denoiser_scaling.VScalingWithEDMcNoise
    network_config:
      target: sgm.modules.diffusionmodules.video_model.VideoUNet
      params:
        adm_in_channels: 768
        num_classes: sequential
        use_checkpoint: true
        in_channels: 8
        out_channels: 4
        model_channels: 320
        attention_resolutions:
        - 4
        - 2
        - 1
        num_res_blocks: 2
        channel_mult:
        - 1
        - 2
        - 4
        - 4
        num_head_channels: 64
        use_linear_in_transformer: true
        transformer_depth: 1
        context_dim: 1024
        spatial_transformer_attn_type: softmax-xformers
        extra_ff_mix_layer: true
        use_spatial_context: true
        merge_strategy: learned_with_images
        video_kernel_size:
        - 3
        - 1
        - 1
    conditioner_config:
      target: sgm.modules.GeneralConditioner
      params:
        emb_models:
        - is_trainable: false
          input_key: cond_frames_without_noise
          ucg_rate: 0.1
          target: sgm.modules.encoders.modules.FrozenOpenCLIPImagePredictionEmbedder
          params:
            n_cond_frames: 1
            n_copies: 1
            open_clip_embedding_config:
              target: sgm.modules.encoders.modules.FrozenOpenCLIPImageEmbedder
              params:
                version: ckpts/open_clip_pytorch_model.bin
                freeze: true
        - is_trainable: false
          input_key: video
          ucg_rate: 0.0
          target: vtdm.encoders.AesEmbedder
        - is_trainable: false
          input_key: elevation
          target: sgm.modules.encoders.modules.ConcatTimestepEmbedderND
          params:
            outdim: 256
        - input_key: cond_frames
          is_trainable: false
          ucg_rate: 0.1
          target: sgm.modules.encoders.modules.VideoPredictionEmbedderWithEncoder
          params:
            disable_encoder_autocast: true
            n_cond_frames: 1
            n_copies: 16
            is_ae: true
            encoder_config:
              target: sgm.models.autoencoder.AutoencoderKLModeOnly
              params:
                embed_dim: 4
                monitor: val/rec_loss
                ddconfig:
                  attn_type: vanilla-xformers
                  double_z: true
                  z_channels: 4
                  resolution: 256
                  in_channels: 3
                  out_ch: 3
                  ch: 128
                  ch_mult:
                  - 1
                  - 2
                  - 4
                  - 4
                  num_res_blocks: 2
                  attn_resolutions: []
                  dropout: 0.0
                lossconfig:
                  target: torch.nn.Identity
        - input_key: cond_aug
          is_trainable: false
          target: sgm.modules.encoders.modules.ConcatTimestepEmbedderND
          params:
            outdim: 256
    first_stage_config:
      target: sgm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          attn_type: vanilla-xformers
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity
    loss_fn_config:
      target: sgm.modules.diffusionmodules.loss.StandardDiffusionLoss
      params:
        num_frames: 16
        batch2model_keys:
        - num_video_frames
        - image_only_indicator
        sigma_sampler_config:
          target: sgm.modules.diffusionmodules.sigma_sampling.EDMSampling
          params:
            p_mean: 1.0
            p_std: 1.6
        loss_weighting_config:
          target: sgm.modules.diffusionmodules.loss_weighting.VWeighting
    sampler_config:
      target: sgm.modules.diffusionmodules.sampling.EulerEDMSampler
      params:
        num_steps: 25
        verbose: true
        discretization_config:
          target: sgm.modules.diffusionmodules.discretizer.EDMDiscretization
          params:
            sigma_max: 700.0
        guider_config:
          target: sgm.modules.diffusionmodules.guiders.LinearPredictionGuider
          params:
            num_frames: 16
            max_scale: 2.5
            min_scale: 1.0
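Every target:/params: pair in the config above is instantiated by dotted import path. A minimal sketch of that pattern (the helper name follows the sgm codebase convention and is assumed, not taken from this log):

    from importlib import import_module

    def instantiate_from_config(config: dict):
        # "target" is a dotted path such as sgm.data.video_dataset.VideoDataset;
        # "params" go straight to the constructor.
        module_path, class_name = config["target"].rsplit(".", 1)
        cls = getattr(import_module(module_path), class_name)
        return cls(**config.get("params", {}))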
Lightning config
trainer:
  gpus: 7,
  logger_refresh_rate: 5
  check_val_every_n_epoch: 1
  max_epochs: 50
  accelerator: cuda
  strategy: deepspeed_stage_2
  precision: 16
callbacks:
  image_logger:
    target: vtdm.callbacks.ImageLogger
    params:
      log_on_batch_idx: true
      increase_log_steps: false
      log_first_step: true
      batch_frequency: 200
      max_images: 8
      clamp: true
      log_images_kwargs:
        'N': 8
        sample: true
        ucg_keys:
        - cond_frames
        - cond_frames_without_noise
  metrics_over_trainsteps_checkpoint:
    target: pytorch_lightning.callbacks.ModelCheckpoint
    params:
      every_n_train_steps: 3000
      save_weights_only: false
  | Name              | Type                  | Params
------------------------------------------------------------
0 | model             | OpenAIWrapper         | 1.5 B
1 | denoiser          | Denoiser              | 0
2 | conditioner       | GeneralConditioner    | 1.1 B
3 | first_stage_model | AutoencoderKL         | 83.7 M
4 | loss_fn           | StandardDiffusionLoss | 0
------------------------------------------------------------
1.5 B     Trainable params
1.2 B     Non-trainable params
2.7 B     Total params
5,438.442 Total estimated model params size (MB)
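The size line is consistent with fp16 weights: parameters times two bytes per parameter, reported in units of 10^6 bytes. A quick check (the exact parameter count is back-computed from the printed size, so it is an assumption):

    total_params = 2_719_221_000  # assumed: 5,438.442 MB * 1e6 / 2 bytes per param
    bytes_per_param = 2           # precision: 16 -> fp16
    print(total_params * bytes_per_param / 1e6)  # 5438.442, matching the summary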
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:617: UserWarning: Checkpoint directory outputs/logs/train-v01/2024-09-29T13-20-04_train-v01_00/checkpoints exists and is not empty.
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
[09/29 13:21:13 VTDM]: Epoch: 0, batch_num: inf
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/pytorch_lightning/utilities/data.py:56: UserWarning: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 1. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.
  warning_cache.warn(
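As the warning suggests, passing the batch size to self.log avoids the inference. A sketch of a training step doing so (the method body and loss call are hypothetical; batch_size=1 matches the data config):

    # Inside the LightningModule:
    def training_step(self, batch, batch_idx):
        loss = self.compute_loss(batch)  # hypothetical loss call
        self.log("train/loss", loss, batch_size=1)
        return loss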
############################## Sampling setting ##############################
Sampler: EulerEDMSampler
Discretization: EDMDiscretization
Guider: LinearPredictionGuider
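EDMDiscretization spaces the sampler's noise levels with the schedule of Karras et al.; sigma_max=700.0 comes from the config, while sigma_min and rho below are the usual defaults, assumed here. The bar likely reads 26 steps because a final sigma of 0 is appended to the 25 configured steps.

    import torch

    def edm_sigmas(n=25, sigma_min=0.002, sigma_max=700.0, rho=7.0):
        # sigma_i = (max^(1/rho) + i/(n-1) * (min^(1/rho) - max^(1/rho)))^rho
        ramp = torch.linspace(0, 1, n)
        min_inv, max_inv = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
        return (max_inv + ramp * (min_inv - max_inv)) ** rho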
Sampling with EulerEDMSampler for 26 steps:   0%|          | 0/26 [00:00<?, ?it/s]
/mnt/afs_intern/yanghaibo/installed/anaconda3/envs/general/lib/python3.10/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Sampling with EulerEDMSampler for 26 steps:  96%|██████████████████████████████████████████████████ | 25/26 [00:36<00:01, 1.48s/it]
[09/29 13:22:36 VTDM]: [Epoch 0] [Batch 5/inf 16.49 s/batch] => loss: 0.025749
[09/29 13:22:48 VTDM]: [Epoch 0] [Batch 10/inf 9.47 s/batch] => loss: 0.034793
Average Epoch time: 94.71 seconds
Average Peak memory 43059.05 MiB
[09/29 13:23:36 VTDM]: Epoch: 1, batch_num: inf
############################## Sampling setting ##############################
Sampler: EulerEDMSampler
Discretization: EDMDiscretization
Guider: LinearPredictionGuider
Sampling with EulerEDMSampler for 26 steps:  96%|██████████████████████████████████████████████████ | 25/26 [00:25<00:01, 1.02s/it]
[09/29 13:24:19 VTDM]: [Epoch 1] [Batch 5/inf 8.63 s/batch] => loss: 0.030595
[09/29 13:24:31 VTDM]: [Epoch 1] [Batch 10/inf 5.54 s/batch] => loss: 0.113596
Average Epoch time: 55.36 seconds
Average Peak memory 43059.05 MiB
[09/29 13:25:18 VTDM]: Epoch: 2, batch_num: inf
############################## Sampling setting ##############################
Sampler: EulerEDMSampler
Discretization: EDMDiscretization
Guider: LinearPredictionGuider
Sampling with EulerEDMSampler for 26 steps:  96%|██████████████████████████████████████████████████ | 25/26 [00:25<00:01, 1.02s/it]
[09/29 13:26:01 VTDM]: [Epoch 2] [Batch 5/inf 8.58 s/batch] => loss: 0.127297
[09/29 13:26:13 VTDM]: [Epoch 2] [Batch 10/inf 5.51 s/batch] => loss: 0.039552
Average Epoch time: 55.14 seconds
Average Peak memory 43059.05 MiB
[09/29 13:26:59 VTDM]: Epoch: 3, batch_num: inf
############################## Sampling setting ##############################