|
|
[2025-10-24 11:27:55,703][main][INFO] - Will write tensorboard logs inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/tensorboard_logs |
|
|
[2025-10-24 11:27:55,722][main][INFO] - Runtime at /workspace/DC_SSDAE |
|
|
[2025-10-24 11:27:55,723][main][INFO] - Running inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM |
|
|
[2025-10-24 11:27:55,724][main][INFO] - Running args: ['main.py', 'run_name=train_enc_vq_f8c4_FM', 'dataset.im_size=128', 'dataset.aug_scale=2', 'training.epochs=20', 'dc_ssdae.encoder_train=true'] |
|
|
[2025-10-24 11:27:55,725][main][INFO] - Command: 'main.py' 'run_name=train_enc_vq_f8c4_FM' 'dataset.im_size=128' 'dataset.aug_scale=2' 'training.epochs=20' 'dc_ssdae.encoder_train=true' |
|
|
[2025-10-24 11:27:55,726][main][INFO] - Accelerator with 8 processes, running on cuda:0 |
|
|
[2025-10-24 11:27:55,729][main][INFO] - Hydra configuration: |
|
|
seed: 0 |
|
|
task: train |
|
|
runtime_path: ${hydra:runtime.cwd} |
|
|
ckpt_dir: ${runtime_path}/runs |
|
|
run_name: train_enc_vq_f8c4_FM |
|
|
cache_dir: ${ckpt_dir}/cache |
|
|
run_dir: ${ckpt_dir}/jobs/${run_name} |
|
|
checkpoint_path: ${run_dir}/checkpoints |
|
|
dataset: |
|
|
imagenet_root: imagenet_data |
|
|
im_size: 128 |
|
|
batch_size: 192 |
|
|
aug_scale: 2 |
|
|
limit: null |
|
|
distill_teacher: false |
|
|
dc_ssdae: |
|
|
compile: false |
|
|
checkpoint: null |
|
|
encoder: f8c4 |
|
|
encoder_checkpoint: null |
|
|
encoder_train: true |
|
|
decoder: S |
|
|
trainer_type: FM |
|
|
encoder_type: vq |
|
|
sampler: |
|
|
steps: 10 |
|
|
ema: |
|
|
decay: 0.999 |
|
|
start_iter: 50000 |
|
|
aux_losses: |
|
|
compile: ${dc_ssdae.compile} |
|
|
repa: |
|
|
i_extract: 4 |
|
|
n_layers: 2 |
|
|
lpips: true |
|
|
training: |
|
|
sdpa_kernel: 2 |
|
|
mixed_precision: bf16 |
|
|
grad_accumulate: 1 |
|
|
grad_clip: 0.1 |
|
|
epochs: 20 |
|
|
eval_freq: 1 |
|
|
save_on_best: FID |
|
|
log_freq: 100 |
|
|
lr: 0.0003 |
|
|
weight_decay: 0.001 |
|
|
losses: |
|
|
diffusion: 1 |
|
|
repa: 0.25 |
|
|
lpips: 0.5 |
|
|
kl: 1.0e-06 |
|
|
show_samples: 8 |
|
|
|
|
|
|
|
|
|
|
|
[2025-10-24 11:28:09,494][main][INFO] - Loaded ImageNet dataset: {'train': Dataset ImageNet |
|
|
Number of datapoints: 1279867 |
|
|
Root location: ../../../imagenet_data |
|
|
Split: train |
|
|
StandardTransform |
|
|
Transform: Compose( |
|
|
RandomResize(min_size=128, max_size=256, interpolation=InterpolationMode.LANCZOS, antialias=True) |
|
|
RandomCrop(size=(128, 128), pad_if_needed=False, fill=0, padding_mode=constant) |
|
|
RandomHorizontalFlip(p=0.5) |
|
|
ToImage() |
|
|
ToDtype(scale=True) |
|
|
Normalize(mean=[0.5], std=[0.5], inplace=False) |
|
|
), 'test': Dataset ImageNet |
|
|
Number of datapoints: 49950 |
|
|
Root location: ../../../imagenet_data |
|
|
Split: validation |
|
|
StandardTransform |
|
|
Transform: Compose( |
|
|
Resize(size=[128], interpolation=InterpolationMode.BILINEAR, antialias=True) |
|
|
CenterCrop(size=(128, 128)) |
|
|
ToImage() |
|
|
ToDtype(scale=True) |
|
|
Normalize(mean=[0.5], std=[0.5], inplace=False) |
|
|
)} |
|
|
[2025-10-24 11:28:18,537][main][INFO] - ae parameters count: |
|
|
[2025-10-24 11:28:18,540][main][INFO] - Total: |
|
|
[2025-10-24 11:28:18,541][main][INFO] - - encoder: |
|
|
[2025-10-24 11:28:18,542][main][INFO] - - conv_in: |
|
|
[2025-10-24 11:28:18,543][main][INFO] - - down: |
|
|
[2025-10-24 11:28:18,543][main][INFO] - - mid: |
|
|
[2025-10-24 11:28:18,544][main][INFO] - - norm_out: |
|
|
[2025-10-24 11:28:18,545][main][INFO] - - act_out: |
|
|
[2025-10-24 11:28:18,545][main][INFO] - - conv_out: |
|
|
[2025-10-24 11:28:18,546][main][INFO] - - out_proj: |
|
|
[2025-10-24 11:28:18,547][main][INFO] - - decoder: |
|
|
[2025-10-24 11:28:18,548][main][INFO] - - conv_in_img: |
|
|
[2025-10-24 11:28:18,548][main][INFO] - - conv_in_z: |
|
|
[2025-10-24 11:28:18,549][main][INFO] - - conv_in: |
|
|
[2025-10-24 11:28:18,550][main][INFO] - - batch_norm_z: |
|
|
[2025-10-24 11:28:18,550][main][INFO] - - time_proj: |
|
|
[2025-10-24 11:28:18,551][main][INFO] - - time_embedding: |
|
|
[2025-10-24 11:28:18,551][main][INFO] - - ada_ctx_proj: |
|
|
[2025-10-24 11:28:18,552][main][INFO] - - down_blocks: |
|
|
[2025-10-24 11:28:18,553][main][INFO] - - mid_block: |
|
|
[2025-10-24 11:28:18,554][main][INFO] - - up_blocks: |
|
|
[2025-10-24 11:28:18,554][main][INFO] - - conv_norm_out: |
|
|
[2025-10-24 11:28:18,555][main][INFO] - - conv_out_act: |
|
|
[2025-10-24 11:28:18,555][main][INFO] - - conv_out: |
|
|
[2025-10-24 11:28:18,557][main][INFO] - ae: EMAWrapper( |
|
|
(model): DistributedDataParallel( |
|
|
(module): DC_SSDAE( |
|
|
(encoder): VQEncoder( |
|
|
(conv_in): Conv2d(3, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(down): ModuleList( |
|
|
(0): Module( |
|
|
(block): ModuleList( |
|
|
(0-1): 2 x VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 128, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 128, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
(attn): ModuleList() |
|
|
(downsample): VQGDownsample( |
|
|
(conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2)) |
|
|
) |
|
|
) |
|
|
(1): Module( |
|
|
(block): ModuleList( |
|
|
(0): VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 128, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 256, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nin_shortcut): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(1): VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 256, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 256, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
(attn): ModuleList() |
|
|
(downsample): VQGDownsample( |
|
|
(conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2)) |
|
|
) |
|
|
) |
|
|
(2): Module( |
|
|
(block): ModuleList( |
|
|
(0): VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 256, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nin_shortcut): Conv2d(256, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(1): VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
(attn): ModuleList() |
|
|
(downsample): VQGDownsample( |
|
|
(conv): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2)) |
|
|
) |
|
|
) |
|
|
(3): Module( |
|
|
(block): ModuleList( |
|
|
(0-1): 2 x VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
(attn): ModuleList() |
|
|
) |
|
|
) |
|
|
(mid): Module( |
|
|
(block_1): VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
(attn_1): VQGAttnBlock( |
|
|
(norm): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(q): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
(k): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
(v): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
(proj_out): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(block_2): VQGResnetBlock( |
|
|
(norm1): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act1): SwishActivation() |
|
|
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(norm2): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act2): SwishActivation() |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
(norm_out): GroupNorm(32, 512, eps=1e-06, affine=True) |
|
|
(act_out): SwishActivation() |
|
|
(conv_out): Conv2d(512, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(out_proj): Conv2d(8, 8, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(decoder): UViTDecoder( |
|
|
(conv_in_img): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(conv_in_z): Conv2d(4, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(conv_in): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(batch_norm_z): BatchNorm2d(4, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True) |
|
|
(time_proj): Timesteps() |
|
|
(time_embedding): TimestepEmbedding( |
|
|
(linear_1): Linear(in_features=64, out_features=256, bias=True) |
|
|
(act): SiLU() |
|
|
(linear_2): Linear(in_features=256, out_features=256, bias=True) |
|
|
) |
|
|
(ada_ctx_proj): Sequential( |
|
|
(0): Conv2d(4, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(1): SiLU() |
|
|
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
(down_blocks): ModuleList( |
|
|
(0): DownBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0-1): 2 x ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=128, bias=True) |
|
|
(norm2): GroupNorm(32, 64, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
) |
|
|
) |
|
|
(downsamplers): ModuleList( |
|
|
(0): Downsample2D( |
|
|
(conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(1): DownBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=192, bias=True) |
|
|
(norm2): GroupNorm(32, 96, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(64, 96, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(1): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=192, bias=True) |
|
|
(norm2): GroupNorm(32, 96, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
) |
|
|
) |
|
|
(downsamplers): ModuleList( |
|
|
(0): Downsample2D( |
|
|
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(2): DownBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 192, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(96, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=320, bias=True) |
|
|
(norm2): GroupNorm(32, 160, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(96, 160, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(1): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 320, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=320, bias=True) |
|
|
(norm2): GroupNorm(32, 160, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
) |
|
|
) |
|
|
(downsamplers): ModuleList( |
|
|
(0): Downsample2D( |
|
|
(conv): Conv2d(160, 160, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(3): DownBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0-1): 2 x ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 320, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=320, bias=True) |
|
|
(norm2): GroupNorm(32, 160, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
) |
|
|
) |
|
|
) |
|
|
) |
|
|
(mid_block): UViTMiddleTransformer( |
|
|
(proj_in): Linear(in_features=160, out_features=160, bias=True) |
|
|
(transformer_blocks): ModuleList( |
|
|
(0-7): 8 x TransformerBlock( |
|
|
(norm1): AdaLayerNorm( |
|
|
(silu): SiLU() |
|
|
(linear): Linear(in_features=64, out_features=320, bias=True) |
|
|
(norm): LayerNorm((160,), eps=1e-05, elementwise_affine=False) |
|
|
) |
|
|
(attn1): Attention( |
|
|
(to_q): Linear(in_features=160, out_features=160, bias=False) |
|
|
(to_k): Linear(in_features=160, out_features=160, bias=False) |
|
|
(to_v): Linear(in_features=160, out_features=160, bias=False) |
|
|
(out_proj): Linear(in_features=160, out_features=160, bias=True) |
|
|
(out_drop): Dropout(p=0.0, inplace=False) |
|
|
) |
|
|
(norm2): LayerNorm((160,), eps=1e-05, elementwise_affine=True) |
|
|
(ff): FeedForward( |
|
|
(proj_in_act): GEGLU( |
|
|
(proj): Linear(in_features=160, out_features=1280, bias=True) |
|
|
) |
|
|
(drop): Dropout(p=0.0, inplace=False) |
|
|
(proj_out): Linear(in_features=640, out_features=160, bias=True) |
|
|
) |
|
|
(relative_position_bias): RelativePositionBias() |
|
|
) |
|
|
) |
|
|
(proj_out): Linear(in_features=160, out_features=160, bias=True) |
|
|
(norm): GroupNorm(32, 160, eps=1e-06, affine=True) |
|
|
) |
|
|
(up_blocks): ModuleList( |
|
|
(0): UpBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0-2): 3 x ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 640, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(320, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=320, bias=True) |
|
|
(norm2): GroupNorm(32, 160, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(320, 160, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
) |
|
|
(upsamplers): ModuleList( |
|
|
(0): Upsample2D( |
|
|
(conv): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(1): UpBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0-1): 2 x ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 640, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(320, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=320, bias=True) |
|
|
(norm2): GroupNorm(32, 160, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(320, 160, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(2): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(256, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=320, bias=True) |
|
|
(norm2): GroupNorm(32, 160, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(256, 160, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
) |
|
|
(upsamplers): ModuleList( |
|
|
(0): Upsample2D( |
|
|
(conv): Conv2d(160, 160, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(2): UpBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 512, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(256, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=192, bias=True) |
|
|
(norm2): GroupNorm(32, 96, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(256, 96, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(1): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 384, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(192, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=192, bias=True) |
|
|
(norm2): GroupNorm(32, 96, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(2): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 320, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(160, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=192, bias=True) |
|
|
(norm2): GroupNorm(32, 96, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(160, 96, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
) |
|
|
(upsamplers): ModuleList( |
|
|
(0): Upsample2D( |
|
|
(conv): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(3): UpBlock2D( |
|
|
(resnets): ModuleList( |
|
|
(0): ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 320, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(160, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=128, bias=True) |
|
|
(norm2): GroupNorm(32, 64, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(160, 64, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(1-2): 2 x ResnetBlock2D( |
|
|
(norm1): AdaGroupNorm2D( |
|
|
(ctx_proj): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
(conv1): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(time_emb_proj): Linear(in_features=256, out_features=128, bias=True) |
|
|
(norm2): GroupNorm(32, 64, eps=1e-05, affine=True) |
|
|
(dropout): Dropout(p=0.0, inplace=False) |
|
|
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
(nonlinearity): SiLU() |
|
|
(conv_shortcut): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
) |
|
|
(conv_norm_out): GroupNorm(32, 64, eps=1e-05, affine=True) |
|
|
(conv_out_act): SiLU() |
|
|
(conv_out): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) |
|
|
) |
|
|
) |
|
|
) |
|
|
(ema): EMA(ema_model=DC_SSDAE, decay=0.999, start_iter=50000) |
|
|
) |
|
|
[2025-10-24 11:28:18,558][main][INFO] - aux_losses parameters count: |
|
|
[2025-10-24 11:28:18,559][main][INFO] - Total: |
|
|
[2025-10-24 11:28:18,560][main][INFO] - - repa_loss: |
|
|
[2025-10-24 11:28:18,561][main][INFO] - - lpips_loss: |
|
|
[2025-10-24 11:28:18,561][main][INFO] - aux_losses: DistributedDataParallel( |
|
|
(module): SSDDLosses( |
|
|
(repa_loss): REPALoss( |
|
|
(features_extractor): Frozen(DinoEncoder/Dinov2Model) |
|
|
(repa_mlp): Sequential( |
|
|
(0): Linear(in_features=160, out_features=160, bias=True) |
|
|
(1): SiLU() |
|
|
(2): Linear(in_features=160, out_features=768, bias=True) |
|
|
) |
|
|
(repa_loss): CosineSimilarity() |
|
|
) |
|
|
(lpips_loss): Frozen(LPIPS) |
|
|
) |
|
|
) |
|
|
[2025-10-24 11:28:18,565][main][INFO] - Optimizer for autoencoder: RAdamScheduleFree ( |
|
|
Parameter Group 0 |
|
|
betas: (0.9, 0.999) |
|
|
eps: 1e-08 |
|
|
foreach: True |
|
|
k: 0 |
|
|
lr: 0.0003 |
|
|
lr_max: -1.0 |
|
|
r: 0.0 |
|
|
scheduled_lr: 0.0 |
|
|
silent_sgd_phase: True |
|
|
train_mode: False |
|
|
weight_decay: 0.001 |
|
|
weight_lr_power: 2.0 |
|
|
weight_sum: 0.0 |
|
|
|
|
|
Parameter Group 1 |
|
|
betas: (0.9, 0.999) |
|
|
eps: 1e-08 |
|
|
foreach: True |
|
|
k: 0 |
|
|
lr: 0.0003 |
|
|
lr_max: -1.0 |
|
|
r: 0.0 |
|
|
scheduled_lr: 0.0 |
|
|
silent_sgd_phase: True |
|
|
train_mode: False |
|
|
weight_decay: 0.0 |
|
|
weight_lr_power: 2.0 |
|
|
weight_sum: 0.0 |
|
|
) |
|
|
[2025-10-24 11:28:18,570][main][INFO] - No training state found to resume from None |
|
|
[2025-10-24 11:28:18,571][main][INFO] - ====================== RUNNING TASK train |
|
|
[2025-10-24 11:28:18,572][main][INFO] - Starting training |
|
|
[2025-10-24 11:28:18,572][main][INFO] - Batch size of 192 (24 per GPU, 1 accumulation step(s), 8 process(es)) |
|
|
[2025-10-24 11:28:18,582][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 11:28:18,583][main][INFO] - [T_total=00:00:22 | T_train=00:00:00] Start epoch 0 |
|
|
[2025-10-24 12:31:01,697][main][INFO] - [T_total=01:03:06 | T_train=01:02:43 | T_epoch=01:02:43] End of epoch 0 (6666 steps) train loss 0.379739 |
|
|
[2025-10-24 12:31:01,700][main][INFO] - [Epoch 0] All losses: [[diffusion=0.0877689 ; kl=3611.6 ; lpips=0.251927 ; repa=0.64958]] |
|
|
[2025-10-24 12:34:30,741][main][INFO] - [Epoch 1] Test metrics: [[MSE=14.16 | MAE=0.0884 | LPIPS=0.1332 | PSNR=18.49 | SSIM=0.6156 | dreamsim=0.2237 | FID=21.41]] |
|
|
[2025-10-24 12:34:30,743][main][INFO] - [Epoch 1] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1332 | max_PSNR=18.49 | max_SSIM=0.6156 | min_dreamsim=0.2237 | min_FID=21.41]] |
|
|
[2025-10-24 12:34:30,744][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 12:34:31,976][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 12:34:32,219][main][INFO] - End of epoch timers: [T_train=01:02:43 | T_epoch=01:02:43 | T_eval=00:03:30 | T_total=01:06:36] |
|
|
[2025-10-24 12:34:32,220][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 12:34:34,794][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 12:34:36,825][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 12:34:36,826][main][INFO] - [T_total=01:06:41 | T_train=01:02:43] Start epoch 1 |
|
|
[2025-10-24 13:37:11,223][main][INFO] - [T_total=02:09:15 | T_train=02:05:17 | T_epoch=01:02:34] End of epoch 1 (13332 steps) train loss 0.295457 |
|
|
[2025-10-24 13:37:11,224][main][INFO] - [Epoch 1] All losses: [[diffusion=0.0670763 ; kl=3707.57 ; lpips=0.1699 ; repa=0.55889]] |
|
|
[2025-10-24 13:40:38,432][main][INFO] - [Epoch 2] Test metrics: [[MSE=18.03 | MAE=0.1014 | LPIPS=0.1322 | PSNR=17.44 | SSIM=0.6126 | dreamsim=0.2068 | FID=15.49]] |
|
|
[2025-10-24 13:40:38,434][main][INFO] - [Epoch 2] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1322 | max_PSNR=18.49 | max_SSIM=0.6156 | min_dreamsim=0.2068 | min_FID=15.49]] |
|
|
[2025-10-24 13:40:38,435][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 13:40:39,512][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 13:40:39,760][main][INFO] - End of epoch timers: [T_train=02:05:17 | T_epoch=01:02:34 | T_eval=00:06:58 | T_total=02:12:44] |
|
|
[2025-10-24 13:40:39,762][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 13:40:42,329][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 13:40:44,960][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 13:40:44,961][main][INFO] - [T_total=02:12:49 | T_train=02:05:17] Start epoch 2 |
|
|
[2025-10-24 14:43:20,521][main][INFO] - [T_total=03:15:24 | T_train=03:07:53 | T_epoch=01:02:35] End of epoch 2 (19998 steps) train loss 0.278652 |
|
|
[2025-10-24 14:43:20,523][main][INFO] - [Epoch 2] All losses: [[diffusion=0.0649925 ; kl=3695.06 ; lpips=0.154526 ; repa=0.530805]] |
|
|
[2025-10-24 14:46:47,655][main][INFO] - [Epoch 3] Test metrics: [[MSE=20.47 | MAE=0.1086 | LPIPS=0.13 | PSNR=16.89 | SSIM=0.6123 | dreamsim=0.1954 | FID=12.6]] |
|
|
[2025-10-24 14:46:47,656][main][INFO] - [Epoch 3] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.13 | max_PSNR=18.49 | max_SSIM=0.6156 | min_dreamsim=0.1954 | min_FID=12.6]] |
|
|
[2025-10-24 14:46:47,657][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 14:46:48,722][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 14:46:48,970][main][INFO] - End of epoch timers: [T_train=03:07:53 | T_epoch=01:02:35 | T_eval=00:10:27 | T_total=03:18:53] |
|
|
[2025-10-24 14:46:48,971][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 14:46:51,733][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 14:46:54,361][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 14:46:54,362][main][INFO] - [T_total=03:18:58 | T_train=03:07:53] Start epoch 3 |
|
|
[2025-10-24 15:49:29,065][main][INFO] - [T_total=04:21:33 | T_train=04:10:27 | T_epoch=01:02:34] End of epoch 3 (26664 steps) train loss 0.268908 |
|
|
[2025-10-24 15:49:29,066][main][INFO] - [Epoch 3] All losses: [[diffusion=0.0635387 ; kl=3692.76 ; lpips=0.146519 ; repa=0.513671]] |
|
|
[2025-10-24 15:52:56,207][main][INFO] - [Epoch 4] Test metrics: [[MSE=21.69 | MAE=0.112 | LPIPS=0.127 | PSNR=16.64 | SSIM=0.6152 | dreamsim=0.1867 | FID=10.75]] |
|
|
[2025-10-24 15:52:56,209][main][INFO] - [Epoch 4] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.127 | max_PSNR=18.49 | max_SSIM=0.6156 | min_dreamsim=0.1867 | min_FID=10.75]] |
|
|
[2025-10-24 15:52:56,210][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 15:52:57,298][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 15:52:57,498][main][INFO] - End of epoch timers: [T_train=04:10:27 | T_epoch=01:02:34 | T_eval=00:13:55 | T_total=04:25:01] |
|
|
[2025-10-24 15:52:57,500][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 15:52:59,857][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 15:53:02,578][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 15:53:02,579][main][INFO] - [T_total=04:25:06 | T_train=04:10:27] Start epoch 4 |
|
|
[2025-10-24 16:55:38,098][main][INFO] - [T_total=05:27:42 | T_train=05:13:03 | T_epoch=01:02:35] End of epoch 4 (33330 steps) train loss 0.262561 |
|
|
[2025-10-24 16:55:38,102][main][INFO] - [Epoch 4] All losses: [[diffusion=0.06292 ; kl=3688.83 ; lpips=0.141097 ; repa=0.501614]] |
|
|
[2025-10-24 16:59:05,267][main][INFO] - [Epoch 5] Test metrics: [[MSE=21.7 | MAE=0.1119 | LPIPS=0.1238 | PSNR=16.64 | SSIM=0.6186 | dreamsim=0.1799 | FID=9.549]] |
|
|
[2025-10-24 16:59:05,270][main][INFO] - [Epoch 5] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1238 | max_PSNR=18.49 | max_SSIM=0.6186 | min_dreamsim=0.1799 | min_FID=9.549]] |
|
|
[2025-10-24 16:59:05,271][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 16:59:06,351][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 16:59:06,591][main][INFO] - End of epoch timers: [T_train=05:13:03 | T_epoch=01:02:35 | T_eval=00:17:23 | T_total=05:31:10] |
|
|
[2025-10-24 16:59:06,592][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 16:59:09,275][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 16:59:11,878][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 16:59:11,879][main][INFO] - [T_total=05:31:16 | T_train=05:13:03] Start epoch 5 |
|
|
[2025-10-24 18:01:46,540][main][INFO] - [T_total=06:33:50 | T_train=06:15:37 | T_epoch=01:02:34] End of epoch 5 (39996 steps) train loss 0.257655 |
|
|
[2025-10-24 18:01:46,542][main][INFO] - [Epoch 5] All losses: [[diffusion=0.0621701 ; kl=3687.45 ; lpips=0.137338 ; repa=0.492512]] |
|
|
[2025-10-24 18:05:13,288][main][INFO] - [Epoch 6] Test metrics: [[MSE=21.93 | MAE=0.1125 | LPIPS=0.1213 | PSNR=16.59 | SSIM=0.6218 | dreamsim=0.1746 | FID=8.68]] |
|
|
[2025-10-24 18:05:13,290][main][INFO] - [Epoch 6] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1213 | max_PSNR=18.49 | max_SSIM=0.6218 | min_dreamsim=0.1746 | min_FID=8.68]] |
|
|
[2025-10-24 18:05:13,291][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 18:05:14,398][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 18:05:14,601][main][INFO] - End of epoch timers: [T_train=06:15:37 | T_epoch=01:02:34 | T_eval=00:20:51 | T_total=06:37:18] |
|
|
[2025-10-24 18:05:14,604][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 18:05:17,445][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 18:05:19,868][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 18:05:19,869][main][INFO] - [T_total=06:37:24 | T_train=06:15:37] Start epoch 6 |
|
|
[2025-10-24 19:07:56,898][main][INFO] - [T_total=07:40:01 | T_train=07:18:14 | T_epoch=01:02:37] End of epoch 6 (46662 steps) train loss 0.253725 |
|
|
[2025-10-24 19:07:56,900][main][INFO] - [Epoch 6] All losses: [[diffusion=0.0615326 ; kl=3688.36 ; lpips=0.134359 ; repa=0.485297]] |
|
|
[2025-10-24 19:11:24,094][main][INFO] - [Epoch 7] Test metrics: [[MSE=22.28 | MAE=0.1135 | LPIPS=0.1196 | PSNR=16.52 | SSIM=0.624 | dreamsim=0.1707 | FID=8.082]] |
|
|
[2025-10-24 19:11:24,096][main][INFO] - [Epoch 7] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1196 | max_PSNR=18.49 | max_SSIM=0.624 | min_dreamsim=0.1707 | min_FID=8.082]] |
|
|
[2025-10-24 19:11:24,097][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 19:11:25,201][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 19:11:25,400][main][INFO] - End of epoch timers: [T_train=07:18:14 | T_epoch=01:02:37 | T_eval=00:24:19 | T_total=07:43:29] |
|
|
[2025-10-24 19:11:25,403][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 19:11:28,161][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 19:11:30,853][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 19:11:30,853][main][INFO] - [T_total=07:43:35 | T_train=07:18:14] Start epoch 7 |
|
|
[2025-10-24 20:14:06,645][main][INFO] - [T_total=08:46:10 | T_train=08:20:50 | T_epoch=01:02:35] End of epoch 7 (53328 steps) train loss 0.250756 |
|
|
[2025-10-24 20:14:06,647][main][INFO] - [Epoch 7] All losses: [[diffusion=0.06134 ; kl=3691.13 ; lpips=0.131829 ; repa=0.479243]] |
|
|
[2025-10-24 20:17:33,486][main][INFO] - [Epoch 8] Test metrics: [[MSE=22.07 | MAE=0.1128 | LPIPS=0.1169 | PSNR=16.56 | SSIM=0.6267 | dreamsim=0.1663 | FID=7.509]] |
|
|
[2025-10-24 20:17:33,488][main][INFO] - [Epoch 8] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1169 | max_PSNR=18.49 | max_SSIM=0.6267 | min_dreamsim=0.1663 | min_FID=7.509]] |
|
|
[2025-10-24 20:17:33,489][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 20:17:34,577][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 20:17:34,803][main][INFO] - End of epoch timers: [T_train=08:20:50 | T_epoch=01:02:35 | T_eval=00:27:47 | T_total=08:49:39] |
|
|
[2025-10-24 20:17:34,804][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 20:17:37,556][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 20:17:40,188][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 20:17:40,189][main][INFO] - [T_total=08:49:44 | T_train=08:20:50] Start epoch 8 |
|
|
[2025-10-24 21:20:17,007][main][INFO] - [T_total=09:52:21 | T_train=09:23:27 | T_epoch=01:02:36] End of epoch 8 (59994 steps) train loss 0.248044 |
|
|
[2025-10-24 21:20:17,008][main][INFO] - [Epoch 8] All losses: [[diffusion=0.0607502 ; kl=3693.54 ; lpips=0.130101 ; repa=0.474199]] |
|
|
[2025-10-24 21:23:44,408][main][INFO] - [Epoch 9] Test metrics: [[MSE=21.7 | MAE=0.1117 | LPIPS=0.1145 | PSNR=16.64 | SSIM=0.6294 | dreamsim=0.1627 | FID=7.034]] |
|
|
[2025-10-24 21:23:44,410][main][INFO] - [Epoch 9] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1145 | max_PSNR=18.49 | max_SSIM=0.6294 | min_dreamsim=0.1627 | min_FID=7.034]] |
|
|
[2025-10-24 21:23:44,411][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 21:23:45,508][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 21:23:45,708][main][INFO] - End of epoch timers: [T_train=09:23:27 | T_epoch=01:02:36 | T_eval=00:31:16 | T_total=09:55:50] |
|
|
[2025-10-24 21:23:45,709][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 21:23:48,374][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 21:23:51,020][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 21:23:51,020][main][INFO] - [T_total=09:55:55 | T_train=09:23:27] Start epoch 9 |
|
|
[2025-10-24 22:26:26,704][main][INFO] - [T_total=10:58:31 | T_train=10:26:03 | T_epoch=01:02:35] End of epoch 9 (66660 steps) train loss 0.245806 |
|
|
[2025-10-24 22:26:26,706][main][INFO] - [Epoch 9] All losses: [[diffusion=0.0604599 ; kl=3695.63 ; lpips=0.128417 ; repa=0.469767]] |
|
|
[2025-10-24 22:29:54,018][main][INFO] - [Epoch 10] Test metrics: [[MSE=21.54 | MAE=0.1112 | LPIPS=0.1129 | PSNR=16.67 | SSIM=0.6308 | dreamsim=0.1598 | FID=6.658]] |
|
|
[2025-10-24 22:29:54,020][main][INFO] - [Epoch 10] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1129 | max_PSNR=18.49 | max_SSIM=0.6308 | min_dreamsim=0.1598 | min_FID=6.658]] |
|
|
[2025-10-24 22:29:54,021][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 22:29:55,134][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 22:29:55,337][main][INFO] - End of epoch timers: [T_train=10:26:03 | T_epoch=01:02:35 | T_eval=00:34:44 | T_total=11:01:59] |
|
|
[2025-10-24 22:29:55,338][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 22:29:58,129][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 22:30:00,883][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 22:30:00,884][main][INFO] - [T_total=11:02:05 | T_train=10:26:03] Start epoch 10 |
|
|
[2025-10-24 23:32:38,551][main][INFO] - [T_total=12:04:42 | T_train=11:28:40 | T_epoch=01:02:37] End of epoch 10 (73326 steps) train loss 0.243893 |
|
|
[2025-10-24 23:32:38,553][main][INFO] - [Epoch 10] All losses: [[diffusion=0.0602009 ; kl=3698.58 ; lpips=0.126997 ; repa=0.465981]] |
|
|
[2025-10-24 23:36:06,224][main][INFO] - [Epoch 11] Test metrics: [[MSE=21.29 | MAE=0.1104 | LPIPS=0.1112 | PSNR=16.72 | SSIM=0.6335 | dreamsim=0.1568 | FID=6.331]] |
|
|
[2025-10-24 23:36:06,230][main][INFO] - [Epoch 11] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1112 | max_PSNR=18.49 | max_SSIM=0.6335 | min_dreamsim=0.1568 | min_FID=6.331]] |
|
|
[2025-10-24 23:36:06,231][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-24 23:36:07,086][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-24 23:36:07,296][main][INFO] - End of epoch timers: [T_train=11:28:40 | T_epoch=01:02:37 | T_eval=00:38:13 | T_total=12:08:11] |
|
|
[2025-10-24 23:36:07,298][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-24 23:36:10,288][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-24 23:36:12,899][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-24 23:36:12,900][main][INFO] - [T_total=12:08:17 | T_train=11:28:40] Start epoch 11 |
|
|
[2025-10-25 00:38:51,954][main][INFO] - [T_total=13:10:56 | T_train=12:31:19 | T_epoch=01:02:39] End of epoch 11 (79992 steps) train loss 0.242062 |
|
|
[2025-10-25 00:38:51,955][main][INFO] - [Epoch 11] All losses: [[diffusion=0.0598045 ; kl=3702.03 ; lpips=0.125852 ; repa=0.46252]] |
|
|
[2025-10-25 00:42:19,563][main][INFO] - [Epoch 12] Test metrics: [[MSE=21.05 | MAE=0.1097 | LPIPS=0.1098 | PSNR=16.77 | SSIM=0.6344 | dreamsim=0.1546 | FID=6.035]] |
|
|
[2025-10-25 00:42:19,565][main][INFO] - [Epoch 12] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1098 | max_PSNR=18.49 | max_SSIM=0.6344 | min_dreamsim=0.1546 | min_FID=6.035]] |
|
|
[2025-10-25 00:42:19,566][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-25 00:42:20,661][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-25 00:42:20,894][main][INFO] - End of epoch timers: [T_train=12:31:19 | T_epoch=01:02:39 | T_eval=00:41:42 | T_total=13:14:25] |
|
|
[2025-10-25 00:42:20,895][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-25 00:42:23,582][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-25 00:42:26,171][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-25 00:42:26,172][main][INFO] - [T_total=13:14:30 | T_train=12:31:19] Start epoch 12 |
|
|
[2025-10-25 01:45:03,014][main][INFO] - [T_total=14:17:07 | T_train=13:33:56 | T_epoch=01:02:36] End of epoch 12 (86658 steps) train loss 0.240598 |
|
|
[2025-10-25 01:45:03,015][main][INFO] - [Epoch 12] All losses: [[diffusion=0.0596262 ; kl=3704.82 ; lpips=0.124782 ; repa=0.459504]] |
|
|
[2025-10-25 01:48:30,676][main][INFO] - [Epoch 13] Test metrics: [[MSE=21.07 | MAE=0.1098 | LPIPS=0.1087 | PSNR=16.76 | SSIM=0.6359 | dreamsim=0.1527 | FID=5.793]] |
|
|
[2025-10-25 01:48:30,678][main][INFO] - [Epoch 13] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1087 | max_PSNR=18.49 | max_SSIM=0.6359 | min_dreamsim=0.1527 | min_FID=5.793]] |
|
|
[2025-10-25 01:48:30,679][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-25 01:48:31,757][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-25 01:48:31,960][main][INFO] - End of epoch timers: [T_train=13:33:56 | T_epoch=01:02:36 | T_eval=00:45:11 | T_total=14:20:36] |
|
|
[2025-10-25 01:48:31,961][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-25 01:48:34,485][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-25 01:48:37,247][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-25 01:48:37,248][main][INFO] - [T_total=14:20:41 | T_train=13:33:56] Start epoch 13 |
|
|
[2025-10-25 02:51:12,948][main][INFO] - [T_total=15:23:17 | T_train=14:36:32 | T_epoch=01:02:35] End of epoch 13 (93324 steps) train loss 0.239412 |
|
|
[2025-10-25 02:51:12,949][main][INFO] - [Epoch 13] All losses: [[diffusion=0.0596692 ; kl=3706.94 ; lpips=0.123694 ; repa=0.456758]] |
|
|
[2025-10-25 02:54:40,598][main][INFO] - [Epoch 14] Test metrics: [[MSE=20.86 | MAE=0.1092 | LPIPS=0.1076 | PSNR=16.81 | SSIM=0.6381 | dreamsim=0.1507 | FID=5.573]] |
|
|
[2025-10-25 02:54:40,600][main][INFO] - [Epoch 14] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1076 | max_PSNR=18.49 | max_SSIM=0.6381 | min_dreamsim=0.1507 | min_FID=5.573]] |
|
|
[2025-10-25 02:54:40,605][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-25 02:54:41,487][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-25 02:54:41,728][main][INFO] - End of epoch timers: [T_train=14:36:32 | T_epoch=01:02:35 | T_eval=00:48:39 | T_total=15:26:46] |
|
|
[2025-10-25 02:54:41,731][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-25 02:54:45,053][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-25 02:54:47,715][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-25 02:54:47,717][main][INFO] - [T_total=15:26:52 | T_train=14:36:32] Start epoch 14 |
|
|
[2025-10-25 03:57:24,404][main][INFO] - [T_total=16:29:28 | T_train=15:39:09 | T_epoch=01:02:36] End of epoch 14 (99990 steps) train loss 0.238048 |
|
|
[2025-10-25 03:57:24,406][main][INFO] - [Epoch 14] All losses: [[diffusion=0.0592936 ; kl=3709.87 ; lpips=0.122931 ; repa=0.454315]] |
|
|
[2025-10-25 04:00:51,619][main][INFO] - [Epoch 15] Test metrics: [[MSE=20.7 | MAE=0.1087 | LPIPS=0.1065 | PSNR=16.84 | SSIM=0.6397 | dreamsim=0.149 | FID=5.367]] |
|
|
[2025-10-25 04:00:51,621][main][INFO] - [Epoch 15] Best metrics: [[min_MSE=14.16 | min_MAE=0.0884 | min_LPIPS=0.1065 | max_PSNR=18.49 | max_SSIM=0.6397 | min_dreamsim=0.149 | min_FID=5.367]] |
|
|
[2025-10-25 04:00:51,622][main][DEBUG] - Writing images to disk... |
|
|
[2025-10-25 04:00:52,707][main][DEBUG] - Image(s) saved on disk |
|
|
[2025-10-25 04:00:52,907][main][INFO] - End of epoch timers: [T_train=15:39:09 | T_epoch=01:02:36 | T_eval=00:52:07 | T_total=16:32:57] |
|
|
[2025-10-25 04:00:52,908][main][INFO] - Storing model checkpoint inside /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/last |
|
|
[2025-10-25 04:00:55,517][main][INFO] - Best FID so far, storing a copy of the model checkpoint to /workspace/DC_SSDAE/runs/jobs/train_enc_vq_f8c4_FM/checkpoints/best |
|
|
[2025-10-25 04:00:57,799][main][INFO] - --- |
|
|
|
|
|
|
|
|
[2025-10-25 04:00:57,800][main][INFO] - [T_total=16:33:02 | T_train=15:39:09] Start epoch 15 |
|
|
|