End of training
f7be570
-
attn_layer_mapper=last, attn_loss_fn=mse, attn_weight=1.0, lr_scheduler_type=cosine, warmup_ratio=0.5
Training in progress, step 61875
-
attn_layer_mapper=layer-2, attn_loss_fn=cos, attn_weight=1.0, lr_scheduler_type=cosine, warmup_ratio=0.5
End of training
-
attn_layer_mapper=layer-2, attn_loss_fn=mse, attn_weight=1.0, lr_scheduler_type=cosine, warmup_ratio=0.5
Training in progress, step 61875
-
dataset_sample_size=1000000
End of training
-
lr_scheduler_type=cosine, warmup_ratio=0.5
Training in progress, step 61875
-
lr_scheduler_type=linear, warmup_ratio=0.5
Training in progress, step 61875
-
0 Bytes
Training in progress, step 61875
-
29.7 MB
End of training
-
588 Bytes
Training in progress, step 61875