tags:
  - generated_from_trainer
model-index:
  - name: seq2seq_huggingface_mix_results
    results: []

seq2seq_huggingface_mix_results

This model is a fine-tuned version of a base model that the card does not name, trained on an unknown dataset. It achieves the following result on the evaluation set:

  • Loss: 7.0175
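To make the reported loss easier to interpret, it can be converted to a perplexity. The snippet below is a minimal sketch that assumes the loss is a mean per-token cross-entropy measured in nats, which is the usual convention for sequence-to-sequence models trained with the Trainer but is not stated in this card.

```python
import math

# Final evaluation loss reported in the card.
eval_loss = 7.0175

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.0f}")
```

Under that assumption the model's perplexity is a little over 1,100, meaning it is on average about as uncertain as a uniform choice among roughly that many tokens.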

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 48
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
  • mixed_precision_training: Native AMP
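The relationship between the batch settings and the shape of the learning-rate schedule can be sketched in plain Python. This is an illustration under stated assumptions, not the training script: it assumes a single device (so the effective batch is the per-device batch times the accumulation steps) and about 625 optimizer steps in total, an estimate based on the log ending at step 620 near epoch 3. The function name `linear_lr` is hypothetical.

```python
# Values copied from the hyperparameter list.
BASE_LR = 5e-5
TRAIN_BATCH_SIZE = 12
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500

# Assumption: one device, so 12 * 4 reproduces total_train_batch_size.
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: roughly 625 optimizer steps overall (estimated, not from the card).
TOTAL_STEPS = 625


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for step in (10, 250, 500, 620):
    print(step, linear_lr(step))
```

If the total-step estimate is close, the 500 warmup steps cover most of the run, so the learning rate reaches its 5e-05 peak only near the end and is below it for the rest of training.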

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 10.5072 | 0.0480 | 10 | 10.4491 |
| 10.3574 | 0.0959 | 20 | 10.1991 |
| 10.0831 | 0.1439 | 30 | 9.8790 |
| 9.7946 | 0.1918 | 40 | 9.5780 |
| 9.5118 | 0.2398 | 50 | 9.3344 |
| 9.3333 | 0.2878 | 60 | 9.1722 |
| 9.1888 | 0.3357 | 70 | 9.0610 |
| 9.0913 | 0.3837 | 80 | 8.9742 |
| 9.0007 | 0.4317 | 90 | 8.9005 |
| 8.9134 | 0.4796 | 100 | 8.8328 |
| 8.8583 | 0.5276 | 110 | 8.7615 |
| 8.7722 | 0.5755 | 120 | 8.6873 |
| 8.7092 | 0.6235 | 130 | 8.6137 |
| 8.6223 | 0.6715 | 140 | 8.5340 |
| 8.5312 | 0.7194 | 150 | 8.4538 |
| 8.4582 | 0.7674 | 160 | 8.3681 |
| 8.3748 | 0.8153 | 170 | 8.2801 |
| 8.2637 | 0.8633 | 180 | 8.1936 |
| 8.1704 | 0.9113 | 190 | 8.1001 |
| 8.0697 | 0.9592 | 200 | 8.0079 |
| 7.9792 | 1.0072 | 210 | 7.9126 |
| 7.9 | 1.0552 | 220 | 7.8175 |
| 7.8134 | 1.1031 | 230 | 7.7236 |
| 7.7153 | 1.1511 | 240 | 7.6328 |
| 7.6087 | 1.1990 | 250 | 7.5477 |
| 7.5328 | 1.2470 | 260 | 7.4634 |
| 7.4347 | 1.2950 | 270 | 7.3862 |
| 7.3531 | 1.3429 | 280 | 7.3179 |
| 7.3059 | 1.3909 | 290 | 7.2513 |
| 7.2403 | 1.4388 | 300 | 7.1955 |
| 7.2128 | 1.4868 | 310 | 7.1506 |
| 7.1508 | 1.5348 | 320 | 7.1105 |
| 7.1104 | 1.5827 | 330 | 7.0835 |
| 7.067 | 1.6307 | 340 | 7.0655 |
| 7.0594 | 1.6787 | 350 | 7.0558 |
| 7.0591 | 1.7266 | 360 | 7.0411 |
| 7.0129 | 1.7746 | 370 | 7.0381 |
| 7.0107 | 1.8225 | 380 | 7.0344 |
| 7.0549 | 1.8705 | 390 | 7.0268 |
| 7.0358 | 1.9185 | 400 | 7.0249 |
| 7.0395 | 1.9664 | 410 | 7.0242 |
| 7.0105 | 2.0144 | 420 | 7.0215 |
| 7.0113 | 2.0624 | 430 | 7.0259 |
| 6.9985 | 2.1103 | 440 | 7.0213 |
| 7.0218 | 2.1583 | 450 | 7.0218 |
| 6.9735 | 2.2062 | 460 | 7.0275 |
| 7.0132 | 2.2542 | 470 | 7.0254 |
| 7.0241 | 2.3022 | 480 | 7.0219 |
| 7.0127 | 2.3501 | 490 | 7.0238 |
| 6.9644 | 2.3981 | 500 | 7.0249 |
| 7.0103 | 2.4460 | 510 | 7.0259 |
| 7.006 | 2.4940 | 520 | 7.0266 |
| 6.9882 | 2.5420 | 530 | 7.0235 |
| 7.0016 | 2.5899 | 540 | 7.0235 |
| 7.002 | 2.6379 | 550 | 7.0217 |
| 6.9782 | 2.6859 | 560 | 7.0196 |
| 6.9833 | 2.7338 | 570 | 7.0198 |
| 6.9967 | 2.7818 | 580 | 7.0202 |
| 6.9644 | 2.8297 | 590 | 7.0196 |
| 6.9825 | 2.8777 | 600 | 7.0199 |
| 7.0097 | 2.9257 | 610 | 7.0178 |
| 6.9909 | 2.9736 | 620 | 7.0175 |
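Because the card does not describe the dataset, the Epoch and Step columns are the only hint about its size. The sketch below estimates optimizer steps per epoch from a few logged rows and, assuming each step consumes the reported total_train_batch_size of 48 examples, a rough training-set size. The result is an estimate from rounded log values, not a figure stated by the author.

```python
# A few (epoch, step) pairs copied from the training log.
log = [(0.0480, 10), (0.9592, 200), (1.9664, 410), (2.9736, 620)]

# Steps per epoch implied by each row; the rows should roughly agree.
steps_per_epoch = [step / epoch for epoch, step in log]
mean_steps = sum(steps_per_epoch) / len(steps_per_epoch)

# Assumption: every optimizer step uses 48 examples (total_train_batch_size).
TOTAL_TRAIN_BATCH_SIZE = 48
approx_examples = mean_steps * TOTAL_TRAIN_BATCH_SIZE

print(f"steps per epoch ~ {mean_steps:.1f}")
print(f"training examples ~ {approx_examples:.0f}")
```

The rows agree on a little over 208 steps per epoch, which points to a training set on the order of 10,000 examples under the stated assumption.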

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
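When trying to reproduce the run, it can help to compare the local environment against the versions listed above. The helper below is a small sketch using the standard library; the function name `check_versions` is hypothetical, and the PyPI distribution names (for example `torch` for Pytorch) are assumptions rather than something the card specifies.

```python
from importlib import metadata

# Versions listed in the card; distribution names are assumed.
EXPECTED = {
    "transformers": "4.40.2",
    "torch": "2.3.0",  # the card lists the CUDA build 2.3.0+cu121
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Compare on the public version, ignoring a local suffix such as +cu121.
        matches = have is not None and have.split("+")[0] == want
        report[name] = (have, want, matches)
    return report


for package, (have, want, ok) in check_versions().items():
    print(f"{package}: installed={have} expected={want} match={ok}")
```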