cs_mT5_0.01_100_v0.1

This model is a fine-tuned version of google/mt5-base on an unknown dataset. At the end of training (epoch 100, step 600), it achieves the following results on the evaluation set:

  • Loss: 7.2340
  • Bleu: 0.3036
  • Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
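
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay under these settings. It assumes no warmup (the card lists none) and 600 total optimizer steps, the last step in the training results table. The function name `linear_lr` and its parameters are illustrative and not taken from the training script.

```python
def linear_lr(step: int,
              base_lr: float = 0.01,
              total_steps: int = 600,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under a linear schedule.

    Assumptions (not confirmed by the card): warmup_steps is 0 and
    total_steps is 600 (100 epochs x 6 steps per epoch).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 150, 300, 450, 600):
        print(f"step {step:3d}: lr = {linear_lr(step):.4f}")
```

Under these assumptions the rate starts at 0.01, is halved by step 300, and reaches zero at step 600.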

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---------------|-------|------|-----------------|------|---------|
| 6.6837 | 1.0 | 6 | 11.3036 | 0.2013 | 19.0 |
| 8.0402 | 2.0 | 12 | 7.6042 | 0.0 | 19.0 |
| 5.7261 | 3.0 | 18 | 6.2715 | 0.1442 | 19.0 |
| 5.3416 | 4.0 | 24 | 6.1049 | 0.0 | 19.0 |
| 5.8512 | 5.0 | 30 | 5.9636 | 0.7212 | 19.0 |
| 5.5121 | 6.0 | 36 | 6.1024 | 0.0 | 19.0 |
| 5.4593 | 7.0 | 42 | 5.8880 | 0.0 | 19.0 |
| 5.9207 | 8.0 | 48 | 5.7795 | 0.1689 | 19.0 |
| 4.8391 | 9.0 | 54 | 5.8944 | 0.7212 | 19.0 |
| 5.0744 | 10.0 | 60 | 5.8163 | 0.7212 | 19.0 |
| 4.3596 | 11.0 | 66 | 5.6633 | 0.7212 | 19.0 |
| 4.4986 | 12.0 | 72 | 5.7046 | 0.7212 | 19.0 |
| 3.7398 | 13.0 | 78 | 5.7036 | 0.1689 | 19.0 |
| 4.3772 | 14.0 | 84 | 5.6193 | 0.7212 | 19.0 |
| 4.3643 | 15.0 | 90 | 5.6598 | 0.7212 | 19.0 |
| 4.1574 | 16.0 | 96 | 5.7247 | 0.7212 | 19.0 |
| 4.1304 | 17.0 | 102 | 5.7906 | 0.7212 | 19.0 |
| 4.1503 | 18.0 | 108 | 5.6421 | 0.1689 | 19.0 |
| 4.769 | 19.0 | 114 | 5.5631 | 0.7212 | 19.0 |
| 4.7648 | 20.0 | 120 | 5.9913 | 0.0 | 19.0 |
| 4.076 | 21.0 | 126 | 5.8300 | 0.7212 | 19.0 |
| 4.5435 | 22.0 | 132 | 5.7988 | 0.7212 | 19.0 |
| 4.2224 | 23.0 | 138 | 5.7900 | 0.7212 | 19.0 |
| 3.7953 | 24.0 | 144 | 6.0687 | 0.011 | 19.0 |
| 4.0312 | 25.0 | 150 | 5.8321 | 0.1689 | 19.0 |
| 3.4781 | 26.0 | 156 | 5.8820 | 0.7984 | 19.0 |
| 4.0509 | 27.0 | 162 | 5.9177 | 0.7212 | 19.0 |
| 3.8217 | 28.0 | 168 | 5.7663 | 0.7861 | 19.0 |
| 4.1972 | 29.0 | 174 | 6.0547 | 0.9173 | 19.0 |
| 3.9588 | 30.0 | 180 | 5.7790 | 0.7212 | 19.0 |
| 3.8624 | 31.0 | 186 | 5.8604 | 0.1916 | 19.0 |
| 3.7053 | 32.0 | 192 | 5.9171 | 0.7212 | 19.0 |
| 4.03 | 33.0 | 198 | 5.8490 | 0.7212 | 19.0 |
| 3.3214 | 34.0 | 204 | 6.3967 | 0.7212 | 19.0 |
| 3.8343 | 35.0 | 210 | 5.7936 | 0.0 | 19.0 |
| 3.3124 | 36.0 | 216 | 5.8793 | 0.7663 | 19.0 |
| 3.7071 | 37.0 | 222 | 6.1326 | 0.0957 | 8.0 |
| 3.6547 | 38.0 | 228 | 5.9072 | 0.8029 | 19.0 |
| 3.4187 | 39.0 | 234 | 5.8807 | 0.5047 | 19.0 |
| 3.953 | 40.0 | 240 | 5.8663 | 0.7923 | 19.0 |
| 4.0113 | 41.0 | 246 | 6.1256 | 0.7212 | 19.0 |
| 4.2969 | 42.0 | 252 | 6.0113 | 0.1689 | 19.0 |
| 3.9081 | 43.0 | 258 | 5.9222 | 0.0 | 15.9048 |
| 3.7646 | 44.0 | 264 | 5.9990 | 0.7212 | 19.0 |
| 3.5407 | 45.0 | 270 | 6.2920 | 0.0945 | 7.0 |
| 2.8075 | 46.0 | 276 | 6.1092 | 0.4815 | 19.0 |
| 3.9057 | 47.0 | 282 | 6.1175 | 1.0006 | 19.0 |
| 4.1845 | 48.0 | 288 | 6.2553 | 0.8147 | 19.0 |
| 3.4686 | 49.0 | 294 | 6.1979 | 0.7796 | 19.0 |
| 3.029 | 50.0 | 300 | 6.1064 | 0.7771 | 19.0 |
| 3.62 | 51.0 | 306 | 5.9443 | 0.7212 | 19.0 |
| 3.719 | 52.0 | 312 | 6.3162 | 0.7212 | 19.0 |
| 3.4713 | 53.0 | 318 | 5.9465 | 0.7212 | 19.0 |
| 3.675 | 54.0 | 324 | 6.1606 | 0.3501 | 19.0 |
| 3.518 | 55.0 | 330 | 6.1223 | 0.1689 | 19.0 |
| 3.3729 | 56.0 | 336 | 6.0394 | 1.3618 | 19.0 |
| 2.7827 | 57.0 | 342 | 6.3169 | 0.7212 | 19.0 |
| 3.7061 | 58.0 | 348 | 6.4504 | 1.694 | 19.0 |
| 3.4929 | 59.0 | 354 | 6.3042 | 0.7475 | 19.0 |
| 2.1424 | 60.0 | 360 | 6.3536 | 0.8628 | 19.0 |
| 2.787 | 61.0 | 366 | 6.3339 | 0.0 | 19.0 |
| 3.6486 | 62.0 | 372 | 6.4380 | 0.1023 | 19.0 |
| 3.8631 | 63.0 | 378 | 6.3261 | 0.7212 | 19.0 |
| 3.4476 | 64.0 | 384 | 6.2478 | 1.2825 | 19.0 |
| 3.256 | 65.0 | 390 | 6.4766 | 0.5017 | 19.0 |
| 3.6114 | 66.0 | 396 | 6.4519 | 0.7212 | 19.0 |
| 3.8405 | 67.0 | 402 | 6.3538 | 0.4744 | 19.0 |
| 3.3164 | 68.0 | 408 | 6.0134 | 0.3725 | 19.0 |
| 3.4129 | 69.0 | 414 | 6.5988 | 0.2135 | 19.0 |
| 3.693 | 70.0 | 420 | 6.4498 | 0.1689 | 19.0 |
| 2.9521 | 71.0 | 426 | 6.2916 | 1.3636 | 19.0 |
| 3.6362 | 72.0 | 432 | 6.3040 | 0.3063 | 19.0 |
| 3.6713 | 73.0 | 438 | 6.3731 | 0.8106 | 19.0 |
| 3.2562 | 74.0 | 444 | 6.3822 | 0.9407 | 19.0 |
| 2.4132 | 75.0 | 450 | 6.5435 | 0.9407 | 19.0 |
| 3.4504 | 76.0 | 456 | 6.7828 | 0.8829 | 19.0 |
| 3.282 | 77.0 | 462 | 6.6479 | 1.4788 | 19.0 |
| 3.4199 | 78.0 | 468 | 6.6536 | 0.0761 | 6.0 |
| 3.4234 | 79.0 | 474 | 6.5193 | 0.4172 | 19.0 |
| 3.0937 | 80.0 | 480 | 6.7476 | 0.5603 | 19.0 |
| 2.9563 | 81.0 | 486 | 6.6885 | 1.5178 | 19.0 |
| 3.1052 | 82.0 | 492 | 6.6320 | 1.3064 | 19.0 |
| 2.7674 | 83.0 | 498 | 6.6363 | 0.7892 | 19.0 |
| 2.6265 | 84.0 | 504 | 6.6629 | 1.5199 | 19.0 |
| 2.3116 | 85.0 | 510 | 6.6467 | 0.0 | 19.0 |
| 3.0439 | 86.0 | 516 | 6.7820 | 0.9326 | 19.0 |
| 2.7406 | 87.0 | 522 | 6.9067 | 1.2025 | 19.0 |
| 2.4509 | 88.0 | 528 | 6.9738 | 1.0657 | 19.0 |
| 2.8186 | 89.0 | 534 | 7.1507 | 0.4574 | 19.0 |
| 2.6713 | 90.0 | 540 | 7.0799 | 0.4527 | 19.0 |
| 2.6231 | 91.0 | 546 | 7.0459 | 0.646 | 19.0 |
| 3.2357 | 92.0 | 552 | 7.0238 | 0.525 | 19.0 |
| 2.8834 | 93.0 | 558 | 7.0185 | 0.5206 | 19.0 |
| 1.7973 | 94.0 | 564 | 7.0711 | 0.8153 | 19.0 |
| 1.9995 | 95.0 | 570 | 7.1263 | 0.3015 | 19.0 |
| 2.2875 | 96.0 | 576 | 7.1877 | 0.3025 | 19.0 |
| 1.8547 | 97.0 | 582 | 7.2062 | 0.3025 | 19.0 |
| 1.5572 | 98.0 | 588 | 7.2270 | 0.5076 | 19.0 |
| 1.7653 | 99.0 | 594 | 7.2347 | 0.3025 | 19.0 |
| 2.6411 | 100.0 | 600 | 7.2340 | 0.3036 | 19.0 |
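
In the log above, validation loss reaches its minimum of 5.5631 at epoch 19 and climbs to 7.2340 by epoch 100 while training loss keeps falling, a pattern that usually points to overfitting. The reported final-epoch metrics are therefore not the best ones in the run. The sketch below shows one way to pick a checkpoint from such a log. It uses only a four-row excerpt copied from the table, and the names `EvalRow` and `parse_rows` are hypothetical helpers, not part of the training code.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    bleu: float


# Excerpt of the training log (epochs 1, 19, 58, 100), in column order:
# training loss, epoch, step, validation loss, BLEU, generation length.
ROWS_TEXT = """\
6.6837 1.0 6 11.3036 0.2013 19.0
4.769 19.0 114 5.5631 0.7212 19.0
3.7061 58.0 348 6.4504 1.694 19.0
2.6411 100.0 600 7.2340 0.3036 19.0
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated log rows, skipping malformed lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue
        _train_loss, epoch, step, val_loss, bleu, _gen_len = parts
        rows.append(EvalRow(int(float(epoch)), int(step),
                            float(val_loss), float(bleu)))
    return rows


rows = parse_rows(ROWS_TEXT)
best_by_loss = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r.bleu)      # highest BLEU

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest BLEU:", best_by_bleu)
```

The two criteria disagree here: the lowest validation loss is at epoch 19, while the highest BLEU (1.694) is at epoch 58. Which checkpoint to keep depends on the metric that matters for the intended use.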

Framework versions

  • Transformers 4.35.2
  • PyTorch 1.13.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size: 582M params (Safetensors, tensor type F32)