cs_mT5-large2_0.01_50_v0.1

This model is a fine-tuned version of google/mt5-large on an unknown dataset. At the end of training (epoch 50, step 300) it achieves the following results on the evaluation set:

  • Loss: 7.9807
  • Bleu: 0.6899
  • Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
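
As a rough illustration of what the linear scheduler implies for these settings, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the 6 steps per epoch from the Step column of the training results; `linear_lr` and the constant names are hypothetical helpers, not part of the training script.

```python
# Hyperparameters as listed in the card.
LEARNING_RATE = 0.01
NUM_EPOCHS = 50

# Taken from the Step column of the training results (6 optimizer steps per epoch).
STEPS_PER_EPOCH = 6
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 300

# Assumption: no warmup, since the card does not list any warmup setting.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for epoch in (0, 10, 25, 40, 50):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:3d}): lr = {linear_lr(step):.5f}")
```

Under these assumptions the rate starts at 0.01, is halved by epoch 25, and reaches zero at step 300.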

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 2.6703 | 1.0 | 6 | 8.3163 | 0.0 | 19.0 |
| 3.3427 | 2.0 | 12 | 6.5085 | 0.5004 | 19.0 |
| 2.8652 | 3.0 | 18 | 7.0200 | 0.6899 | 19.0 |
| 2.9454 | 4.0 | 24 | 7.3333 | 0.2191 | 19.0 |
| 2.7918 | 5.0 | 30 | 7.5745 | 0.4671 | 12.0 |
| 3.5645 | 6.0 | 36 | 6.3676 | 0.0 | 19.0 |
| 3.0885 | 7.0 | 42 | 7.0359 | 0.6908 | 19.0 |
| 3.5374 | 8.0 | 48 | 6.8709 | 0.1154 | 12.3333 |
| 3.3746 | 9.0 | 54 | 6.4090 | 0.0 | 19.0 |
| 2.5927 | 10.0 | 60 | 6.7357 | 0.6381 | 19.0 |
| 2.581 | 11.0 | 66 | 6.5953 | 0.2635 | 19.0 |
| 4.1786 | 12.0 | 72 | 7.8617 | 0.2068 | 19.0 |
| 2.8545 | 13.0 | 78 | 6.2553 | 0.1628 | 19.0 |
| 2.8925 | 14.0 | 84 | 6.6297 | 0.612 | 19.0 |
| 3.2424 | 15.0 | 90 | 7.0312 | 0.7377 | 19.0 |
| 2.379 | 16.0 | 96 | 6.9121 | 0.6562 | 19.0 |
| 2.2356 | 17.0 | 102 | 6.9446 | 0.2759 | 15.0 |
| 3.0548 | 18.0 | 108 | 7.5770 | 0.1529 | 19.0 |
| 2.4637 | 19.0 | 114 | 7.1444 | 0.4497 | 19.0 |
| 2.96 | 20.0 | 120 | 6.8181 | 0.3779 | 11.0 |
| 2.2016 | 21.0 | 126 | 6.6893 | 0.6562 | 19.0 |
| 1.9774 | 22.0 | 132 | 7.3802 | 0.3807 | 19.0 |
| 1.6734 | 23.0 | 138 | 6.7319 | 0.5405 | 19.0 |
| 3.1958 | 24.0 | 144 | 7.1645 | 1.2379 | 19.0 |
| 3.1363 | 25.0 | 150 | 7.7097 | 0.3794 | 19.0 |
| 2.4353 | 26.0 | 156 | 6.9324 | 0.2522 | 14.0 |
| 2.8675 | 27.0 | 162 | 6.7989 | 0.1488 | 19.0 |
| 1.7486 | 28.0 | 168 | 7.1052 | 0.7123 | 19.0 |
| 2.775 | 29.0 | 174 | 7.0195 | 0.7393 | 19.0 |
| 1.8752 | 30.0 | 180 | 6.9133 | 0.2119 | 19.0 |
| 1.7576 | 31.0 | 186 | 7.2143 | 0.2641 | 19.0 |
| 2.2793 | 32.0 | 192 | 7.0029 | 1.1166 | 19.0 |
| 1.98 | 33.0 | 198 | 6.9954 | 0.5348 | 19.0 |
| 1.4242 | 34.0 | 204 | 7.5163 | 0.2088 | 19.0 |
| 2.413 | 35.0 | 210 | 7.0622 | 0.1433 | 19.0 |
| 1.2191 | 36.0 | 216 | 7.0088 | 0.5307 | 12.0 |
| 1.5944 | 37.0 | 222 | 7.7706 | 0.1948 | 19.0 |
| 1.0044 | 38.0 | 228 | 7.7163 | 0.8485 | 18.4286 |
| 1.4428 | 39.0 | 234 | 7.4919 | 0.6033 | 19.0 |
| 3.0175 | 40.0 | 240 | 7.4158 | 0.5109 | 19.0 |
| 1.3632 | 41.0 | 246 | 6.9819 | 0.4326 | 11.0 |
| 1.8384 | 42.0 | 252 | 7.0156 | 0.6215 | 18.2381 |
| 1.3237 | 43.0 | 258 | 7.2082 | 0.4826 | 19.0 |
| 1.1516 | 44.0 | 264 | 7.5088 | 0.4745 | 18.8571 |
| 1.3893 | 45.0 | 270 | 7.7298 | 0.4527 | 19.0 |
| 1.0125 | 46.0 | 276 | 7.8458 | 0.832 | 14.0 |
| 0.8954 | 47.0 | 282 | 7.9101 | 0.7754 | 17.5714 |
| 1.8111 | 48.0 | 288 | 7.9713 | 0.6899 | 19.0 |
| 1.2008 | 49.0 | 294 | 7.9821 | 0.6899 | 19.0 |
| 1.5131 | 50.0 | 300 | 7.9807 | 0.6899 | 19.0 |
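
The headline metrics come from the last row (epoch 50), which is not the best row by either validation loss or BLEU. The sketch below, using the Validation Loss and Bleu columns copied from the results above, shows one way to locate the best epoch for each metric. `best_epoch` is a hypothetical helper, and whether a checkpoint from those epochs was saved is not stated in the card.

```python
# Validation loss per epoch (epochs 1..50), copied from the training results.
VAL_LOSS = [
    8.3163, 6.5085, 7.0200, 7.3333, 7.5745, 6.3676, 7.0359, 6.8709, 6.4090, 6.7357,
    6.5953, 7.8617, 6.2553, 6.6297, 7.0312, 6.9121, 6.9446, 7.5770, 7.1444, 6.8181,
    6.6893, 7.3802, 6.7319, 7.1645, 7.7097, 6.9324, 6.7989, 7.1052, 7.0195, 6.9133,
    7.2143, 7.0029, 6.9954, 7.5163, 7.0622, 7.0088, 7.7706, 7.7163, 7.4919, 7.4158,
    6.9819, 7.0156, 7.2082, 7.5088, 7.7298, 7.8458, 7.9101, 7.9713, 7.9821, 7.9807,
]

# BLEU per epoch (epochs 1..50), copied from the training results.
BLEU = [
    0.0, 0.5004, 0.6899, 0.2191, 0.4671, 0.0, 0.6908, 0.1154, 0.0, 0.6381,
    0.2635, 0.2068, 0.1628, 0.612, 0.7377, 0.6562, 0.2759, 0.1529, 0.4497, 0.3779,
    0.6562, 0.3807, 0.5405, 1.2379, 0.3794, 0.2522, 0.1488, 0.7123, 0.7393, 0.2119,
    0.2641, 1.1166, 0.5348, 0.2088, 0.1433, 0.5307, 0.1948, 0.8485, 0.6033, 0.5109,
    0.4326, 0.6215, 0.4826, 0.4745, 0.4527, 0.832, 0.7754, 0.6899, 0.6899, 0.6899,
]


def best_epoch(values, higher_is_better):
    """Return (1-based epoch, value) of the best entry; ties go to the earliest epoch."""
    pick = max if higher_is_better else min
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


if __name__ == "__main__":
    print("best BLEU:", best_epoch(BLEU, higher_is_better=True))
    print("lowest validation loss:", best_epoch(VAL_LOSS, higher_is_better=False))
    print("final epoch:", (len(BLEU), BLEU[-1], VAL_LOSS[-1]))
```

BLEU stays below 1.3 in every epoch, so the differences between epochs may reflect noise rather than a meaningful quality gap.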

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.13.1+cu117
  • Datasets 2.17.0
  • Tokenizers 0.15.2

Model size

  • 1.23B params (Safetensors)
  • Tensor type: F32