
cs_mT5_0.01_50_v0.2

This model is a fine-tuned version of google/mt5-base on an unknown dataset. It achieves the following results on the evaluation set at the end of training (epoch 50, step 300):

  • Loss: 6.5103
  • Bleu: 0.6102
  • Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
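
As a rough illustration of what the linear scheduler listed above implies, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (none is listed in this card) and 300 total steps (6 steps per epoch over 50 epochs, as reported in the training results). `linear_lr` is a hypothetical helper written for this example, not a Transformers function.

```python
# Hyperparameters as listed in this card.
LEARNING_RATE = 0.01
NUM_EPOCHS = 50
STEPS_PER_EPOCH = 6  # read off the Step column of the training results
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 300


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at a few epoch boundaries.
    for epoch in (0, 10, 25, 50):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.4f}")
```

Under these assumptions the rate starts at 0.01, is halved by epoch 25, and reaches zero at the final step.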

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| 7.711 | 1.0 | 6 | 12.1049 | 0.1934 | 17.0952 |
| 9.9285 | 2.0 | 12 | 7.2425 | 0.1897 | 19.0 |
| 6.1686 | 3.0 | 18 | 6.5311 | 0.3794 | 19.0 |
| 6.0527 | 4.0 | 24 | 6.4394 | 0.6102 | 19.0 |
| 6.0341 | 5.0 | 30 | 6.3821 | 0.6102 | 19.0 |
| 4.9353 | 6.0 | 36 | 6.2037 | 0.704 | 19.0 |
| 5.3025 | 7.0 | 42 | 6.2245 | 0.2318 | 19.0 |
| 5.4882 | 8.0 | 48 | 6.4244 | 0.0 | 19.0 |
| 4.3601 | 9.0 | 54 | 6.4695 | 0.0 | 19.0 |
| 4.8256 | 10.0 | 60 | 6.1939 | 0.6102 | 19.0 |
| 6.2603 | 11.0 | 66 | 7.2294 | 0.6102 | 19.0 |
| 5.5046 | 12.0 | 72 | 7.1054 | 0.704 | 19.0 |
| 5.6536 | 13.0 | 78 | 6.2424 | 0.6102 | 19.0 |
| 5.2092 | 14.0 | 84 | 6.2343 | 0.2016 | 19.0 |
| 4.5288 | 15.0 | 90 | 6.1996 | 0.6102 | 19.0 |
| 5.3447 | 16.0 | 96 | 6.4456 | 0.0 | 3.0 |
| 4.7282 | 17.0 | 102 | 6.0271 | 0.2172 | 19.0 |
| 5.3814 | 18.0 | 108 | 6.2591 | 0.6102 | 19.0 |
| 3.8156 | 19.0 | 114 | 6.2314 | 0.6102 | 19.0 |
| 4.9031 | 20.0 | 120 | 6.3173 | 0.1986 | 19.0 |
| 4.4266 | 21.0 | 126 | 6.5376 | 0.6102 | 19.0 |
| 4.1837 | 22.0 | 132 | 6.1329 | 0.6405 | 19.0 |
| 4.2994 | 23.0 | 138 | 6.1589 | 0.6102 | 19.0 |
| 4.3625 | 24.0 | 144 | 6.0873 | 0.6218 | 19.0 |
| 4.8956 | 25.0 | 150 | 6.2374 | 0.2318 | 19.0 |
| 4.3551 | 26.0 | 156 | 6.2230 | 0.6102 | 19.0 |
| 4.025 | 27.0 | 162 | 6.2925 | 0.6929 | 19.0 |
| 3.9071 | 28.0 | 168 | 6.2040 | 0.6102 | 19.0 |
| 4.1388 | 29.0 | 174 | 6.3024 | 0.6218 | 19.0 |
| 3.942 | 30.0 | 180 | 6.4055 | 0.2176 | 19.0 |
| 4.0977 | 31.0 | 186 | 6.3005 | 0.2874 | 19.0 |
| 4.6125 | 32.0 | 192 | 6.2886 | 0.6102 | 19.0 |
| 4.4031 | 33.0 | 198 | 6.2948 | 0.704 | 19.0 |
| 3.8115 | 34.0 | 204 | 6.2838 | 0.6102 | 19.0 |
| 4.8767 | 35.0 | 210 | 6.2233 | 0.6102 | 19.0 |
| 3.6557 | 36.0 | 216 | 6.2860 | 0.6102 | 19.0 |
| 3.9009 | 37.0 | 222 | 6.3841 | 0.6102 | 19.0 |
| 4.4334 | 38.0 | 228 | 6.3916 | 0.6102 | 19.0 |
| 3.3359 | 39.0 | 234 | 6.3696 | 0.6102 | 19.0 |
| 4.2239 | 40.0 | 240 | 6.4627 | 0.6102 | 19.0 |
| 3.7335 | 41.0 | 246 | 6.5189 | 0.6102 | 19.0 |
| 3.544 | 42.0 | 252 | 6.4958 | 0.6102 | 19.0 |
| 3.77 | 43.0 | 258 | 6.5922 | 0.6102 | 19.0 |
| 3.4661 | 44.0 | 264 | 6.6700 | 0.6218 | 19.0 |
| 4.0715 | 45.0 | 270 | 6.6247 | 0.6102 | 19.0 |
| 3.8948 | 46.0 | 276 | 6.5279 | 0.6102 | 19.0 |
| 3.8278 | 47.0 | 282 | 6.4594 | 0.6102 | 19.0 |
| 4.1446 | 48.0 | 288 | 6.4570 | 0.6102 | 19.0 |
| 3.7627 | 49.0 | 294 | 6.4923 | 0.6102 | 19.0 |
| 3.5996 | 50.0 | 300 | 6.5103 | 0.6102 | 19.0 |
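
The Step column advances by 6 per epoch with a train batch size of 16, which bounds the size of the training set. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the last partial batch is kept, none of which this card states. `train_size_bounds` is a hypothetical helper written for this example.

```python
from typing import Tuple


def train_size_bounds(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Return the smallest and largest n with ceil(n / batch_size) == steps_per_epoch."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    # 6 steps per epoch and batch size 16 are taken from this card.
    low, high = train_size_bounds(steps_per_epoch=6, batch_size=16)
    print(f"training set size is between {low} and {high} examples")
```

If those assumptions hold, the training set contains between 81 and 96 examples, which would be consistent with the noisy validation loss and BLEU values reported in the training results.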

Framework versions

  • Transformers 4.35.2
  • Pytorch 1.13.1+cu117
  • Datasets 2.17.0
  • Tokenizers 0.15.1

Model size: 582M params (Safetensors, tensor type F32)