T6

This model is a fine-tuned version of eslamxm/mt5-base-finetuned-arur. The training dataset is not specified (the card records it as None). The model achieves the following result on the evaluation set after the final epoch:

  • Loss: 0.5941

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 64
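
To make the linear schedule concrete, the sketch below computes the learning rate at a few optimizer steps. It is a minimal illustration, not the training script: the total of 2368 steps is taken from the last row of the training results, `warmup_steps=0` is an assumption (the card lists no warmup setting), and `linear_lr` is a hypothetical helper name.

```python
def linear_lr(step, base_lr=1e-4, total_steps=2368, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup
    followed by linear decay to zero.

    base_lr and total_steps come from the card (learning_rate: 0.0001,
    64 epochs x 37 steps). warmup_steps=0 is an assumption.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start of training, end of epoch 1, midpoint, and final step.
    for step in (0, 37, 1184, 2368):
        print(f"step {step:>4}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate falls to half its initial value by step 1184 (end of epoch 32) and reaches zero at the last step.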

Training results

Training Loss  Epoch  Step  Validation Loss
0.2591         1.0    37    0.2616
0.1639         2.0    74    0.2497
0.1771         3.0    111   0.2448
0.1465         4.0    148   0.2486
0.1294         5.0    185   0.2499
0.118          6.0    222   0.2520
0.1014         7.0    259   0.2582
0.0986         8.0    296   0.2631
0.1021         9.0    333   0.2775
0.0783         10.0   370   0.2867
0.0699         11.0   407   0.2906
0.062          12.0   444   0.3010
0.059          13.0   481   0.3144
0.0592         14.0   518   0.3265
0.0513         15.0   555   0.3365
0.0404         16.0   592   0.3550
0.0417         17.0   629   0.3552
0.0385         18.0   666   0.3682
0.0303         19.0   703   0.3728
0.0355         20.0   740   0.3947
0.0232         21.0   777   0.4208
0.024          22.0   814   0.4080
0.023          23.0   851   0.4265
0.0169         24.0   888   0.4233
0.0185         25.0   925   0.4450
0.0214         26.0   962   0.4528
0.0159         27.0   999   0.4486
0.0156         28.0   1036  0.4926
0.017          29.0   1073  0.4927
0.0137         30.0   1110  0.4886
0.0139         31.0   1147  0.5205
0.0108         32.0   1184  0.4953
0.0136         33.0   1221  0.4925
0.0129         34.0   1258  0.5081
0.0099         35.0   1295  0.5252
0.0116         36.0   1332  0.5241
0.0134         37.0   1369  0.5352
0.0111         38.0   1406  0.5469
0.0089         39.0   1443  0.5618
0.0103         40.0   1480  0.5781
0.0083         41.0   1517  0.5896
0.0091         42.0   1554  0.5287
0.0115         43.0   1591  0.5556
0.0069         44.0   1628  0.5497
0.0069         45.0   1665  0.5896
0.0089         46.0   1702  0.5799
0.0056         47.0   1739  0.5654
0.0072         48.0   1776  0.5683
0.0097         49.0   1813  0.5642
0.0065         50.0   1850  0.5623
0.0073         51.0   1887  0.5906
0.0078         52.0   1924  0.5932
0.0068         53.0   1961  0.5923
0.006          54.0   1998  0.5978
0.005          55.0   2035  0.5846
0.0082         56.0   2072  0.5886
0.0081         57.0   2109  0.5844
0.0056         58.0   2146  0.5878
0.0069         59.0   2183  0.5890
0.0075         60.0   2220  0.5946
0.0077         61.0   2257  0.5897
0.0064         62.0   2294  0.5908
0.0049         63.0   2331  0.5934
0.005          64.0   2368  0.5941
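
In the results above, validation loss reaches its minimum early and then rises while training loss keeps falling, which is the usual sign of overfitting. The sketch below locates the best epoch and shows where a patience-based early-stopping rule would have halted. It is an illustration only: the rows are copied from the first eight epochs and the final epoch of the table, `patience=3` is an assumed value, and the helper names are hypothetical. The card does not say that early stopping or best-checkpoint selection was used; the reported evaluation loss of 0.5941 matches the final epoch.

```python
# (epoch, training_loss, validation_loss), copied from the training results.
first_epochs = [
    (1, 0.2591, 0.2616),
    (2, 0.1639, 0.2497),
    (3, 0.1771, 0.2448),
    (4, 0.1465, 0.2486),
    (5, 0.1294, 0.2499),
    (6, 0.1180, 0.2520),
    (7, 0.1014, 0.2582),
    (8, 0.0986, 0.2631),
]
final_epoch = (64, 0.0050, 0.5941)


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def early_stop_epoch(rows, patience=3):
    """Return the epoch at which training would stop if it halts after
    `patience` consecutive epochs without a new best validation loss.
    Returns None if the rule never triggers. Rows must be consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, _train_loss, val_loss in rows:
        if val_loss < best:
            best = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


if __name__ == "__main__":
    epoch, _, val = best_epoch(first_epochs + [final_epoch])
    print(f"lowest validation loss {val} at epoch {epoch}")
    print(f"early stopping (patience=3) would halt at epoch "
          f"{early_stop_epoch(first_epochs)}")
    print(f"final / best validation loss: {final_epoch[2] / val:.2f}")
```

On these rows the lowest validation loss is at epoch 3, and the final-epoch validation loss is roughly 2.4 times that minimum, so an earlier checkpoint may generalize better than the final one if it was saved.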

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.1

Model size: 582M params (Safetensors, tensor type F32)