En-Nso_update / README.md
Kabelo Malapane
metadata
license: apache-2.0
tags:
  - translation
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: En-Nso_update
    results: []

En-Nso_update

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-nso. The training dataset was not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 2.8782
  • Bleu: 31.2967
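A minimal usage sketch follows, assuming the Transformers library with a PyTorch backend. The hub id of this fine-tuned checkpoint is not stated in the card, so the sketch defaults to the base model named above; `BASE_MODEL_ID`, `translate`, and the `model_id` parameter are illustrative names, not part of the original card.

```python
# Sketch only: loads a MarianMT checkpoint and translates English sentences.
# BASE_MODEL_ID is the base model named in this card; the fine-tuned
# checkpoint's hub id is not given here, so pass it explicitly if known.
BASE_MODEL_ID = "Helsinki-NLP/opus-mt-en-nso"


def translate(sentences, model_id=BASE_MODEL_ID):
    """Translate a list of English sentences with a MarianMT checkpoint."""
    # Imported lazily so this module can be loaded without the dependency.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    # Pad to the longest sentence in the batch and return PyTorch tensors.
    batch = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["Good morning."])` downloads the checkpoint on first use and returns a list with one translated string.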

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
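To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and 400 total steps, which is the final step count in the training results; `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step, base_lr=2e-5, total_steps=400, warmup_steps=0):
    """Learning rate under linear decay to zero, with optional linear warmup.

    warmup_steps=0 is an assumption: the card does not mention warmup.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 100, 200, 300, 400):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 2e-05, is halved by step 200 (epoch 50), and reaches zero at step 400.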

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| No log | 1.0 | 4 | 7.2950 | 0.0088 |
| No log | 2.0 | 8 | 5.9614 | 0.6848 |
| No log | 3.0 | 12 | 5.0695 | 4.9050 |
| No log | 4.0 | 16 | 4.5523 | 9.1757 |
| No log | 5.0 | 20 | 4.2355 | 10.4744 |
| No log | 6.0 | 24 | 4.0106 | 14.6163 |
| No log | 7.0 | 28 | 3.8427 | 15.8379 |
| No log | 8.0 | 32 | 3.7264 | 15.6158 |
| No log | 9.0 | 36 | 3.6338 | 16.3562 |
| No log | 10.0 | 40 | 3.5555 | 21.1011 |
| No log | 11.0 | 44 | 3.4839 | 21.5754 |
| No log | 12.0 | 48 | 3.4180 | 22.7155 |
| No log | 13.0 | 52 | 3.3620 | 23.1592 |
| No log | 14.0 | 56 | 3.3115 | 24.3886 |
| No log | 15.0 | 60 | 3.2676 | 24.1278 |
| No log | 16.0 | 64 | 3.2285 | 24.2245 |
| No log | 17.0 | 68 | 3.1974 | 23.9716 |
| No log | 18.0 | 72 | 3.1695 | 24.2395 |
| No log | 19.0 | 76 | 3.1441 | 23.3442 |
| No log | 20.0 | 80 | 3.1235 | 21.3332 |
| No log | 21.0 | 84 | 3.1029 | 21.8410 |
| No log | 22.0 | 88 | 3.0849 | 22.4065 |
| No log | 23.0 | 92 | 3.0666 | 22.3016 |
| No log | 24.0 | 96 | 3.0534 | 22.9616 |
| No log | 25.0 | 100 | 3.0423 | 23.3971 |
| No log | 26.0 | 104 | 3.0306 | 23.5443 |
| No log | 27.0 | 108 | 3.0183 | 23.3348 |
| No log | 28.0 | 112 | 3.0051 | 23.4077 |
| No log | 29.0 | 116 | 2.9947 | 24.1791 |
| No log | 30.0 | 120 | 2.9855 | 24.1265 |
| No log | 31.0 | 124 | 2.9777 | 23.9860 |
| No log | 32.0 | 128 | 2.9691 | 24.7301 |
| No log | 33.0 | 132 | 2.9597 | 25.1896 |
| No log | 34.0 | 136 | 2.9521 | 24.5893 |
| No log | 35.0 | 140 | 2.9457 | 24.5229 |
| No log | 36.0 | 144 | 2.9409 | 24.6232 |
| No log | 37.0 | 148 | 2.9354 | 24.2830 |
| No log | 38.0 | 152 | 2.9322 | 26.1404 |
| No log | 39.0 | 156 | 2.9306 | 25.9425 |
| No log | 40.0 | 160 | 2.9288 | 30.5432 |
| No log | 41.0 | 164 | 2.9261 | 29.4635 |
| No log | 42.0 | 168 | 2.9215 | 28.4787 |
| No log | 43.0 | 172 | 2.9182 | 28.9082 |
| No log | 44.0 | 176 | 2.9151 | 29.3171 |
| No log | 45.0 | 180 | 2.9132 | 28.3602 |
| No log | 46.0 | 184 | 2.9126 | 28.9583 |
| No log | 47.0 | 188 | 2.9104 | 26.0269 |
| No log | 48.0 | 192 | 2.9086 | 29.6904 |
| No log | 49.0 | 196 | 2.9052 | 29.2881 |
| No log | 50.0 | 200 | 2.9020 | 29.6063 |
| No log | 51.0 | 204 | 2.8994 | 29.5224 |
| No log | 52.0 | 208 | 2.8960 | 29.3913 |
| No log | 53.0 | 212 | 2.8930 | 30.5451 |
| No log | 54.0 | 216 | 2.8889 | 32.1862 |
| No log | 55.0 | 220 | 2.8869 | 31.9423 |
| No log | 56.0 | 224 | 2.8859 | 30.7244 |
| No log | 57.0 | 228 | 2.8846 | 30.8172 |
| No log | 58.0 | 232 | 2.8837 | 30.5376 |
| No log | 59.0 | 236 | 2.8826 | 31.1454 |
| No log | 60.0 | 240 | 2.8813 | 30.9049 |
| No log | 61.0 | 244 | 2.8802 | 30.6363 |
| No log | 62.0 | 248 | 2.8802 | 31.3739 |
| No log | 63.0 | 252 | 2.8799 | 30.9776 |
| No log | 64.0 | 256 | 2.8793 | 29.8283 |
| No log | 65.0 | 260 | 2.8795 | 29.6912 |
| No log | 66.0 | 264 | 2.8804 | 29.7654 |
| No log | 67.0 | 268 | 2.8810 | 29.1586 |
| No log | 68.0 | 272 | 2.8822 | 28.8888 |
| No log | 69.0 | 276 | 2.8819 | 29.7222 |
| No log | 70.0 | 280 | 2.8810 | 29.9932 |
| No log | 71.0 | 284 | 2.8811 | 30.2492 |
| No log | 72.0 | 288 | 2.8802 | 29.9644 |
| No log | 73.0 | 292 | 2.8791 | 30.3378 |
| No log | 74.0 | 296 | 2.8790 | 29.8055 |
| No log | 75.0 | 300 | 2.8794 | 29.0100 |
| No log | 76.0 | 304 | 2.8795 | 30.7968 |
| No log | 77.0 | 308 | 2.8790 | 31.5414 |
| No log | 78.0 | 312 | 2.8783 | 31.5060 |
| No log | 79.0 | 316 | 2.8775 | 31.4376 |
| No log | 80.0 | 320 | 2.8766 | 31.6005 |
| No log | 81.0 | 324 | 2.8767 | 31.3697 |
| No log | 82.0 | 328 | 2.8769 | 31.6108 |
| No log | 83.0 | 332 | 2.8770 | 31.4214 |
| No log | 84.0 | 336 | 2.8772 | 31.6039 |
| No log | 85.0 | 340 | 2.8776 | 32.0254 |
| No log | 86.0 | 344 | 2.8779 | 31.4024 |
| No log | 87.0 | 348 | 2.8783 | 32.0279 |
| No log | 88.0 | 352 | 2.8786 | 31.8914 |
| No log | 89.0 | 356 | 2.8788 | 31.6500 |
| No log | 90.0 | 360 | 2.8791 | 31.7698 |
| No log | 91.0 | 364 | 2.8793 | 31.6137 |
| No log | 92.0 | 368 | 2.8793 | 31.8244 |
| No log | 93.0 | 372 | 2.8790 | 31.5626 |
| No log | 94.0 | 376 | 2.8786 | 31.3743 |
| No log | 95.0 | 380 | 2.8785 | 31.4160 |
| No log | 96.0 | 384 | 2.8784 | 31.6682 |
| No log | 97.0 | 388 | 2.8782 | 31.8335 |
| No log | 98.0 | 392 | 2.8782 | 31.7143 |
| No log | 99.0 | 396 | 2.8782 | 31.7143 |
| No log | 100.0 | 400 | 2.8782 | 31.7143 |
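
The training results show that the epoch with the highest BLEU and the epoch with the lowest validation loss are not the same, which matters when choosing a checkpoint. The sketch below illustrates the selection on a handful of rows copied from the results above; the variable names are illustrative, and nothing in the card says which criterion, if any, was used to pick the published weights.

```python
# (epoch, validation_loss, bleu) rows copied from the training results.
rows = [
    (40.0, 2.9288, 30.5432),
    (54.0, 2.8889, 32.1862),
    (80.0, 2.8766, 31.6005),
    (87.0, 2.8783, 32.0279),
    (100.0, 2.8782, 31.7143),
]

# Higher BLEU is better; lower validation loss is better.
best_by_bleu = max(rows, key=lambda row: row[2])
best_by_loss = min(rows, key=lambda row: row[1])

print("best BLEU at epoch", best_by_bleu[0])      # epoch 54.0
print("lowest val loss at epoch", best_by_loss[0])  # epoch 80.0
```

Over the full set of results the same two epochs come out on top: BLEU peaks at epoch 54 (32.1862) and validation loss bottoms out at epoch 80 (2.8766).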

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.12.0+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1
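
When trying to reproduce the run, it can help to compare the installed packages against the versions listed above. The following is a small sketch using only the standard library; `EXPECTED` and `version_mismatches` are hypothetical names, and the mapping of "Pytorch" to the `torch` distribution is the usual PyPI name rather than something stated in the card.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by PyPI distribution name.
EXPECTED = {
    "transformers": "4.20.1",
    "torch": "1.12.0+cu113",
    "datasets": "2.3.2",
    "tokenizers": "0.12.1",
}


def version_mismatches(expected=None):
    """Return {package: (wanted, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    expected = EXPECTED if expected is None else expected
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in version_mismatches().items():
        print(f"{package}: card lists {wanted}, found {installed}")
```

An empty result means the environment matches the card exactly; newer versions will often work as well, but that is not something the card confirms.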