marianmt-finetuned-netspeak-tgl-to-eng
This model is a fine-tuned version of Helsinki-NLP/opus-mt-tl-en on an unknown dataset. It achieves the following results at the final training epoch:
- Train Loss: 0.7277
- Validation Loss: 2.0459
- Train BLEU: 33.5501
- Train Gen Len: 8.7228
- Epoch: 93
Model description
A MarianMT checkpoint (Helsinki-NLP/opus-mt-tl-en) fine-tuned for Tagalog-to-English translation; per the model name, it targets informal internet language ("netspeak"). Further details are not yet documented.
Intended uses & limitations
More information needed
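Pending a fuller write-up, the checkpoint should be usable like any MarianMT model for Tagalog-to-English translation. The sketch below is illustrative only: the repo id is a placeholder for wherever these weights are hosted, and the input sentence is an arbitrary example of Tagalog netspeak.

```python
# Illustrative inference sketch for this TensorFlow MarianMT checkpoint.
# The repo id below is a placeholder -- substitute the actual Hub path.
from transformers import MarianTokenizer, TFMarianMTModel

model_id = "marianmt-finetuned-netspeak-tgl-to-eng"  # placeholder

tokenizer = MarianTokenizer.from_pretrained(model_id)
model = TFMarianMTModel.from_pretrained(model_id)

# "musta na u?" is netspeak for "kumusta ka na?" ("how are you?")
batch = tokenizer(["musta na u?"], return_tensors="tf", padding=True)
generated = model.generate(**batch, max_length=64)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```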
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: AdamWeightDecay
  - learning_rate: 2e-06
  - decay: 0.0
  - beta_1: 0.9
  - beta_2: 0.999
  - epsilon: 1e-07
  - amsgrad: False
  - weight_decay_rate: 0.01
- training_precision: float32
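For reference, this configuration can be recreated with the TensorFlow optimizer utilities that ship with Transformers. A minimal sketch, assuming the training loop used transformers' AdamWeightDecay class (the exact training script is not documented in this card):

```python
# Minimal sketch: recreating the optimizer configuration listed above.
# Assumption: training used transformers' TensorFlow AdamWeightDecay;
# the actual training script is not documented here.
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=2e-6,      # constant learning rate (decay: 0.0)
    weight_decay_rate=0.01,  # decoupled weight decay
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-07,
)
# model.compile(optimizer=optimizer) would then attach it to the Keras model.
```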
Training results
Train Loss | Validation Loss | Train BLEU | Train Gen Len | Epoch |
---|---|---|---|---|
5.4267 | 4.5907 | 3.3310 | 11.7129 | 0 |
4.6862 | 4.1720 | 3.5594 | 10.4752 | 1 |
4.4077 | 3.9852 | 4.0100 | 9.2079 | 2 |
4.2296 | 3.8554 | 3.3190 | 9.3663 | 3 |
4.0964 | 3.7598 | 4.8776 | 9.4554 | 4 |
3.9799 | 3.6710 | 4.9744 | 9.6931 | 5 |
3.8799 | 3.5953 | 5.9838 | 9.4752 | 6 |
3.7661 | 3.5248 | 6.4073 | 9.3366 | 7 |
3.6807 | 3.4588 | 6.2692 | 9.1485 | 8 |
3.5932 | 3.3964 | 6.0781 | 9.0990 | 9 |
3.5110 | 3.3384 | 6.6363 | 9.0891 | 10 |
3.4294 | 3.2892 | 7.0472 | 9.2079 | 11 |
3.3566 | 3.2363 | 7.2707 | 9.1782 | 12 |
3.2796 | 3.1878 | 7.9426 | 9.1683 | 13 |
3.2026 | 3.1376 | 7.9254 | 9.2772 | 14 |
3.1472 | 3.0926 | 8.2076 | 9.1188 | 15 |
3.0634 | 3.0496 | 8.5193 | 9.2475 | 16 |
3.0124 | 3.0082 | 8.9990 | 9.1485 | 17 |
2.9554 | 2.9696 | 11.2816 | 9.1485 | 18 |
2.8885 | 2.9352 | 12.0866 | 9.0396 | 19 |
2.8403 | 2.8974 | 12.8611 | 9.1485 | 20 |
2.7636 | 2.8661 | 13.0981 | 9.1485 | 21 |
2.7229 | 2.8269 | 12.9295 | 8.9010 | 22 |
2.6714 | 2.7951 | 14.0159 | 8.8713 | 23 |
2.6179 | 2.7644 | 13.7369 | 8.7624 | 24 |
2.5520 | 2.7348 | 14.0979 | 8.8119 | 25 |
2.5199 | 2.7059 | 14.5253 | 8.7426 | 26 |
2.4652 | 2.6832 | 13.8452 | 8.7030 | 27 |
2.4081 | 2.6537 | 15.6475 | 8.9505 | 28 |
2.3708 | 2.6302 | 16.1325 | 8.8713 | 29 |
2.3195 | 2.6124 | 16.0044 | 8.7426 | 30 |
2.2938 | 2.5892 | 16.8560 | 8.8020 | 31 |
2.2202 | 2.5700 | 16.8995 | 8.8911 | 32 |
2.1808 | 2.5456 | 17.5342 | 8.8416 | 33 |
2.1373 | 2.5262 | 18.4092 | 8.6337 | 34 |
2.1096 | 2.5082 | 18.1906 | 8.6436 | 35 |
2.0610 | 2.4896 | 18.3189 | 8.7525 | 36 |
2.0275 | 2.4725 | 18.4318 | 8.6436 | 37 |
1.9913 | 2.4534 | 18.1136 | 8.6832 | 38 |
1.9544 | 2.4403 | 19.2999 | 8.6040 | 39 |
1.9144 | 2.4220 | 19.1325 | 8.6535 | 40 |
1.8781 | 2.4075 | 19.4122 | 8.6337 | 41 |
1.8610 | 2.3928 | 21.0270 | 8.6832 | 42 |
1.8176 | 2.3779 | 20.9122 | 8.7921 | 43 |
1.7839 | 2.3618 | 20.3906 | 8.7624 | 44 |
1.7553 | 2.3466 | 20.9078 | 8.7327 | 45 |
1.7045 | 2.3368 | 20.7228 | 8.7030 | 46 |
1.6974 | 2.3221 | 20.7889 | 8.7426 | 47 |
1.6561 | 2.3109 | 20.8293 | 8.7129 | 48 |
1.6264 | 2.2991 | 20.3201 | 8.5644 | 49 |
1.5976 | 2.2906 | 22.7905 | 8.6139 | 50 |
1.5725 | 2.2820 | 23.9301 | 8.7228 | 51 |
1.5528 | 2.2702 | 23.5437 | 8.6733 | 52 |
1.5158 | 2.2612 | 22.9832 | 8.6040 | 53 |
1.4883 | 2.2509 | 24.6290 | 8.6733 | 54 |
1.4497 | 2.2434 | 25.6293 | 8.6139 | 55 |
1.4357 | 2.2336 | 25.4158 | 8.6634 | 56 |
1.4105 | 2.2290 | 25.2337 | 8.5644 | 57 |
1.3803 | 2.2194 | 26.2588 | 8.5941 | 58 |
1.3606 | 2.2118 | 25.8251 | 8.6139 | 59 |
1.3389 | 2.2073 | 26.2269 | 8.5842 | 60 |
1.3064 | 2.1966 | 26.2973 | 8.6040 | 61 |
1.2747 | 2.1893 | 27.3831 | 8.5743 | 62 |
1.2586 | 2.1811 | 28.4823 | 8.6733 | 63 |
1.2445 | 2.1740 | 27.5688 | 8.6139 | 64 |
1.2201 | 2.1576 | 29.3111 | 8.5347 | 65 |
1.1924 | 2.1487 | 28.3428 | 8.6040 | 66 |
1.1657 | 2.1464 | 28.8596 | 8.5941 | 67 |
1.1435 | 2.1469 | 28.7870 | 8.5743 | 68 |
1.1274 | 2.1382 | 29.5455 | 8.6436 | 69 |
1.1080 | 2.1297 | 29.4602 | 8.6139 | 70 |
1.0907 | 2.1257 | 28.2800 | 8.7525 | 71 |
1.0881 | 2.1207 | 29.2731 | 8.6337 | 72 |
1.0534 | 2.1179 | 29.9292 | 8.7624 | 73 |
1.0389 | 2.1096 | 29.9660 | 8.5347 | 74 |
1.0186 | 2.1052 | 29.7106 | 8.5446 | 75 |
0.9953 | 2.0959 | 30.0563 | 8.5050 | 76 |
0.9727 | 2.0977 | 30.0527 | 8.5446 | 77 |
0.9543 | 2.0878 | 29.8762 | 8.5446 | 78 |
0.9372 | 2.0871 | 30.4451 | 8.4950 | 79 |
0.9234 | 2.0804 | 30.7829 | 8.5347 | 80 |
0.9045 | 2.0774 | 31.2911 | 8.6337 | 81 |
0.8920 | 2.0727 | 31.4189 | 8.4752 | 82 |
0.8729 | 2.0761 | 30.5640 | 8.7624 | 83 |
0.8466 | 2.0735 | 31.4347 | 8.7525 | 84 |
0.8430 | 2.0677 | 31.1463 | 8.6139 | 85 |
0.8340 | 2.0669 | 31.5623 | 8.7228 | 86 |
0.8152 | 2.0587 | 31.9364 | 8.6535 | 87 |
0.7916 | 2.0548 | 31.6855 | 8.6238 | 88 |
0.7829 | 2.0562 | 33.4523 | 8.7426 | 89 |
0.7678 | 2.0559 | 32.0304 | 8.7129 | 90 |
0.7509 | 2.0540 | 32.7711 | 8.7525 | 91 |
0.7406 | 2.0498 | 33.6200 | 8.7030 | 92 |
0.7277 | 2.0459 | 33.5501 | 8.7228 | 93 |
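Training BLEU climbs steadily from about 3.3 to 33.6 over 94 epochs while validation loss continues to fall, so training does not appear to have saturated at the low learning rate used. The BLEU column can be reproduced against held-out pairs with a corpus-level BLEU tool; a minimal sketch using sacrebleu (an assumption, since the metric pipeline used during training is not documented):

```python
# Minimal sketch: scoring model output with corpus-level BLEU via sacrebleu.
# Assumption: the table's scores are sacrebleu-style corpus BLEU; the
# actual metric setup used during training is not documented here.
import sacrebleu

hypotheses = ["how are you?"]          # model translations
references = [["how are you doing?"]]  # one list per reference set
score = sacrebleu.corpus_bleu(hypotheses, references).score
print(f"BLEU: {score:.2f}")
```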
Framework versions
- Transformers 4.25.1
- TensorFlow 2.9.2
- Datasets 2.8.0
- Tokenizers 0.13.2