metadata
license: apache-2.0
base_model: Helsinki-NLP/opus-mt-id-en
tags:
- generated_from_keras_callback
model-index:
- name: aditnnda/machine_translation_informal2formal
  results: []
aditnnda/machine_translation_informal2formal
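This model is a fine-tuned version of Helsinki-NLP/opus-mt-id-en on the STIF Indonesia dataset. The validation loss was lowest at epoch 14 (1.0016) and rose afterwards while the training loss kept falling, which suggests overfitting in the later epochs. At the final training epoch it reports the following results: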
- Train Loss: 0.0077
- Validation Loss: 1.2870
- Epoch: 99
Model description
More information needed
Intended uses & limitations
More information needed
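As a rough sketch of how the checkpoint might be loaded for inference, assuming the repository hosts TensorFlow weights (the card reports a Keras callback and TensorFlow 2.14.0) and that the task is informal-to-formal rewriting, as the model name suggests. The helper name `formalize` and the `max_new_tokens` default are hypothetical and not part of the original card.

```python
def formalize(text,
              model_id="aditnnda/machine_translation_informal2formal",
              max_new_tokens=64):
    """Rewrite `text` with the fine-tuned checkpoint.

    Assumption: the Hub repository provides TensorFlow weights and a
    tokenizer. Weights are downloaded on the first call.
    """
    # Imported lazily so that defining the function does not require
    # TensorFlow or Transformers to be installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize a batch of one sentence and return TensorFlow tensors.
    inputs = tokenizer([text], return_tensors="tf", padding=True)

    # Default decoding settings; max_new_tokens is an illustrative cap.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `formalize(...)` with an informal Indonesian sentence should return the model's rewritten version. Output quality is not documented in this card, so results should be checked by hand.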
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 6000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
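To make the learning-rate schedule in the optimizer config easier to read, here is a minimal pure-Python sketch of a non-cycling polynomial decay with the listed values (`initial_learning_rate` 5e-05, `decay_steps` 6000, `end_learning_rate` 0.0, `power` 1.0). The function name `polynomial_decay` is illustrative. With `power` 1.0 the schedule is a straight line from 5e-05 down to 0.

```python
def polynomial_decay(step, initial_lr=5e-05, decay_steps=6000,
                     end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay."""
    # cycle=False: once decay_steps is reached, the rate stays at end_lr.
    step = min(step, decay_steps)
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


# Print the rate at the start, midpoint, end, and past the end of the decay.
for step in (0, 3000, 6000, 9000):
    print(step, polynomial_decay(step))
```

If the 6000 decay steps were chosen to span all 100 epochs, that would correspond to 60 optimizer steps per epoch. This is an inference from the numbers, not something the card states.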
Training results
Train Loss | Validation Loss | Epoch |
---|---|---|
3.4298 | 2.4070 | 0 |
2.1508 | 1.8031 | 1 |
1.6301 | 1.5249 | 2 |
1.3013 | 1.3417 | 3 |
1.0752 | 1.2465 | 4 |
0.9119 | 1.1651 | 5 |
0.7778 | 1.1213 | 6 |
0.6763 | 1.0813 | 7 |
0.5907 | 1.0542 | 8 |
0.5162 | 1.0289 | 9 |
0.4573 | 1.0265 | 10 |
0.4057 | 1.0115 | 11 |
0.3645 | 1.0096 | 12 |
0.3227 | 1.0037 | 13 |
0.2864 | 1.0016 | 14 |
0.2598 | 1.0121 | 15 |
0.2291 | 1.0079 | 16 |
0.2069 | 1.0199 | 17 |
0.1876 | 1.0247 | 18 |
0.1717 | 1.0199 | 19 |
0.1544 | 1.0283 | 20 |
0.1393 | 1.0416 | 21 |
0.1285 | 1.0370 | 22 |
0.1171 | 1.0430 | 23 |
0.1069 | 1.0593 | 24 |
0.0990 | 1.0670 | 25 |
0.0915 | 1.0655 | 26 |
0.0827 | 1.0818 | 27 |
0.0781 | 1.0903 | 28 |
0.0729 | 1.0998 | 29 |
0.0678 | 1.0932 | 30 |
0.0639 | 1.1051 | 31 |
0.0592 | 1.1125 | 32 |
0.0556 | 1.1240 | 33 |
0.0509 | 1.1177 | 34 |
0.0512 | 1.1355 | 35 |
0.0438 | 1.1405 | 36 |
0.0453 | 1.1322 | 37 |
0.0443 | 1.1419 | 38 |
0.0407 | 1.1419 | 39 |
0.0397 | 1.1495 | 40 |
0.0386 | 1.1609 | 41 |
0.0346 | 1.1619 | 42 |
0.0351 | 1.1638 | 43 |
0.0344 | 1.1711 | 44 |
0.0302 | 1.1782 | 45 |
0.0470 | 1.1836 | 46 |
0.0330 | 1.1913 | 47 |
0.0284 | 1.1963 | 48 |
0.0268 | 1.1964 | 49 |
0.0255 | 1.2017 | 50 |
0.0236 | 1.2092 | 51 |
0.0241 | 1.2104 | 52 |
0.0234 | 1.2170 | 53 |
0.0216 | 1.2192 | 54 |
0.0209 | 1.2317 | 55 |
0.0205 | 1.2289 | 56 |
0.0193 | 1.2363 | 57 |
0.0191 | 1.2295 | 58 |
0.0184 | 1.2306 | 59 |
0.0185 | 1.2352 | 60 |
0.0184 | 1.2415 | 61 |
0.0174 | 1.2389 | 62 |
0.0166 | 1.2392 | 63 |
0.0167 | 1.2469 | 64 |
0.0166 | 1.2457 | 65 |
0.0147 | 1.2456 | 66 |
0.0146 | 1.2511 | 67 |
0.0147 | 1.2552 | 68 |
0.0147 | 1.2493 | 69 |
0.0133 | 1.2532 | 70 |
0.0135 | 1.2561 | 71 |
0.0136 | 1.2609 | 72 |
0.0130 | 1.2602 | 73 |
0.0119 | 1.2629 | 74 |
0.0123 | 1.2667 | 75 |
0.0114 | 1.2675 | 76 |
0.0122 | 1.2673 | 77 |
0.0111 | 1.2649 | 78 |
0.0099 | 1.2722 | 79 |
0.0109 | 1.2693 | 80 |
0.0101 | 1.2727 | 81 |
0.0101 | 1.2746 | 82 |
0.0096 | 1.2739 | 83 |
0.0103 | 1.2734 | 84 |
0.0096 | 1.2805 | 85 |
0.0093 | 1.2799 | 86 |
0.0097 | 1.2823 | 87 |
0.0093 | 1.2826 | 88 |
0.0095 | 1.2808 | 89 |
0.0091 | 1.2875 | 90 |
0.0081 | 1.2849 | 91 |
0.0084 | 1.2849 | 92 |
0.0083 | 1.2838 | 93 |
0.0089 | 1.2866 | 94 |
0.0084 | 1.2851 | 95 |
0.0082 | 1.2870 | 96 |
0.0078 | 1.2871 | 97 |
0.0078 | 1.2872 | 98 |
0.0077 | 1.2870 | 99 |
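The table can be scanned programmatically to find the epoch with the lowest validation loss. The sketch below parses a handful of rows copied from the table above, not the full log. The names `rows_text` and `parse_rows` are illustrative.

```python
# A few rows copied from the training results table:
# train loss | validation loss | epoch
rows_text = """
3.4298 | 2.4070 | 0 |
0.3227 | 1.0037 | 13 |
0.2864 | 1.0016 | 14 |
0.2598 | 1.0121 | 15 |
0.0077 | 1.2870 | 99 |
"""


def parse_rows(text):
    """Turn pipe-separated rows into (train_loss, val_loss, epoch) tuples."""
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        train_loss, val_loss, epoch = cells
        rows.append((float(train_loss), float(val_loss), int(epoch)))
    return rows


rows = parse_rows(rows_text)
best = min(rows, key=lambda row: row[1])  # lowest validation loss
final = rows[-1]                          # last logged epoch

print(f"best epoch: {best[2]} (validation loss {best[1]:.4f})")
print(f"final epoch: {final[2]} (validation loss {final[1]:.4f})")
```

Pasting the full table into `rows_text` gives the same best epoch, since no later row has a validation loss below 1.0016. A checkpoint saved near that epoch, if one exists, may generalize better than the final one.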
Framework versions
- Transformers 4.35.2
- TensorFlow 2.14.0
- Datasets 2.15.0
- Tokenizers 0.15.0