metadata
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_keras_callback
model-index:
  - name: pijarcandra22/NMTIndoBaliT5
    results: []

pijarcandra22/NMTIndoBaliT5

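This model is a fine-tuned version of t5-small on an unknown dataset. The figures below are from the final training epoch. Validation loss was lowest at epoch 45 (1.3673) and rose afterwards while training loss kept falling, so the final checkpoint is likely overfit. Final-epoch results: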

  • Train Loss: 0.2487
  • Validation Loss: 1.9551
  • Epoch: 154
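
The card gives no usage instructions, so the following is only a minimal sketch of how a TensorFlow T5 checkpoint like this one might be loaded with Transformers. It assumes the repository ships tokenizer files alongside the weights and that the model expects the raw sentence with no task prefix; neither is confirmed by this card. The function name `translate` and its `max_new_tokens` default are illustrative.

```python
MODEL_ID = "pijarcandra22/NMTIndoBaliT5"  # repository name from this card


def translate(text, max_new_tokens=64):
    """Generate output text for one input string (sketch, not verified against the checkpoint)."""
    # Imported lazily so this file can be inspected without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    # Assumption: tokenizer files are stored in the same repository as the weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: the input is the plain sentence, with no task prefix.
    inputs = tokenizer(text, return_tensors="tf")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("Selamat pagi")` would download the checkpoint on first use. The model name suggests Indonesian-Balinese translation, but the card does not state the direction or the expected input format.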

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
  • training_precision: float32
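
As a rough illustration, the optimizer settings listed above could be passed to the `AdamWeightDecay` class from Transformers as follows. This is a sketch rather than the original training script. The `'decay': 0.0` entry is left out on the assumption that it is the legacy Keras learning-rate decay and is disabled at zero, and `build_optimizer` is a hypothetical helper name.

```python
# Values copied from the hyperparameter list above.
OPTIMIZER_CONFIG = {
    "learning_rate": 1e-4,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-7,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Create the optimizer from the listed settings (hypothetical helper)."""
    # Imported lazily because AdamWeightDecay requires TensorFlow.
    from transformers import AdamWeightDecay

    # 'decay': 0.0 from the card is omitted (assumed to be a disabled legacy option).
    return AdamWeightDecay(**config)
```

The returned object can be handed to `model.compile(optimizer=...)` on a Keras model. Training ran in float32, so no mixed-precision policy is needed.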

Training results

Train Loss Validation Loss Epoch
3.2881 2.6852 0
2.7514 2.4004 1
2.5012 2.2171 2
2.3252 2.0959 3
2.1930 1.9901 4
2.0837 1.9130 5
1.9912 1.8452 6
1.9107 1.7974 7
1.8459 1.7521 8
1.7902 1.7165 9
1.7321 1.6842 10
1.6811 1.6400 11
1.6374 1.6230 12
1.5973 1.5960 13
1.5588 1.5765 14
1.5244 1.5589 15
1.4933 1.5370 16
1.4588 1.5300 17
1.4325 1.5107 18
1.4054 1.4970 19
1.3730 1.4839 20
1.3475 1.4789 21
1.3231 1.4616 22
1.3035 1.4568 23
1.2768 1.4489 24
1.2587 1.4396 25
1.2380 1.4364 26
1.2208 1.4273 27
1.2026 1.4228 28
1.1755 1.4141 29
1.1614 1.4062 30
1.1460 1.4060 31
1.1289 1.3934 32
1.1134 1.4007 33
1.0965 1.3927 34
1.0818 1.3874 35
1.0661 1.3921 36
1.0482 1.3795 37
1.0345 1.3853 38
1.0195 1.3835 39
1.0074 1.3772 40
0.9890 1.3851 41
0.9833 1.3724 42
0.9667 1.3740 43
0.9561 1.3752 44
0.9429 1.3673 45
0.9301 1.3828 46
0.9141 1.3806 47
0.9050 1.3772 48
0.8952 1.3812 49
0.8809 1.3718 50
0.8725 1.3825 51
0.8601 1.3842 52
0.8488 1.3827 53
0.8375 1.3920 54
0.8257 1.3936 55
0.8184 1.3842 56
0.8081 1.3846 57
0.7986 1.3860 58
0.7883 1.3943 59
0.7787 1.4004 60
0.7666 1.4071 61
0.7554 1.4079 62
0.7470 1.4038 63
0.7366 1.4141 64
0.7279 1.4135 65
0.7250 1.4111 66
0.7128 1.4196 67
0.7042 1.4182 68
0.6946 1.4378 69
0.6851 1.4350 70
0.6764 1.4403 71
0.6695 1.4474 72
0.6606 1.4454 73
0.6565 1.4516 74
0.6450 1.4595 75
0.6347 1.4700 76
0.6287 1.4746 77
0.6183 1.4813 78
0.6143 1.4785 79
0.6053 1.4848 80
0.5994 1.4777 81
0.5903 1.4962 82
0.5828 1.5102 83
0.5760 1.4957 84
0.5696 1.5121 85
0.5637 1.5168 86
0.5578 1.5183 87
0.5499 1.5184 88
0.5396 1.5433 89
0.5345 1.5411 90
0.5268 1.5338 91
0.5220 1.5556 92
0.5184 1.5489 93
0.5122 1.5635 94
0.5014 1.5674 95
0.4921 1.5773 96
0.4925 1.5773 97
0.4821 1.5938 98
0.4769 1.6013 99
0.4723 1.5979 100
0.4692 1.6131 101
0.4603 1.6247 102
0.4553 1.6276 103
0.4476 1.6376 104
0.4401 1.6390 105
0.4384 1.6442 106
0.4305 1.6548 107
0.4263 1.6617 108
0.4232 1.6523 109
0.4185 1.6561 110
0.4129 1.6779 111
0.4036 1.6897 112
0.4005 1.6873 113
0.3948 1.6987 114
0.3892 1.7120 115
0.3859 1.7049 116
0.3795 1.7241 117
0.3802 1.7273 118
0.3731 1.7387 119
0.3672 1.7447 120
0.3629 1.7513 121
0.3607 1.7515 122
0.3543 1.7585 123
0.3504 1.7601 124
0.3477 1.7657 125
0.3453 1.7733 126
0.3448 1.7718 127
0.3390 1.7971 128
0.3352 1.7929 129
0.3273 1.7988 130
0.3250 1.8192 131
0.3222 1.8220 132
0.3173 1.8289 133
0.3171 1.8261 134
0.3124 1.8415 135
0.3040 1.8379 136
0.3040 1.8533 137
0.3030 1.8511 138
0.2970 1.8537 139
0.2938 1.8697 140
0.2929 1.8730 141
0.2892 1.8632 142
0.2816 1.8796 143
0.2812 1.8870 144
0.2761 1.8891 145
0.2731 1.9134 146
0.2698 1.9100 147
0.2671 1.9207 148
0.2639 1.9196 149
0.2621 1.9130 150
0.2589 1.9273 151
0.2558 1.9336 152
0.2545 1.9355 153
0.2487 1.9551 154
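
The table can be used to pick the epoch with the lowest validation loss, which is the usual early-stopping criterion. The sketch below does this on a hand-copied excerpt of the rows above; `best_epoch` and `history_excerpt` are illustrative names, and the excerpt is not the full log.

```python
def best_epoch(history):
    """Return the (train_loss, val_loss, epoch) row with the lowest validation loss."""
    return min(history, key=lambda row: row[1])


# Excerpt of the table above: (train_loss, val_loss, epoch).
history_excerpt = [
    (3.2881, 2.6852, 0),
    (1.0074, 1.3772, 40),
    (0.9429, 1.3673, 45),
    (0.8809, 1.3718, 50),
    (0.4723, 1.5979, 100),
    (0.2487, 1.9551, 154),
]

train_loss, val_loss, epoch = best_epoch(history_excerpt)
print(f"lowest validation loss {val_loss:.4f} at epoch {epoch}")
# prints: lowest validation loss 1.3673 at epoch 45
```

Epoch 45 is also the minimum over the full table. In a Keras training loop, `tf.keras.callbacks.EarlyStopping(monitor="val_loss", restore_best_weights=True)` is one way to stop near that point, although this card does not say whether such a callback was used.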

Framework versions

  • Transformers 4.38.2
  • TensorFlow 2.15.0
  • Datasets 2.18.0
  • Tokenizers 0.15.2
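
To reproduce the environment, it can help to compare installed packages against the versions listed above. The following sketch uses only the standard library; `check_versions` is a hypothetical helper, and the lowercase PyPI distribution names are assumed to match the libraries named in the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.38.2",
    "tensorflow": "2.15.0",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def check_versions(expected=EXPECTED_VERSIONS):
    """Map each package to (expected, installed or None, whether they match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in check_versions().items():
    status = "ok" if ok else "differs"
    print(f"{package}: expected {wanted}, found {installed} ({status})")
```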