---
license: apache-2.0
base_model: t5-small
tags:
  - generated_from_keras_callback
model-index:
  - name: pijarcandra22/NMTIndoBaliT5
    results: []
---

# pijarcandra22/NMTIndoBaliT5

This model is a fine-tuned version of t5-small on an unknown dataset. After 214 epochs of training it achieves the following results:

- Train Loss: 0.1414
- Validation Loss: 2.2294
- Epoch: 214
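
The card does not document how inputs were preprocessed, so the snippet below is only a minimal inference sketch: it assumes the checkpoint loads with the TensorFlow auto classes and that raw Indonesian text is passed without a task prefix (both assumptions, not documented behavior).

```python
# Minimal inference sketch: translate Indonesian text to Balinese
# with the fine-tuned checkpoint. The bare input (no "translate ..."
# prefix) is an assumption; check your own preprocessing.
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

repo = "pijarcandra22/NMTIndoBaliT5"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = TFAutoModelForSeq2SeqLM.from_pretrained(repo)

inputs = tokenizer("Saya pergi ke pasar.", return_tensors="tf")
output_ids = model.generate(**inputs, max_new_tokens=64)
translation = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(translation)
```

Given the validation-loss curve below, decoding quality at epoch 214 may reflect overfitting; evaluating an earlier checkpoint is worth considering.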

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: `{'name': 'AdamWeightDecay', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}`
- training_precision: `float32`
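
For reproduction, the dictionary above maps directly onto the Keras `AdamWeightDecay` optimizer shipped with `transformers`; a sketch (assuming a constant, unscheduled learning rate, as the config suggests):

```python
# Rebuild the training optimizer from the hyperparameter dict above.
# AdamWeightDecay is transformers' Keras AdamW implementation.
from transformers import AdamWeightDecay

optimizer = AdamWeightDecay(
    learning_rate=1e-4,       # 'learning_rate'
    beta_1=0.9,               # 'beta_1'
    beta_2=0.999,             # 'beta_2'
    epsilon=1e-07,            # 'epsilon'
    amsgrad=False,            # 'amsgrad'
    weight_decay_rate=0.01,   # 'weight_decay_rate'
)
print(optimizer.get_config()["weight_decay_rate"])  # 0.01
```

The optimizer can then be passed to `model.compile(optimizer=optimizer)` as in a standard Keras training loop.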

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 3.2881     | 2.6852          | 0     |
| 2.7514     | 2.4004          | 1     |
| 2.5012     | 2.2171          | 2     |
| 2.3252     | 2.0959          | 3     |
| 2.1930     | 1.9901          | 4     |
| 2.0837     | 1.9130          | 5     |
| 1.9912     | 1.8452          | 6     |
| 1.9107     | 1.7974          | 7     |
| 1.8459     | 1.7521          | 8     |
| 1.7902     | 1.7165          | 9     |
| 1.7321     | 1.6842          | 10    |
| 1.6811     | 1.6400          | 11    |
| 1.6374     | 1.6230          | 12    |
| 1.5973     | 1.5960          | 13    |
| 1.5588     | 1.5765          | 14    |
| 1.5244     | 1.5589          | 15    |
| 1.4933     | 1.5370          | 16    |
| 1.4588     | 1.5300          | 17    |
| 1.4325     | 1.5107          | 18    |
| 1.4054     | 1.4970          | 19    |
| 1.3730     | 1.4839          | 20    |
| 1.3475     | 1.4789          | 21    |
| 1.3231     | 1.4616          | 22    |
| 1.3035     | 1.4568          | 23    |
| 1.2768     | 1.4489          | 24    |
| 1.2587     | 1.4396          | 25    |
| 1.2380     | 1.4364          | 26    |
| 1.2208     | 1.4273          | 27    |
| 1.2026     | 1.4228          | 28    |
| 1.1755     | 1.4141          | 29    |
| 1.1614     | 1.4062          | 30    |
| 1.1460     | 1.4060          | 31    |
| 1.1289     | 1.3934          | 32    |
| 1.1134     | 1.4007          | 33    |
| 1.0965     | 1.3927          | 34    |
| 1.0818     | 1.3874          | 35    |
| 1.0661     | 1.3921          | 36    |
| 1.0482     | 1.3795          | 37    |
| 1.0345     | 1.3853          | 38    |
| 1.0195     | 1.3835          | 39    |
| 1.0074     | 1.3772          | 40    |
| 0.9890     | 1.3851          | 41    |
| 0.9833     | 1.3724          | 42    |
| 0.9667     | 1.3740          | 43    |
| 0.9561     | 1.3752          | 44    |
| 0.9429     | 1.3673          | 45    |
| 0.9301     | 1.3828          | 46    |
| 0.9141     | 1.3806          | 47    |
| 0.9050     | 1.3772          | 48    |
| 0.8952     | 1.3812          | 49    |
| 0.8809     | 1.3718          | 50    |
| 0.8725     | 1.3825          | 51    |
| 0.8601     | 1.3842          | 52    |
| 0.8488     | 1.3827          | 53    |
| 0.8375     | 1.3920          | 54    |
| 0.8257     | 1.3936          | 55    |
| 0.8184     | 1.3842          | 56    |
| 0.8081     | 1.3846          | 57    |
| 0.7986     | 1.3860          | 58    |
| 0.7883     | 1.3943          | 59    |
| 0.7787     | 1.4004          | 60    |
| 0.7666     | 1.4071          | 61    |
| 0.7554     | 1.4079          | 62    |
| 0.7470     | 1.4038          | 63    |
| 0.7366     | 1.4141          | 64    |
| 0.7279     | 1.4135          | 65    |
| 0.7250     | 1.4111          | 66    |
| 0.7128     | 1.4196          | 67    |
| 0.7042     | 1.4182          | 68    |
| 0.6946     | 1.4378          | 69    |
| 0.6851     | 1.4350          | 70    |
| 0.6764     | 1.4403          | 71    |
| 0.6695     | 1.4474          | 72    |
| 0.6606     | 1.4454          | 73    |
| 0.6565     | 1.4516          | 74    |
| 0.6450     | 1.4595          | 75    |
| 0.6347     | 1.4700          | 76    |
| 0.6287     | 1.4746          | 77    |
| 0.6183     | 1.4813          | 78    |
| 0.6143     | 1.4785          | 79    |
| 0.6053     | 1.4848          | 80    |
| 0.5994     | 1.4777          | 81    |
| 0.5903     | 1.4962          | 82    |
| 0.5828     | 1.5102          | 83    |
| 0.5760     | 1.4957          | 84    |
| 0.5696     | 1.5121          | 85    |
| 0.5637     | 1.5168          | 86    |
| 0.5578     | 1.5183          | 87    |
| 0.5499     | 1.5184          | 88    |
| 0.5396     | 1.5433          | 89    |
| 0.5345     | 1.5411          | 90    |
| 0.5268     | 1.5338          | 91    |
| 0.5220     | 1.5556          | 92    |
| 0.5184     | 1.5489          | 93    |
| 0.5122     | 1.5635          | 94    |
| 0.5014     | 1.5674          | 95    |
| 0.4921     | 1.5773          | 96    |
| 0.4925     | 1.5773          | 97    |
| 0.4821     | 1.5938          | 98    |
| 0.4769     | 1.6013          | 99    |
| 0.4723     | 1.5979          | 100   |
| 0.4692     | 1.6131          | 101   |
| 0.4603     | 1.6247          | 102   |
| 0.4553     | 1.6276          | 103   |
| 0.4476     | 1.6376          | 104   |
| 0.4401     | 1.6390          | 105   |
| 0.4384     | 1.6442          | 106   |
| 0.4305     | 1.6548          | 107   |
| 0.4263     | 1.6617          | 108   |
| 0.4232     | 1.6523          | 109   |
| 0.4185     | 1.6561          | 110   |
| 0.4129     | 1.6779          | 111   |
| 0.4036     | 1.6897          | 112   |
| 0.4005     | 1.6873          | 113   |
| 0.3948     | 1.6987          | 114   |
| 0.3892     | 1.7120          | 115   |
| 0.3859     | 1.7049          | 116   |
| 0.3795     | 1.7241          | 117   |
| 0.3802     | 1.7273          | 118   |
| 0.3731     | 1.7387          | 119   |
| 0.3672     | 1.7447          | 120   |
| 0.3629     | 1.7513          | 121   |
| 0.3607     | 1.7515          | 122   |
| 0.3543     | 1.7585          | 123   |
| 0.3504     | 1.7601          | 124   |
| 0.3477     | 1.7657          | 125   |
| 0.3453     | 1.7733          | 126   |
| 0.3448     | 1.7718          | 127   |
| 0.3390     | 1.7971          | 128   |
| 0.3352     | 1.7929          | 129   |
| 0.3273     | 1.7988          | 130   |
| 0.3250     | 1.8192          | 131   |
| 0.3222     | 1.8220          | 132   |
| 0.3173     | 1.8289          | 133   |
| 0.3171     | 1.8261          | 134   |
| 0.3124     | 1.8415          | 135   |
| 0.3040     | 1.8379          | 136   |
| 0.3040     | 1.8533          | 137   |
| 0.3030     | 1.8511          | 138   |
| 0.2970     | 1.8537          | 139   |
| 0.2938     | 1.8697          | 140   |
| 0.2929     | 1.8730          | 141   |
| 0.2892     | 1.8632          | 142   |
| 0.2816     | 1.8796          | 143   |
| 0.2812     | 1.8870          | 144   |
| 0.2761     | 1.8891          | 145   |
| 0.2731     | 1.9134          | 146   |
| 0.2698     | 1.9100          | 147   |
| 0.2671     | 1.9207          | 148   |
| 0.2639     | 1.9196          | 149   |
| 0.2621     | 1.9130          | 150   |
| 0.2589     | 1.9273          | 151   |
| 0.2558     | 1.9336          | 152   |
| 0.2545     | 1.9355          | 153   |
| 0.2487     | 1.9551          | 154   |
| 0.2493     | 1.9573          | 155   |
| 0.2449     | 1.9552          | 156   |
| 0.2421     | 1.9591          | 157   |
| 0.2405     | 1.9556          | 158   |
| 0.2367     | 1.9807          | 159   |
| 0.2342     | 1.9859          | 160   |
| 0.2316     | 1.9803          | 161   |
| 0.2281     | 1.9853          | 162   |
| 0.2269     | 1.9970          | 163   |
| 0.2250     | 2.0120          | 164   |
| 0.2236     | 2.0107          | 165   |
| 0.2194     | 2.0208          | 166   |
| 0.2183     | 2.0198          | 167   |
| 0.2168     | 2.0265          | 168   |
| 0.2172     | 2.0278          | 169   |
| 0.2117     | 2.0380          | 170   |
| 0.2078     | 2.0448          | 171   |
| 0.2091     | 2.0415          | 172   |
| 0.2065     | 2.0459          | 173   |
| 0.2027     | 2.0597          | 174   |
| 0.1995     | 2.0659          | 175   |
| 0.1980     | 2.0811          | 176   |
| 0.1971     | 2.0704          | 177   |
| 0.1932     | 2.0785          | 178   |
| 0.1892     | 2.0783          | 179   |
| 0.1924     | 2.0742          | 180   |
| 0.1872     | 2.0979          | 181   |
| 0.1858     | 2.0958          | 182   |
| 0.1853     | 2.1005          | 183   |
| 0.1834     | 2.1166          | 184   |
| 0.1810     | 2.1027          | 185   |
| 0.1789     | 2.1151          | 186   |
| 0.1768     | 2.1302          | 187   |
| 0.1768     | 2.1200          | 188   |
| 0.1766     | 2.1399          | 189   |
| 0.1732     | 2.1196          | 190   |
| 0.1719     | 2.1362          | 191   |
| 0.1697     | 2.1447          | 192   |
| 0.1684     | 2.1464          | 193   |
| 0.1699     | 2.1442          | 194   |
| 0.1657     | 2.1492          | 195   |
| 0.1607     | 2.1644          | 196   |
| 0.1603     | 2.1667          | 197   |
| 0.1580     | 2.1715          | 198   |
| 0.1588     | 2.1818          | 199   |
| 0.1551     | 2.1825          | 200   |
| 0.1572     | 2.1779          | 201   |
| 0.1552     | 2.1842          | 202   |
| 0.1528     | 2.2038          | 203   |
| 0.1530     | 2.1941          | 204   |
| 0.1501     | 2.1903          | 205   |
| 0.1492     | 2.2089          | 206   |
| 0.1498     | 2.1871          | 207   |
| 0.1481     | 2.1888          | 208   |
| 0.1486     | 2.2130          | 209   |
| 0.1434     | 2.2259          | 210   |
| 0.1432     | 2.2159          | 211   |
| 0.1436     | 2.2151          | 212   |
| 0.1411     | 2.2221          | 213   |
| 0.1414     | 2.2294          | 214   |

### Framework versions

- Transformers 4.38.2
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2