metadata

license: apache-2.0
base_model: t5-small
tags:
  - generated_from_keras_callback
model-index:
  - name: pijarcandra22/NMTIndoBaliT5
    results: []

pijarcandra22/NMTIndoBaliT5

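This model is a fine-tuned version of t5-small on an unknown dataset. Validation loss was lowest at epoch 45 (1.3673) and rose afterwards while training loss kept falling, which suggests overfitting in the later epochs. At the final epoch it reports the following training and validation losses: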

Train Loss: 0.2487
Validation Loss: 1.9551
Epoch: 154

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 1e-04, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
training_precision: float32
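
For readers who want to recreate this optimizer, here is a minimal sketch. It assumes the `AdamWeightDecay` class exported by the TensorFlow side of Transformers; the names `OPTIMIZER_CONFIG` and `build_optimizer` are illustrative and do not come from the original training script. The `decay` entry is left out on the assumption that, at 0.0, it has no effect.

```python
# Hyperparameters copied from the optimizer entry above.
# 'name' and 'decay' (0.0) are intentionally omitted; this is an assumption
# of this sketch, not something the original training script confirms.
OPTIMIZER_CONFIG = {
    "learning_rate": 1e-4,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-7,
    "amsgrad": False,
    "weight_decay_rate": 0.01,
}


def build_optimizer(config=OPTIMIZER_CONFIG):
    """Return an AdamWeightDecay optimizer (needs transformers and TensorFlow)."""
    # Imported lazily so the config can be inspected without TensorFlow installed.
    from transformers import AdamWeightDecay

    return AdamWeightDecay(**config)
```

The returned object could then be passed to `model.compile(optimizer=...)` on a Keras model, which is how Keras-based fine-tuning is usually wired up.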

Training results

Train Loss	Validation Loss	Epoch
3.2881	2.6852	0
2.7514	2.4004	1
2.5012	2.2171	2
2.3252	2.0959	3
2.1930	1.9901	4
2.0837	1.9130	5
1.9912	1.8452	6
1.9107	1.7974	7
1.8459	1.7521	8
1.7902	1.7165	9
1.7321	1.6842	10
1.6811	1.6400	11
1.6374	1.6230	12
1.5973	1.5960	13
1.5588	1.5765	14
1.5244	1.5589	15
1.4933	1.5370	16
1.4588	1.5300	17
1.4325	1.5107	18
1.4054	1.4970	19
1.3730	1.4839	20
1.3475	1.4789	21
1.3231	1.4616	22
1.3035	1.4568	23
1.2768	1.4489	24
1.2587	1.4396	25
1.2380	1.4364	26
1.2208	1.4273	27
1.2026	1.4228	28
1.1755	1.4141	29
1.1614	1.4062	30
1.1460	1.4060	31
1.1289	1.3934	32
1.1134	1.4007	33
1.0965	1.3927	34
1.0818	1.3874	35
1.0661	1.3921	36
1.0482	1.3795	37
1.0345	1.3853	38
1.0195	1.3835	39
1.0074	1.3772	40
0.9890	1.3851	41
0.9833	1.3724	42
0.9667	1.3740	43
0.9561	1.3752	44
0.9429	1.3673	45
0.9301	1.3828	46
0.9141	1.3806	47
0.9050	1.3772	48
0.8952	1.3812	49
0.8809	1.3718	50
0.8725	1.3825	51
0.8601	1.3842	52
0.8488	1.3827	53
0.8375	1.3920	54
0.8257	1.3936	55
0.8184	1.3842	56
0.8081	1.3846	57
0.7986	1.3860	58
0.7883	1.3943	59
0.7787	1.4004	60
0.7666	1.4071	61
0.7554	1.4079	62
0.7470	1.4038	63
0.7366	1.4141	64
0.7279	1.4135	65
0.7250	1.4111	66
0.7128	1.4196	67
0.7042	1.4182	68
0.6946	1.4378	69
0.6851	1.4350	70
0.6764	1.4403	71
0.6695	1.4474	72
0.6606	1.4454	73
0.6565	1.4516	74
0.6450	1.4595	75
0.6347	1.4700	76
0.6287	1.4746	77
0.6183	1.4813	78
0.6143	1.4785	79
0.6053	1.4848	80
0.5994	1.4777	81
0.5903	1.4962	82
0.5828	1.5102	83
0.5760	1.4957	84
0.5696	1.5121	85
0.5637	1.5168	86
0.5578	1.5183	87
0.5499	1.5184	88
0.5396	1.5433	89
0.5345	1.5411	90
0.5268	1.5338	91
0.5220	1.5556	92
0.5184	1.5489	93
0.5122	1.5635	94
0.5014	1.5674	95
0.4921	1.5773	96
0.4925	1.5773	97
0.4821	1.5938	98
0.4769	1.6013	99
0.4723	1.5979	100
0.4692	1.6131	101
0.4603	1.6247	102
0.4553	1.6276	103
0.4476	1.6376	104
0.4401	1.6390	105
0.4384	1.6442	106
0.4305	1.6548	107
0.4263	1.6617	108
0.4232	1.6523	109
0.4185	1.6561	110
0.4129	1.6779	111
0.4036	1.6897	112
0.4005	1.6873	113
0.3948	1.6987	114
0.3892	1.7120	115
0.3859	1.7049	116
0.3795	1.7241	117
0.3802	1.7273	118
0.3731	1.7387	119
0.3672	1.7447	120
0.3629	1.7513	121
0.3607	1.7515	122
0.3543	1.7585	123
0.3504	1.7601	124
0.3477	1.7657	125
0.3453	1.7733	126
0.3448	1.7718	127
0.3390	1.7971	128
0.3352	1.7929	129
0.3273	1.7988	130
0.3250	1.8192	131
0.3222	1.8220	132
0.3173	1.8289	133
0.3171	1.8261	134
0.3124	1.8415	135
0.3040	1.8379	136
0.3040	1.8533	137
0.3030	1.8511	138
0.2970	1.8537	139
0.2938	1.8697	140
0.2929	1.8730	141
0.2892	1.8632	142
0.2816	1.8796	143
0.2812	1.8870	144
0.2761	1.8891	145
0.2731	1.9134	146
0.2698	1.9100	147
0.2671	1.9207	148
0.2639	1.9196	149
0.2621	1.9130	150
0.2589	1.9273	151
0.2558	1.9336	152
0.2545	1.9355	153
0.2487	1.9551	154
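
The table shows training loss falling almost monotonically while validation loss bottoms out and then climbs. A minimal sketch of how to locate that turning point follows; it uses a handful of rows copied from the table, and the names `ROWS` and `best_epoch` are illustrative.

```python
# (train_loss, validation_loss, epoch) rows sampled from the table above.
ROWS = [
    (3.2881, 2.6852, 0),
    (1.7321, 1.6842, 10),
    (1.0074, 1.3772, 40),
    (0.9429, 1.3673, 45),
    (0.8809, 1.3718, 50),
    (0.4723, 1.5979, 100),
    (0.2487, 1.9551, 154),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


train_loss, val_loss, epoch = best_epoch(ROWS)
print(f"lowest validation loss {val_loss:.4f} at epoch {epoch}")
# prints: lowest validation loss 1.3673 at epoch 45

# Gap between validation and training loss at the final epoch.
final_train, final_val, _ = ROWS[-1]
print(f"final-epoch gap: {final_val - final_train:.4f}")
```

Run on the full table, the same function also selects epoch 45. A checkpoint from around that epoch, if one was saved, would likely generalize better than the final one.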

Framework versions

Transformers 4.38.2
TensorFlow 2.15.0
Datasets 2.18.0
Tokenizers 0.15.2