
mt5-small-finetuned-mt5

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the metrics):

  • Loss: 0.6691
  • Rouge1: 0.5388
  • Rouge2: 0.3838
  • RougeL: 0.5283
  • RougeLsum: 0.5270
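
The card does not state the downstream task, but the ROUGE metrics suggest a text-to-text task such as summarization. Below is a minimal inference sketch; the Hub repository id and the input text are placeholders, not documented by this card:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder Hub id -- substitute the actual path of this checkpoint.
model_id = "your-username/mt5-small-finetuned-mt5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; the training data and task are not documented.
text = "Text to transform goes here."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Beam search with a modest output budget; tune both for your task.
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```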

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a matching configuration sketch follows the list):

  • learning_rate: 5.6e-05
  • train_batch_size: 20
  • eval_batch_size: 20
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
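
For orientation, the listed settings map onto `Seq2SeqTrainingArguments` roughly as follows. This is a reconstruction from the list above, not the original training script; `output_dir`, `evaluation_strategy`, and `predict_with_generate` are assumptions (the per-epoch metrics table below suggests epoch-level evaluation with generation-based ROUGE):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-mt5",  # assumption: placeholder directory
    learning_rate=5.6e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=20,
    seed=42,
    # The card lists Adam with betas=(0.9, 0.999) and epsilon=1e-08; these
    # match the Trainer's default AdamW hyperparameters.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumption: table reports metrics per epoch
    predict_with_generate=True,   # assumption: required for ROUGE during eval
)
```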

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
| 12.893 | 1.0 | 8 | 7.2101 | 0.0967 | 0.0309 | 0.0928 | 0.0928 |
| 12.4326 | 2.0 | 16 | 6.0616 | 0.1183 | 0.0458 | 0.1140 | 0.1141 |
| 12.0044 | 3.0 | 24 | 5.5399 | 0.1239 | 0.0469 | 0.1212 | 0.1200 |
| 11.4794 | 4.0 | 32 | 5.2619 | 0.1504 | 0.0541 | 0.1450 | 0.1470 |
| 10.85 | 5.0 | 40 | 4.8356 | 0.1675 | 0.0574 | 0.1605 | 0.1626 |
| 10.2044 | 6.0 | 48 | 4.2656 | 0.1933 | 0.0746 | 0.1862 | 0.1905 |
| 9.2904 | 7.0 | 56 | 3.7518 | 0.1983 | 0.0787 | 0.1891 | 0.1921 |
| 8.7029 | 8.0 | 64 | 3.4376 | 0.1873 | 0.0698 | 0.1797 | 0.1818 |
| 8.3889 | 9.0 | 72 | 3.2085 | 0.1811 | 0.0672 | 0.1738 | 0.1771 |
| 7.5091 | 10.0 | 80 | 3.0059 | 0.1581 | 0.0581 | 0.1557 | 0.1564 |
| 7.2132 | 11.0 | 88 | 2.8329 | 0.1654 | 0.0466 | 0.1623 | 0.1616 |
| 6.796 | 12.0 | 96 | 2.6879 | 0.1735 | 0.0486 | 0.1620 | 0.1617 |
| 6.4213 | 13.0 | 104 | 2.5694 | 0.1799 | 0.0482 | 0.1722 | 0.1726 |
| 5.7867 | 14.0 | 112 | 2.4405 | 0.1776 | 0.0497 | 0.1720 | 0.1715 |
| 5.2668 | 15.0 | 120 | 2.3098 | 0.1860 | 0.0521 | 0.1759 | 0.1766 |
| 5.0803 | 16.0 | 128 | 2.1944 | 0.2010 | 0.0677 | 0.1931 | 0.1939 |
| 4.6867 | 17.0 | 136 | 2.1139 | 0.2179 | 0.0811 | 0.2114 | 0.2117 |
| 4.5557 | 18.0 | 144 | 2.0466 | 0.2186 | 0.0805 | 0.2099 | 0.2103 |
| 4.4414 | 19.0 | 152 | 1.9919 | 0.2260 | 0.0916 | 0.2177 | 0.2172 |
| 4.0867 | 20.0 | 160 | 1.9404 | 0.2317 | 0.0976 | 0.2228 | 0.2221 |
| 3.6814 | 21.0 | 168 | 1.9014 | 0.2287 | 0.0921 | 0.2170 | 0.2157 |
| 3.5426 | 22.0 | 176 | 1.8656 | 0.2208 | 0.0862 | 0.2139 | 0.2131 |
| 3.266 | 23.0 | 184 | 1.8224 | 0.2348 | 0.0935 | 0.2232 | 0.2224 |
| 3.32 | 24.0 | 192 | 1.7907 | 0.2443 | 0.1072 | 0.2355 | 0.2348 |
| 3.1872 | 25.0 | 200 | 1.7459 | 0.2563 | 0.1121 | 0.2421 | 0.2414 |
| 2.9643 | 26.0 | 208 | 1.7043 | 0.2703 | 0.1213 | 0.2598 | 0.2591 |
| 2.8918 | 27.0 | 216 | 1.6654 | 0.2755 | 0.1190 | 0.2633 | 0.2634 |
| 2.7626 | 28.0 | 224 | 1.6199 | 0.3008 | 0.1385 | 0.2870 | 0.2861 |
| 2.8192 | 29.0 | 232 | 1.5712 | 0.3061 | 0.1410 | 0.2948 | 0.2942 |
| 2.5082 | 30.0 | 240 | 1.5405 | 0.3161 | 0.1533 | 0.3073 | 0.3069 |
| 2.564 | 31.0 | 248 | 1.5111 | 0.3296 | 0.1662 | 0.3198 | 0.3196 |
| 2.5577 | 32.0 | 256 | 1.4738 | 0.3344 | 0.1745 | 0.3250 | 0.3247 |
| 2.5199 | 33.0 | 264 | 1.4378 | 0.3468 | 0.1829 | 0.3336 | 0.3328 |
| 2.4798 | 34.0 | 272 | 1.4033 | 0.3593 | 0.1969 | 0.3448 | 0.3450 |
| 2.3208 | 35.0 | 280 | 1.3733 | 0.3728 | 0.2146 | 0.3613 | 0.3609 |
| 2.3704 | 36.0 | 288 | 1.3403 | 0.3721 | 0.2175 | 0.3644 | 0.3649 |
| 2.3199 | 37.0 | 296 | 1.3092 | 0.3718 | 0.2147 | 0.3638 | 0.3631 |
| 2.3046 | 38.0 | 304 | 1.2838 | 0.3674 | 0.2141 | 0.3608 | 0.3610 |
| 2.3183 | 39.0 | 312 | 1.2599 | 0.3728 | 0.2202 | 0.3664 | 0.3669 |
| 2.178 | 40.0 | 320 | 1.2272 | 0.3826 | 0.2274 | 0.3758 | 0.3749 |
| 2.1264 | 41.0 | 328 | 1.1940 | 0.3923 | 0.2348 | 0.3841 | 0.3835 |
| 2.0563 | 42.0 | 336 | 1.1629 | 0.3972 | 0.2391 | 0.3864 | 0.3865 |
| 2.0213 | 43.0 | 344 | 1.1324 | 0.4082 | 0.2509 | 0.3981 | 0.3980 |
| 1.9956 | 44.0 | 352 | 1.1085 | 0.4158 | 0.2569 | 0.4051 | 0.4054 |
| 2.0723 | 45.0 | 360 | 1.0895 | 0.4186 | 0.2594 | 0.4060 | 0.4061 |
| 1.9021 | 46.0 | 368 | 1.0713 | 0.4316 | 0.2775 | 0.4193 | 0.4194 |
| 1.9776 | 47.0 | 376 | 1.0510 | 0.4362 | 0.2785 | 0.4232 | 0.4237 |
| 1.8752 | 48.0 | 384 | 1.0289 | 0.4371 | 0.2778 | 0.4225 | 0.4230 |
| 1.8729 | 49.0 | 392 | 1.0070 | 0.4386 | 0.2766 | 0.4243 | 0.4245 |
| 1.9136 | 50.0 | 400 | 0.9900 | 0.4368 | 0.2773 | 0.4240 | 0.4232 |
| 1.86 | 51.0 | 408 | 0.9765 | 0.4413 | 0.2818 | 0.4291 | 0.4283 |
| 1.8629 | 52.0 | 416 | 0.9670 | 0.4494 | 0.2909 | 0.4386 | 0.4376 |
| 1.8345 | 53.0 | 424 | 0.9554 | 0.4515 | 0.2942 | 0.4402 | 0.4393 |
| 1.7786 | 54.0 | 432 | 0.9430 | 0.4559 | 0.2980 | 0.4439 | 0.4430 |
| 1.7535 | 55.0 | 440 | 0.9284 | 0.4585 | 0.3016 | 0.4480 | 0.4461 |
| 1.788 | 56.0 | 448 | 0.9126 | 0.4680 | 0.3096 | 0.4578 | 0.4568 |
| 1.6512 | 57.0 | 456 | 0.9015 | 0.4803 | 0.3201 | 0.4699 | 0.4691 |
| 1.7463 | 58.0 | 464 | 0.8937 | 0.4813 | 0.3194 | 0.4697 | 0.4693 |
| 1.7705 | 59.0 | 472 | 0.8835 | 0.4805 | 0.3192 | 0.4680 | 0.4673 |
| 1.6796 | 60.0 | 480 | 0.8709 | 0.4797 | 0.3168 | 0.4673 | 0.4667 |
| 1.652 | 61.0 | 488 | 0.8588 | 0.4811 | 0.3182 | 0.4686 | 0.4684 |
| 1.6272 | 62.0 | 496 | 0.8470 | 0.4812 | 0.3196 | 0.4696 | 0.4690 |
| 1.6013 | 63.0 | 504 | 0.8357 | 0.4910 | 0.3298 | 0.4779 | 0.4781 |
| 1.5951 | 64.0 | 512 | 0.8268 | 0.4948 | 0.3344 | 0.4818 | 0.4822 |
| 1.5817 | 65.0 | 520 | 0.8164 | 0.4896 | 0.3313 | 0.4787 | 0.4777 |
| 1.6403 | 66.0 | 528 | 0.8064 | 0.4983 | 0.3419 | 0.4867 | 0.4862 |
| 1.6281 | 67.0 | 536 | 0.7955 | 0.4992 | 0.3426 | 0.4866 | 0.4866 |
| 1.6482 | 68.0 | 544 | 0.7881 | 0.4990 | 0.3404 | 0.4860 | 0.4860 |
| 1.6103 | 69.0 | 552 | 0.7822 | 0.4997 | 0.3401 | 0.4882 | 0.4872 |
| 1.5396 | 70.0 | 560 | 0.7769 | 0.5023 | 0.3411 | 0.4896 | 0.4890 |
| 1.5271 | 71.0 | 568 | 0.7696 | 0.5040 | 0.3396 | 0.4908 | 0.4899 |
| 1.4252 | 72.0 | 576 | 0.7614 | 0.5128 | 0.3521 | 0.4999 | 0.4994 |
| 1.553 | 73.0 | 584 | 0.7541 | 0.5145 | 0.3525 | 0.5017 | 0.5012 |
| 1.5503 | 74.0 | 592 | 0.7475 | 0.5193 | 0.3561 | 0.5052 | 0.5047 |
| 1.4653 | 75.0 | 600 | 0.7415 | 0.5151 | 0.3540 | 0.5020 | 0.5018 |
| 1.5387 | 76.0 | 608 | 0.7355 | 0.5267 | 0.3632 | 0.5126 | 0.5121 |
| 1.5706 | 77.0 | 616 | 0.7292 | 0.5232 | 0.3628 | 0.5101 | 0.5096 |
| 1.4442 | 78.0 | 624 | 0.7229 | 0.5208 | 0.3626 | 0.5086 | 0.5082 |
| 1.4816 | 79.0 | 632 | 0.7173 | 0.5193 | 0.3606 | 0.5070 | 0.5060 |
| 1.5228 | 80.0 | 640 | 0.7119 | 0.5180 | 0.3596 | 0.5057 | 0.5053 |
| 1.4623 | 81.0 | 648 | 0.7077 | 0.5228 | 0.3645 | 0.5104 | 0.5092 |
| 1.4077 | 82.0 | 656 | 0.7025 | 0.5266 | 0.3699 | 0.5164 | 0.5156 |
| 1.4069 | 83.0 | 664 | 0.6977 | 0.5318 | 0.3749 | 0.5212 | 0.5203 |
| 1.4191 | 84.0 | 672 | 0.6934 | 0.5307 | 0.3732 | 0.5200 | 0.5192 |
| 1.4564 | 85.0 | 680 | 0.6898 | 0.5317 | 0.3764 | 0.5213 | 0.5202 |
| 1.4195 | 86.0 | 688 | 0.6872 | 0.5311 | 0.3751 | 0.5203 | 0.5186 |
| 1.422 | 87.0 | 696 | 0.6843 | 0.5319 | 0.3762 | 0.5212 | 0.5196 |
| 1.4821 | 88.0 | 704 | 0.6822 | 0.5355 | 0.3812 | 0.5254 | 0.5242 |
| 1.539 | 89.0 | 712 | 0.6809 | 0.5349 | 0.3792 | 0.5246 | 0.5234 |
| 1.4914 | 90.0 | 720 | 0.6793 | 0.5341 | 0.3785 | 0.5233 | 0.5221 |
| 1.4247 | 91.0 | 728 | 0.6774 | 0.5349 | 0.3795 | 0.5242 | 0.5229 |
| 1.4937 | 92.0 | 736 | 0.6757 | 0.5350 | 0.3788 | 0.5238 | 0.5226 |
| 1.3732 | 93.0 | 744 | 0.6741 | 0.5362 | 0.3809 | 0.5256 | 0.5243 |
| 1.3991 | 94.0 | 752 | 0.6729 | 0.5362 | 0.3816 | 0.5261 | 0.5249 |
| 1.481 | 95.0 | 760 | 0.6716 | 0.5384 | 0.3836 | 0.5280 | 0.5266 |
| 1.3902 | 96.0 | 768 | 0.6707 | 0.5384 | 0.3836 | 0.5280 | 0.5266 |
| 1.5239 | 97.0 | 776 | 0.6700 | 0.5388 | 0.3838 | 0.5283 | 0.5270 |
| 1.4486 | 98.0 | 784 | 0.6695 | 0.5388 | 0.3844 | 0.5290 | 0.5277 |
| 1.3551 | 99.0 | 792 | 0.6692 | 0.5388 | 0.3838 | 0.5283 | 0.5270 |
| 1.4213 | 100.0 | 800 | 0.6691 | 0.5388 | 0.3838 | 0.5283 | 0.5270 |
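
The ROUGE columns above are on a 0 to 1 scale. The card does not document how they were computed, but a common setup is the `evaluate` library's rouge metric applied to decoded predictions; a minimal sketch under that assumption:

```python
import evaluate

rouge = evaluate.load("rouge")

# Toy prediction/reference pair; during training these would be the decoded
# model outputs and the reference targets for the evaluation set.
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["the cat lay on the mat"],
)
# The result contains rouge1, rouge2, rougeL, and rougeLsum as floats in
# [0, 1], the same scale as the columns in the table above.
print(scores)
```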

Framework versions

  • Transformers 4.37.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.17.1
  • Tokenizers 0.15.2