
mt5-small-finetuned-19jan-7

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.6123
  • ROUGE-1: 6.8298
  • ROUGE-2: 0.1667
  • ROUGE-L: 6.5947
  • ROUGE-Lsum: 6.6685
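
The ROUGE figures are n-gram / longest-common-subsequence overlap F1 scores, apparently on a 0–100 scale as in the standard summarization fine-tuning scripts. As a minimal, dependency-free illustration of what the ROUGE-1 number measures (whitespace tokenization only; the real metric is computed by the rouge_score package, which also applies stemming):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1, the core of the ROUGE-1 score above."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped counting: each shared unigram counts at most as often
    # as it appears in either text.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Scaled by 100 to match the reporting convention used in this card.
print(round(100 * rouge1_f1("the cat sat", "the cat sat on the mat"), 2))  # → 66.67
```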

Model description

More information needed

Intended uses & limitations

More information needed
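
The card does not state the task, but the ROUGE metrics above are consistent with a summarization-style seq2seq use. A hedged usage sketch, assuming the checkpoint is loaded from the Hub (the repo id below must be prefixed with the uploader's namespace, and the generation settings are illustrative, not the author's):

```python
def generate(text, tokenizer, model, max_new_tokens=64):
    """Tokenize, generate with beam search, and decode one output."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Heavy optional dependency kept out of the module top level.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model_id = "mt5-small-finetuned-19jan-7"  # prefix with the uploader's namespace
    tok = AutoTokenizer.from_pretrained(model_id)
    mdl = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    print(generate("Text to condense goes here.", tok, mdl))
```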

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 12
  • eval_batch_size: 12
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60
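
The hyperparameters above map directly onto `Seq2SeqTrainingArguments` (the listed Adam betas and epsilon are its defaults). A sketch, in which the output directory, the per-epoch evaluation cadence, and `predict_with_generate` are assumptions inferred from the card title and the per-epoch metrics table below:

```python
from transformers import Seq2SeqTrainingArguments

# Config sketch only -- mirrors the hyperparameter list above.
args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-finetuned-19jan-7",  # assumed from the card title
    learning_rate=3e-5,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=60,
    evaluation_strategy="epoch",   # assumed: the table logs metrics once per epoch
    predict_with_generate=True,    # assumed: needed to compute ROUGE during eval
)
```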

Training results

Training Loss  Epoch  Step  Validation Loss  ROUGE-1  ROUGE-2  ROUGE-L  ROUGE-Lsum
16.2953 1.0 50 5.4420 2.3065 0.0 2.3217 2.3089
10.6895 2.0 100 4.4691 3.2975 0.3693 3.2976 3.3376
7.0377 3.0 150 3.2638 4.1896 0.3485 4.1487 4.1878
5.7221 4.0 200 3.0772 6.2012 0.7955 6.1846 6.3083
4.9356 5.0 250 3.0312 5.2032 0.8545 5.1829 5.2263
4.4656 6.0 300 3.0022 5.6901 1.3505 5.6184 5.6791
4.2279 7.0 350 2.9585 5.6907 1.5424 5.644 5.7768
4.0578 8.0 400 2.9098 5.7425 1.0202 5.6452 5.7881
3.9236 9.0 450 2.8686 6.2001 1.1793 6.1891 6.2508
3.8237 10.0 500 2.8222 5.9182 1.1793 5.8436 5.9807
3.7078 11.0 550 2.7890 5.4733 1.3896 5.3702 5.4957
3.641 12.0 600 2.7522 5.8312 1.1793 5.784 5.9037
3.5527 13.0 650 2.7168 6.3129 1.1793 6.2924 6.384
3.5281 14.0 700 2.7000 9.1787 0.8333 9.1491 9.2241
3.4547 15.0 750 2.6966 7.8778 0.3333 7.8306 7.9167
3.4386 16.0 800 2.6892 8.3907 0.3333 8.3167 8.4
3.3749 17.0 850 2.6786 8.6167 0.4167 8.5917 8.5787
3.3681 18.0 900 2.6895 8.2466 0.4167 8.1799 8.2407
3.3173 19.0 950 2.6957 8.1742 0.4167 8.1197 8.1429
3.3034 20.0 1000 2.6721 8.2466 0.4167 8.1799 8.2407
3.2594 21.0 1050 2.6698 8.569 0.4167 8.5419 8.619
3.2138 22.0 1100 2.6676 8.2722 0.4167 8.2343 8.3037
3.2239 23.0 1150 2.6537 8.1444 0.4167 8.1051 8.1301
3.1887 24.0 1200 2.6529 8.1444 0.4167 8.1051 8.1301
3.1641 25.0 1250 2.6685 7.7777 0.1667 7.7204 7.8143
3.162 26.0 1300 2.6619 8.3776 0.3333 8.4135 8.4692
3.1114 27.0 1350 2.6632 8.3776 0.3333 8.4135 8.4692
3.0645 28.0 1400 2.6438 7.8811 0.3333 7.8333 7.9484
3.0984 29.0 1450 2.6384 7.3936 0.1667 7.3609 7.4051
3.0712 30.0 1500 2.6389 6.9609 0.1667 6.875 7.0253
3.0662 31.0 1550 2.6346 7.95 0.1667 7.9051 8.0218
3.0294 32.0 1600 2.6420 7.3936 0.1667 7.3609 7.4051
3.0143 33.0 1650 2.6325 7.6526 0.1667 7.6869 7.7551
3.002 34.0 1700 2.6384 7.9436 0.1667 7.9317 8.016
2.9964 35.0 1750 2.6262 8.2958 0.4167 8.2317 8.3936
2.9893 36.0 1800 2.6351 8.6535 0.1667 8.616 8.7333
2.9862 37.0 1850 2.6320 8.2452 0.1667 8.2 8.3218
2.9588 38.0 1900 2.6214 7.6656 0.1667 7.6819 7.7
2.9697 39.0 1950 2.6229 7.1452 0.1667 7.1051 7.1942
2.9433 40.0 2000 2.6209 7.5775 0.4167 7.4893 7.5833
2.9306 41.0 2050 2.6197 7.525 0.4167 7.4435 7.5351
2.9382 42.0 2100 2.6190 7.525 0.4167 7.4435 7.5351
2.9269 43.0 2150 2.6234 7.3614 0.4167 7.2092 7.3592
2.9152 44.0 2200 2.6237 6.9976 0.1667 6.8777 7.0333
2.9137 45.0 2250 2.6213 6.9976 0.1667 6.8777 7.0333
2.9011 46.0 2300 2.6212 6.9976 0.1667 6.8777 7.0333
2.8941 47.0 2350 2.6188 6.7768 0.1667 6.6509 6.812
2.9143 48.0 2400 2.6126 7.0875 0.1667 6.803 6.9337
2.8798 49.0 2450 2.6207 6.4458 0.1667 6.3221 6.4527
2.8701 50.0 2500 2.6172 6.7542 0.1667 6.4857 6.5729
2.8823 51.0 2550 2.6161 6.9971 0.1667 6.6819 6.7968
2.8724 52.0 2600 2.6171 6.8298 0.1667 6.5947 6.6685
2.8635 53.0 2650 2.6176 6.8298 0.1667 6.5947 6.6685
2.8803 54.0 2700 2.6134 6.1417 0.1667 5.929 6.0423
2.8608 55.0 2750 2.6118 6.4953 0.1667 6.2113 6.3554
2.8655 56.0 2800 2.6125 6.4976 0.1667 6.2625 6.3539
2.856 57.0 2850 2.6136 6.8298 0.1667 6.5947 6.6685
2.8837 58.0 2900 2.6124 6.8298 0.1667 6.5947 6.6685
2.8871 59.0 2950 2.6123 6.8298 0.1667 6.5947 6.6685
2.8537 60.0 3000 2.6123 6.8298 0.1667 6.5947 6.6685

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2
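
To reproduce this environment, the listed versions can be pinned; the `+cu116` PyTorch build reflects the CUDA 11.6 training machine and is an assumption about your hardware:

```shell
# Versions from the "Framework versions" list above.
pip install "transformers==4.25.1" "datasets==2.8.0" "tokenizers==0.13.2"
# The +cu116 build is served from PyTorch's CUDA 11.6 wheel index;
# plain "torch==1.13.1" works for CPU-only inference.
pip install "torch==1.13.1+cu116" --extra-index-url https://download.pytorch.org/whl/cu116
```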