
mt5-translated-lithuanian-simplifier

This model is a fine-tuned version of google/mt5-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0761
  • Rouge1: 0.7877
  • Rouge2: 0.6566
  • Rougel: 0.7845
  • Gen Len: 49.2293

Model description

More information needed

Intended uses & limitations

More information needed
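
The card does not document usage; judging by the model name, it simplifies Lithuanian text. Below is a minimal inference sketch using the transformers Seq2Seq API. The Hub repository id is a placeholder, the example sentence and generation settings are assumptions, and any task prefix the model may expect is unknown:

```python
# Minimal inference sketch. The repository id below is a placeholder —
# substitute the actual Hub id of this model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-username/mt5-translated-lithuanian-simplifier"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Lietuvos Respublikos Konstitucija yra aukščiausią teisinę galią turintis aktas."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# Generation settings are assumed, not taken from this card.
outputs = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```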

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
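
For orientation, the list above maps onto Transformers' Seq2SeqTrainingArguments roughly as sketched below; output_dir and the 200-step evaluation cadence (inferred from the results table that follows) are assumptions, not facts from the card:

```python
# Sketch of training arguments matching the listed hyperparameters
# (Transformers 4.36 argument names). The Adam betas/epsilon map onto
# adam_beta1/adam_beta2/adam_epsilon.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-translated-lithuanian-simplifier",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=8,
    evaluation_strategy="steps",  # assumed from the 200-step eval log
    eval_steps=200,               # assumed from the 200-step eval log
    predict_with_generate=True,   # assumed; needed for ROUGE and Gen Len
)
```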

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Gen Len |
|---------------|-------|------|-----------------|--------|--------|--------|---------|
| 23.9322 | 0.1 | 200 | 19.1649 | 0.016 | 0.0004 | 0.0146 | 512.0 |
| 2.5416 | 0.19 | 400 | 1.4406 | 0.035 | 0.0002 | 0.0345 | 51.3394 |
| 0.7449 | 0.29 | 600 | 0.7221 | 0.0021 | 0.0 | 0.0021 | 50.2293 |
| 0.4405 | 0.38 | 800 | 0.2164 | 0.5491 | 0.3593 | 0.5367 | 49.4955 |
| 0.177 | 0.48 | 1000 | 0.1672 | 0.6294 | 0.4636 | 0.6209 | 49.2293 |
| 0.1838 | 0.57 | 1200 | 0.1561 | 0.6214 | 0.4375 | 0.613 | 49.2293 |
| 0.1471 | 0.67 | 1400 | 0.1295 | 0.7071 | 0.5673 | 0.6998 | 49.2293 |
| 0.1622 | 0.77 | 1600 | 0.1229 | 0.6929 | 0.5402 | 0.6858 | 49.2293 |
| 0.1255 | 0.86 | 1800 | 0.1192 | 0.7044 | 0.5547 | 0.6978 | 49.2293 |
| 0.1281 | 0.96 | 2000 | 0.1150 | 0.7169 | 0.5718 | 0.7103 | 49.2293 |
| 0.1561 | 1.05 | 2200 | 0.1088 | 0.7165 | 0.5688 | 0.7108 | 49.2293 |
| 0.145 | 1.15 | 2400 | 0.1064 | 0.7321 | 0.5921 | 0.7263 | 49.2293 |
| 0.1207 | 1.25 | 2600 | 0.1030 | 0.7348 | 0.5957 | 0.7291 | 49.2293 |
| 0.1151 | 1.34 | 2800 | 0.1014 | 0.7289 | 0.5859 | 0.7239 | 49.2293 |
| 0.1001 | 1.44 | 3000 | 0.0983 | 0.7402 | 0.6003 | 0.7349 | 49.2293 |
| 0.1354 | 1.53 | 3200 | 0.0963 | 0.738 | 0.598 | 0.7332 | 49.2293 |
| 0.1092 | 1.63 | 3400 | 0.0978 | 0.7446 | 0.607 | 0.7394 | 49.2293 |
| 0.1109 | 1.72 | 3600 | 0.0973 | 0.7427 | 0.6034 | 0.7377 | 49.2293 |
| 0.1083 | 1.82 | 3800 | 0.0950 | 0.7479 | 0.6094 | 0.7432 | 49.2293 |
| 0.1348 | 1.92 | 4000 | 0.0958 | 0.7498 | 0.6121 | 0.745 | 49.2293 |
| 0.1004 | 2.01 | 4200 | 0.0898 | 0.7539 | 0.6152 | 0.7494 | 49.2293 |
| 0.1131 | 2.11 | 4400 | 0.0925 | 0.753 | 0.6154 | 0.7488 | 49.2293 |
| 0.1312 | 2.2 | 4600 | 0.0919 | 0.755 | 0.6183 | 0.7508 | 49.2293 |
| 0.1139 | 2.3 | 4800 | 0.0908 | 0.756 | 0.6182 | 0.7518 | 49.2293 |
| 0.1168 | 2.39 | 5000 | 0.0880 | 0.7574 | 0.6202 | 0.7533 | 49.2293 |
| 0.0793 | 2.49 | 5200 | 0.0897 | 0.7575 | 0.6193 | 0.7531 | 49.2293 |
| 0.0869 | 2.59 | 5400 | 0.0866 | 0.7605 | 0.6228 | 0.7564 | 49.2293 |
| 0.1053 | 2.68 | 5600 | 0.0870 | 0.7594 | 0.6203 | 0.7551 | 49.2293 |
| 0.0889 | 2.78 | 5800 | 0.0893 | 0.7609 | 0.6237 | 0.7568 | 49.2293 |
| 0.0982 | 2.87 | 6000 | 0.0873 | 0.7637 | 0.6279 | 0.7599 | 49.2293 |
| 0.0838 | 2.97 | 6200 | 0.0846 | 0.7665 | 0.6309 | 0.7626 | 49.2293 |
| 0.0829 | 3.07 | 6400 | 0.0844 | 0.7665 | 0.6315 | 0.7629 | 49.2293 |
| 0.068 | 3.16 | 6600 | 0.0836 | 0.7695 | 0.6358 | 0.7658 | 49.2293 |
| 0.0747 | 3.26 | 6800 | 0.0848 | 0.7675 | 0.6322 | 0.7639 | 49.2293 |
| 0.0792 | 3.35 | 7000 | 0.0840 | 0.7691 | 0.6342 | 0.7656 | 49.2293 |
| 0.0739 | 3.45 | 7200 | 0.0820 | 0.7713 | 0.6365 | 0.7676 | 49.2293 |
| 0.0793 | 3.54 | 7400 | 0.0813 | 0.7723 | 0.6374 | 0.7685 | 49.2293 |
| 0.0908 | 3.64 | 7600 | 0.0819 | 0.7731 | 0.6388 | 0.7696 | 49.2293 |
| 0.1125 | 3.74 | 7800 | 0.0811 | 0.774 | 0.6402 | 0.7705 | 49.2293 |
| 0.1231 | 3.83 | 8000 | 0.0805 | 0.7736 | 0.6391 | 0.7699 | 49.2293 |
| 0.0805 | 3.93 | 8200 | 0.0806 | 0.7736 | 0.6383 | 0.7698 | 49.2293 |
| 0.0798 | 4.02 | 8400 | 0.0806 | 0.7758 | 0.6413 | 0.7726 | 49.2293 |
| 0.061 | 4.12 | 8600 | 0.0807 | 0.7738 | 0.6391 | 0.7705 | 49.2293 |
| 0.0636 | 4.21 | 8800 | 0.0810 | 0.7763 | 0.6424 | 0.7731 | 49.2293 |
| 0.0813 | 4.31 | 9000 | 0.0798 | 0.7765 | 0.6418 | 0.7731 | 49.2293 |
| 0.0664 | 4.41 | 9200 | 0.0804 | 0.7779 | 0.6441 | 0.7744 | 49.2293 |
| 0.077 | 4.5 | 9400 | 0.0783 | 0.7775 | 0.6432 | 0.774 | 49.2293 |
| 0.0769 | 4.6 | 9600 | 0.0788 | 0.7786 | 0.6446 | 0.7752 | 49.2293 |
| 0.0874 | 4.69 | 9800 | 0.0796 | 0.7782 | 0.6455 | 0.7749 | 49.2293 |
| 0.0682 | 4.79 | 10000 | 0.0784 | 0.7783 | 0.6452 | 0.7752 | 49.2293 |
| 0.0649 | 4.89 | 10200 | 0.0781 | 0.7788 | 0.6453 | 0.7757 | 49.2293 |
| 0.0594 | 4.98 | 10400 | 0.0791 | 0.7795 | 0.6468 | 0.7762 | 49.2293 |
| 0.1001 | 5.08 | 10600 | 0.0775 | 0.7794 | 0.6464 | 0.7762 | 49.2293 |
| 0.065 | 5.17 | 10800 | 0.0794 | 0.7794 | 0.6474 | 0.7762 | 49.2293 |
| 0.0505 | 5.27 | 11000 | 0.0787 | 0.7809 | 0.6481 | 0.7775 | 49.2293 |
| 0.0904 | 5.36 | 11200 | 0.0772 | 0.7825 | 0.6504 | 0.7793 | 49.2293 |
| 0.0782 | 5.46 | 11400 | 0.0777 | 0.7835 | 0.651 | 0.7803 | 49.2293 |
| 0.0758 | 5.56 | 11600 | 0.0774 | 0.7823 | 0.6505 | 0.7792 | 49.2293 |
| 0.0685 | 5.65 | 11800 | 0.0778 | 0.7819 | 0.6498 | 0.7787 | 49.2293 |
| 0.0664 | 5.75 | 12000 | 0.0774 | 0.7818 | 0.6493 | 0.7786 | 49.2293 |
| 0.0841 | 5.84 | 12200 | 0.0770 | 0.7848 | 0.6527 | 0.7813 | 49.2293 |
| 0.0867 | 5.94 | 12400 | 0.0765 | 0.7844 | 0.6522 | 0.7812 | 49.2293 |
| 0.0572 | 6.03 | 12600 | 0.0772 | 0.7849 | 0.6522 | 0.7816 | 49.2293 |
| 0.0554 | 6.13 | 12800 | 0.0775 | 0.7844 | 0.6526 | 0.7812 | 49.2293 |
| 0.0725 | 6.23 | 13000 | 0.0774 | 0.7851 | 0.6534 | 0.7822 | 49.2293 |
| 0.0952 | 6.32 | 13200 | 0.0778 | 0.7848 | 0.6527 | 0.7817 | 49.2293 |
| 0.0795 | 6.42 | 13400 | 0.0764 | 0.7858 | 0.6542 | 0.7826 | 49.2293 |
| 0.0682 | 6.51 | 13600 | 0.0772 | 0.7852 | 0.6527 | 0.7819 | 49.2293 |
| 0.0483 | 6.61 | 13800 | 0.0777 | 0.785 | 0.6525 | 0.7815 | 49.2293 |
| 0.0725 | 6.7 | 14000 | 0.0767 | 0.7864 | 0.6545 | 0.7831 | 49.2293 |
| 0.0675 | 6.8 | 14200 | 0.0773 | 0.786 | 0.6551 | 0.7827 | 49.2293 |
| 0.0706 | 6.9 | 14400 | 0.0758 | 0.7867 | 0.6556 | 0.7837 | 49.2293 |
| 0.0785 | 6.99 | 14600 | 0.0772 | 0.7866 | 0.6559 | 0.7835 | 49.2293 |
| 0.0796 | 7.09 | 14800 | 0.0763 | 0.7872 | 0.6564 | 0.7841 | 49.2293 |
| 0.0761 | 7.18 | 15000 | 0.0757 | 0.7879 | 0.6566 | 0.7848 | 49.2293 |
| 0.0598 | 7.28 | 15200 | 0.0758 | 0.788 | 0.6568 | 0.7849 | 49.2293 |
| 0.0587 | 7.38 | 15400 | 0.0768 | 0.7872 | 0.6556 | 0.7839 | 49.2293 |
| 0.0859 | 7.47 | 15600 | 0.0765 | 0.7875 | 0.6559 | 0.7842 | 49.2293 |
| 0.061 | 7.57 | 15800 | 0.0764 | 0.7876 | 0.6564 | 0.7845 | 49.2293 |
| 0.0718 | 7.66 | 16000 | 0.0764 | 0.7871 | 0.6558 | 0.784 | 49.2293 |
| 0.0695 | 7.76 | 16200 | 0.0763 | 0.7873 | 0.656 | 0.7842 | 49.2293 |
| 0.0678 | 7.85 | 16400 | 0.0762 | 0.7875 | 0.6565 | 0.7844 | 49.2293 |
| 0.0751 | 7.95 | 16600 | 0.0761 | 0.7877 | 0.6566 | 0.7845 | 49.2293 |
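
The Rouge1/Rouge2/Rougel and Gen Len columns are the kind of output produced by the evaluate library's rouge metric plus an average generated length. A hedged sketch of such a compute_metrics function, assuming the standard Seq2SeqTrainer conventions (labels padded with -100, the unchanged google/mt5-base tokenizer) rather than this run's actual code:

```python
# Hedged sketch of a compute_metrics yielding Rouge1/Rouge2/RougeL and Gen Len.
# Assumes padding sentinels of -100, as Seq2SeqTrainer commonly produces.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")  # assumed unchanged
rouge = evaluate.load("rouge")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    # Replace -100 sentinels before decoding, per the standard HF recipe.
    preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    # Gen Len: mean generated sequence length in tokens, excluding padding.
    result["gen_len"] = float(np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]))
    return {k: round(v, 4) for k, v in result.items()}
```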

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.1
  • Datasets 2.16.1
  • Tokenizers 0.15.0