cs_m2m_0.0001_100_v0.2

This model is a fine-tuned version of facebook/m2m100_1.2B on an unknown dataset. On the evaluation set, it achieves the following results at the final epoch (epoch 100, step 600):

Loss: 8.4496
Bleu: 0.0928
Gen Len: 62.0

Model description

More information needed

Intended uses & limitations

More information needed
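
Since the card does not document usage, the following is only a minimal sketch of how a fine-tuned M2M100 checkpoint is typically loaded with the Transformers library. The repository id kmok1/cs_m2m_0.0001_100_v0.2 is taken from the model page. The language pair is not stated anywhere in this card, so `src_lang` and `tgt_lang` are left as required arguments rather than guessed. Loading the tokenizer from the base model is an assumption (that fine-tuning did not change the vocabulary).

```python
BASE_MODEL = "facebook/m2m100_1.2B"          # base model named in this card
FINE_TUNED = "kmok1/cs_m2m_0.0001_100_v0.2"  # repository id shown on the model page


def translate(text: str, src_lang: str, tgt_lang: str,
              model_id: str = FINE_TUNED) -> str:
    """Translate one sentence with an M2M100-style checkpoint.

    src_lang / tgt_lang are M2M100 language codes; the card does not say
    which pair this model was fine-tuned on, so the caller must supply them.
    """
    # Imported lazily so that defining this helper does not require
    # transformers, torch, and sentencepiece to be installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    # Assumption: the fine-tuned model still uses the base model's vocabulary.
    tokenizer = M2M100Tokenizer.from_pretrained(BASE_MODEL)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    tokenizer.src_lang = src_lang
    encoded = tokenizer(text, return_tensors="pt")
    # M2M100 selects the output language via the first generated token.
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id(tgt_lang),
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate(...)` downloads the 1.2B-parameter weights on first use. Passing `model_id=BASE_MODEL` runs the same code against the original checkpoint, which gives a baseline for comparison.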

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
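
To make the listed settings concrete, here is a small sketch of what a linear schedule implies for the learning rate over the run, and what the step counts in the training results table suggest about the training set size. It assumes no warmup (none is listed), a single device, no gradient accumulation, and that the last partial batch is kept. None of these are confirmed by the card, and the function names are illustrative.

```python
BASE_LR = 1e-4      # learning_rate from the list above
TOTAL_STEPS = 600   # final step in the training results table (100 epochs x 6 steps)
WARMUP_STEPS = 0    # assumption: the card lists no warmup


def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < WARMUP_STEPS:
        # Linear ramp-up during warmup (unused when WARMUP_STEPS is 0).
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> tuple[int, int]:
    """Range of dataset sizes n for which ceil(n / batch_size) == steps_per_epoch."""
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


if __name__ == "__main__":
    for step in (0, 150, 300, 450, 600):
        print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
    print("training examples (under the stated assumptions):",
          train_set_size_bounds(steps_per_epoch=6, batch_size=16))
```

Under these assumptions, 6 steps per epoch at a batch size of 16 corresponds to between 81 and 96 training examples.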

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
3.1218	1.0	6	8.4336	0.0372	115.8571
1.7719	2.0	12	8.4226	0.0454	83.1429
2.2391	3.0	18	8.3857	0.0595	67.8571
3.3595	4.0	24	8.3587	0.117	59.1429
3.2809	5.0	30	8.3475	0.0806	70.4286
2.5704	6.0	36	8.3259	0.1683	69.8095
3.8725	7.0	42	8.3405	0.0339	109.9048
2.9887	8.0	48	8.3686	0.0447	91.1905
2.9363	9.0	54	8.3856	0.0547	80.5238
2.3718	10.0	60	8.3621	0.0594	66.619
2.977	11.0	66	8.3563	0.0356	107.1905
2.4379	12.0	72	8.3682	0.0266	150.619
1.9983	13.0	78	8.3733	0.0655	96.619
2.5183	14.0	84	8.3767	0.0417	92.1905
4.7446	15.0	90	8.3677	0.0457	81.1429
2.8195	16.0	96	8.3779	0.0467	81.381
3.1357	17.0	102	8.3751	0.0531	123.4762
3.1353	18.0	108	8.3707	0.1118	83.4286
2.2632	19.0	114	8.3813	0.1173	80.0476
1.7457	20.0	120	8.3786	0.1014	100.6667
1.991	21.0	126	8.3845	0.0937	60.381
3.1272	22.0	132	8.3823	0.0648	75.0
2.5017	23.0	138	8.3882	0.1951	41.7619
3.1988	24.0	144	8.3901	0.2921	17.381
2.0247	25.0	150	8.3950	0.0929	50.8095
2.8855	26.0	156	8.4009	0.1452	37.8095
1.8024	27.0	162	8.3844	0.0439	95.2381
4.727	28.0	168	8.3750	0.0352	106.8571
2.3243	29.0	174	8.3736	0.0344	123.619
2.4946	30.0	180	8.3908	0.1952	112.4286
3.2337	31.0	186	8.3960	0.2593	58.9048
3.1065	32.0	192	8.3937	0.3752	48.0952
3.3689	33.0	198	8.3855	0.3984	48.8571
2.51	34.0	204	8.3928	0.2597	53.7143
1.5195	35.0	210	8.3917	0.1361	74.7143
2.1133	36.0	216	8.3964	0.0702	78.4286
2.6349	37.0	222	8.3839	0.0477	103.4286
2.2733	38.0	228	8.3770	0.0746	77.381
3.0805	39.0	234	8.3773	0.1324	75.3333
3.1701	40.0	240	8.3853	0.0776	75.8571
2.5676	41.0	246	8.3988	0.1274	76.7619
5.1543	42.0	252	8.4117	0.0381	110.2857
2.4138	43.0	258	8.4101	0.0472	92.619
2.6	44.0	264	8.3991	0.0422	102.0
5.2608	45.0	270	8.3912	0.0602	84.4762
2.6492	46.0	276	8.3918	0.0667	80.6667
2.5329	47.0	282	8.3901	0.1159	42.2857
2.894	48.0	288	8.3936	0.1352	46.381
2.6136	49.0	294	8.3959	0.1059	45.4286
3.2249	50.0	300	8.3954	0.246	46.1429
2.8511	51.0	306	8.3923	0.1572	52.8571
2.7592	52.0	312	8.3875	0.1112	62.1429
2.37	53.0	318	8.3839	0.0926	67.3333
3.1555	54.0	324	8.3989	0.0855	71.2381
2.723	55.0	330	8.4030	0.0756	78.4286
2.498	56.0	336	8.4131	0.3874	74.9048
2.6088	57.0	342	8.4278	0.118	83.7143
2.1392	58.0	348	8.4388	0.3423	80.381
2.8988	59.0	354	8.4506	0.0844	73.9048
2.2013	60.0	360	8.4596	0.0892	70.1429
2.2335	61.0	366	8.4694	0.1165	59.4762
3.306	62.0	372	8.4838	0.1685	49.4762
3.0362	63.0	378	8.4894	0.1189	56.1905
3.0111	64.0	384	8.4909	0.0926	66.5714
2.802	65.0	390	8.4956	0.0906	66.0
2.4222	66.0	396	8.4917	0.0742	72.381
2.8748	67.0	402	8.4870	0.0704	76.0952
2.7946	68.0	408	8.4823	0.0572	84.2381
2.7195	69.0	414	8.4714	0.0573	84.2381
2.487	70.0	420	8.4640	0.0578	83.3333
1.5811	71.0	426	8.4632	0.0516	91.381
2.7705	72.0	432	8.4618	0.0597	80.619
2.3703	73.0	438	8.4622	0.0598	80.619
2.4037	74.0	444	8.4618	0.0906	66.2381
2.3173	75.0	450	8.4579	0.0926	63.381
1.8697	76.0	456	8.4564	0.0942	62.5238
1.8887	77.0	462	8.4554	0.0979	62.6667
3.84	78.0	468	8.4590	0.077	70.1429
2.388	79.0	474	8.4654	0.0735	71.2381
2.591	80.0	480	8.4685	0.075	70.9048
2.7345	81.0	486	8.4665	0.0791	52.5238
2.7887	82.0	492	8.4669	0.0759	70.2381
2.5452	83.0	498	8.4675	0.0764	70.8095
2.7554	84.0	504	8.4693	0.096	53.9524
4.2388	85.0	510	8.4656	0.0939	62.8571
2.361	86.0	516	8.4612	0.0923	63.9524
1.912	87.0	522	8.4569	0.0916	62.5714
2.2787	88.0	528	8.4524	0.0942	63.2857
1.9425	89.0	534	8.4530	0.0942	62.0952
2.7257	90.0	540	8.4545	0.0967	61.381
1.9149	91.0	546	8.4552	0.0959	61.8095
2.507	92.0	552	8.4546	0.0936	63.1429
2.8124	93.0	558	8.4547	0.0947	63.2857
2.3852	94.0	564	8.4527	0.0955	62.8571
1.7975	95.0	570	8.4528	0.0947	63.2857
4.9651	96.0	576	8.4517	0.0922	62.4286
2.1141	97.0	582	8.4510	0.0928	62.0
2.6156	98.0	588	8.4502	0.0928	62.0
1.987	99.0	594	8.4498	0.0928	62.0
2.5299	100.0	600	8.4496	0.0928	62.0
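
The final-epoch metrics are not the best values in the table: the highest BLEU is 0.3984 at epoch 33, and the lowest validation loss is 8.3259 at epoch 6. The sketch below shows one way to find such rows programmatically. It parses whitespace-separated rows in the same column order as the table. `SAMPLE` holds only five rows copied from above for brevity, and the helper names are illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    bleu: float
    gen_len: float


# Five rows copied from the table above (epochs 1, 6, 33, 56, 100).
SAMPLE = """
3.1218 1.0 6 8.4336 0.0372 115.8571
2.5704 6.0 36 8.3259 0.1683 69.8095
3.3689 33.0 198 8.3855 0.3984 48.8571
2.498 56.0 336 8.4131 0.3874 74.9048
2.5299 100.0 600 8.4496 0.0928 62.0
"""


def parse_rows(text: str) -> list[EvalRow]:
    """Parse rows of six numeric fields; skip headers and malformed lines."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # e.g. the header line, whose labels contain spaces
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue
        rows.append(EvalRow(values[0], values[1], int(values[2]),
                            values[3], values[4], values[5]))
    return rows


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    best_bleu = max(rows, key=lambda r: r.bleu)
    best_loss = min(rows, key=lambda r: r.val_loss)
    print("highest BLEU:", best_bleu.bleu, "at epoch", best_bleu.epoch)
    print("lowest validation loss:", best_loss.val_loss, "at epoch", best_loss.epoch)
```

Pasting the full table into `SAMPLE` gives the same two epochs.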

Framework versions

Transformers 4.35.2
Pytorch 1.13.1+cu117
Datasets 2.16.1
Tokenizers 0.15.0

Model repository: kmok1/cs_m2m_0.0001_100_v0.2