cs_mT5_0.01_100_v0.1

This model is a fine-tuned version of google/mt5-base on an unknown dataset. After the final training epoch (epoch 100), it achieves the following results on the evaluation set:

Loss: 7.2340
Bleu: 0.3036
Gen Len: 19.0

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.01
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
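
To make the `lr_scheduler_type: linear` entry concrete, here is a minimal sketch of a linear decay schedule. It rests on two assumptions: zero warmup steps (the card lists none) and 600 total optimizer steps (the last step count reported under Training results). `linear_lr` is a hypothetical helper written for illustration, not a Transformers function.

```python
def linear_lr(step, base_lr=0.01, total_steps=600, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay to zero.

    Defaults are assumptions: base_lr comes from the card, total_steps from the
    last logged step, and warmup_steps=0 because the card lists no warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 6, 300, 594, 600):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.01, is half that at step 300, and reaches zero at step 600.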

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Gen Len
6.6837	1.0	6	11.3036	0.2013	19.0
8.0402	2.0	12	7.6042	0.0	19.0
5.7261	3.0	18	6.2715	0.1442	19.0
5.3416	4.0	24	6.1049	0.0	19.0
5.8512	5.0	30	5.9636	0.7212	19.0
5.5121	6.0	36	6.1024	0.0	19.0
5.4593	7.0	42	5.8880	0.0	19.0
5.9207	8.0	48	5.7795	0.1689	19.0
4.8391	9.0	54	5.8944	0.7212	19.0
5.0744	10.0	60	5.8163	0.7212	19.0
4.3596	11.0	66	5.6633	0.7212	19.0
4.4986	12.0	72	5.7046	0.7212	19.0
3.7398	13.0	78	5.7036	0.1689	19.0
4.3772	14.0	84	5.6193	0.7212	19.0
4.3643	15.0	90	5.6598	0.7212	19.0
4.1574	16.0	96	5.7247	0.7212	19.0
4.1304	17.0	102	5.7906	0.7212	19.0
4.1503	18.0	108	5.6421	0.1689	19.0
4.769	19.0	114	5.5631	0.7212	19.0
4.7648	20.0	120	5.9913	0.0	19.0
4.076	21.0	126	5.8300	0.7212	19.0
4.5435	22.0	132	5.7988	0.7212	19.0
4.2224	23.0	138	5.7900	0.7212	19.0
3.7953	24.0	144	6.0687	0.011	19.0
4.0312	25.0	150	5.8321	0.1689	19.0
3.4781	26.0	156	5.8820	0.7984	19.0
4.0509	27.0	162	5.9177	0.7212	19.0
3.8217	28.0	168	5.7663	0.7861	19.0
4.1972	29.0	174	6.0547	0.9173	19.0
3.9588	30.0	180	5.7790	0.7212	19.0
3.8624	31.0	186	5.8604	0.1916	19.0
3.7053	32.0	192	5.9171	0.7212	19.0
4.03	33.0	198	5.8490	0.7212	19.0
3.3214	34.0	204	6.3967	0.7212	19.0
3.8343	35.0	210	5.7936	0.0	19.0
3.3124	36.0	216	5.8793	0.7663	19.0
3.7071	37.0	222	6.1326	0.0957	8.0
3.6547	38.0	228	5.9072	0.8029	19.0
3.4187	39.0	234	5.8807	0.5047	19.0
3.953	40.0	240	5.8663	0.7923	19.0
4.0113	41.0	246	6.1256	0.7212	19.0
4.2969	42.0	252	6.0113	0.1689	19.0
3.9081	43.0	258	5.9222	0.0	15.9048
3.7646	44.0	264	5.9990	0.7212	19.0
3.5407	45.0	270	6.2920	0.0945	7.0
2.8075	46.0	276	6.1092	0.4815	19.0
3.9057	47.0	282	6.1175	1.0006	19.0
4.1845	48.0	288	6.2553	0.8147	19.0
3.4686	49.0	294	6.1979	0.7796	19.0
3.029	50.0	300	6.1064	0.7771	19.0
3.62	51.0	306	5.9443	0.7212	19.0
3.719	52.0	312	6.3162	0.7212	19.0
3.4713	53.0	318	5.9465	0.7212	19.0
3.675	54.0	324	6.1606	0.3501	19.0
3.518	55.0	330	6.1223	0.1689	19.0
3.3729	56.0	336	6.0394	1.3618	19.0
2.7827	57.0	342	6.3169	0.7212	19.0
3.7061	58.0	348	6.4504	1.694	19.0
3.4929	59.0	354	6.3042	0.7475	19.0
2.1424	60.0	360	6.3536	0.8628	19.0
2.787	61.0	366	6.3339	0.0	19.0
3.6486	62.0	372	6.4380	0.1023	19.0
3.8631	63.0	378	6.3261	0.7212	19.0
3.4476	64.0	384	6.2478	1.2825	19.0
3.256	65.0	390	6.4766	0.5017	19.0
3.6114	66.0	396	6.4519	0.7212	19.0
3.8405	67.0	402	6.3538	0.4744	19.0
3.3164	68.0	408	6.0134	0.3725	19.0
3.4129	69.0	414	6.5988	0.2135	19.0
3.693	70.0	420	6.4498	0.1689	19.0
2.9521	71.0	426	6.2916	1.3636	19.0
3.6362	72.0	432	6.3040	0.3063	19.0
3.6713	73.0	438	6.3731	0.8106	19.0
3.2562	74.0	444	6.3822	0.9407	19.0
2.4132	75.0	450	6.5435	0.9407	19.0
3.4504	76.0	456	6.7828	0.8829	19.0
3.282	77.0	462	6.6479	1.4788	19.0
3.4199	78.0	468	6.6536	0.0761	6.0
3.4234	79.0	474	6.5193	0.4172	19.0
3.0937	80.0	480	6.7476	0.5603	19.0
2.9563	81.0	486	6.6885	1.5178	19.0
3.1052	82.0	492	6.6320	1.3064	19.0
2.7674	83.0	498	6.6363	0.7892	19.0
2.6265	84.0	504	6.6629	1.5199	19.0
2.3116	85.0	510	6.6467	0.0	19.0
3.0439	86.0	516	6.7820	0.9326	19.0
2.7406	87.0	522	6.9067	1.2025	19.0
2.4509	88.0	528	6.9738	1.0657	19.0
2.8186	89.0	534	7.1507	0.4574	19.0
2.6713	90.0	540	7.0799	0.4527	19.0
2.6231	91.0	546	7.0459	0.646	19.0
3.2357	92.0	552	7.0238	0.525	19.0
2.8834	93.0	558	7.0185	0.5206	19.0
1.7973	94.0	564	7.0711	0.8153	19.0
1.9995	95.0	570	7.1263	0.3015	19.0
2.2875	96.0	576	7.1877	0.3025	19.0
1.8547	97.0	582	7.2062	0.3025	19.0
1.5572	98.0	588	7.2270	0.5076	19.0
1.7653	99.0	594	7.2347	0.3025	19.0
2.6411	100.0	600	7.2340	0.3036	19.0
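
The log shows training loss falling while validation loss rises after the early epochs, so the final checkpoint is not the one with the lowest validation loss. A minimal sketch of picking a checkpoint from such a log follows. The rows are a hand-copied subset of the table above, and `LogRow` and `best_row` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    epoch: int
    step: int
    train_loss: float
    val_loss: float
    bleu: float


# A subset of rows copied from the training results table (not the full log).
ROWS = [
    LogRow(1, 6, 6.6837, 11.3036, 0.2013),
    LogRow(11, 66, 4.3596, 5.6633, 0.7212),
    LogRow(19, 114, 4.769, 5.5631, 0.7212),
    LogRow(58, 348, 3.7061, 6.4504, 1.694),
    LogRow(84, 504, 2.6265, 6.6629, 1.5199),
    LogRow(100, 600, 2.6411, 7.2340, 0.3036),
]


def best_row(rows, key, lower_is_better):
    """Return the row with the best value of `key` (a LogRow field name)."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda r: getattr(r, key))


if __name__ == "__main__":
    by_loss = best_row(ROWS, "val_loss", lower_is_better=True)
    by_bleu = best_row(ROWS, "bleu", lower_is_better=False)
    print(f"lowest validation loss: epoch {by_loss.epoch} ({by_loss.val_loss})")
    print(f"highest BLEU: epoch {by_bleu.epoch} ({by_bleu.bleu})")
```

On these rows the selection returns epoch 19 for validation loss and epoch 58 for BLEU, which are also the extremes of the full table.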

Framework versions

Transformers 4.35.2
PyTorch 1.13.1+cu117
Datasets 2.16.1
Tokenizers 0.15.0
