modernisa-v2-byt5-base-lr0.0001

This model is a fine-tuned version of google/byt5-base; the training dataset is not specified. On the evaluation set, the final logged checkpoint (step 57000, epoch 4.96) achieves the following results:

Loss: 0.4744
Bleu: 30.8745
Wer: 47.8194
Cer: 34.4895
Gen Len: 18.5499
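
The card does not say which implementation produced the Wer and Cer figures. For orientation only, the sketch below shows how word error rate and character error rate are conventionally defined: the edit distance between reference and hypothesis, divided by the reference length, expressed here as a percentage. The function names and the example strings are illustrative assumptions, not taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumes whitespace tokenisation)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one misspelled word out of four.
example_wer = wer("the quick brown fox", "the quick brwn fox")
example_cer = cer("the quick brown fox", "the quick brwn fox")
```

Under these definitions lower is better for both metrics, whereas higher is better for Bleu. Library implementations may normalise text differently, so values computed this way need not match the reported ones exactly.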

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 5.0
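
As a rough illustration of what lr_scheduler_type: linear implies, the sketch below computes a learning rate that decays linearly from the configured 0.0001 to zero. Two values are assumptions rather than facts from this card: warmup_steps defaults to 0 because no warmup is listed, and the total of 57,500 steps is an estimate inferred from the training log (step 23000 falls at epoch 2.0, so about 11,500 steps per epoch over 5 epochs).

```python
def linear_lr(step, total_steps, base_lr=1e-4, warmup_steps=0):
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Assumed total: ~11,500 steps per epoch * 5 epochs (estimated from the log).
ESTIMATED_TOTAL_STEPS = 57_500

lr_start = linear_lr(0, ESTIMATED_TOTAL_STEPS)
lr_mid = linear_lr(ESTIMATED_TOTAL_STEPS // 2, ESTIMATED_TOTAL_STEPS)
lr_end = linear_lr(ESTIMATED_TOTAL_STEPS, ESTIMATED_TOTAL_STEPS)
```

With these assumptions the rate is halved around the midpoint of training and is close to zero for the last logged evaluations.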

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Wer	Cer	Gen Len
0.2696	0.09	1000	0.3027	27.8571	49.5134	34.4149	18.5
0.2518	0.17	2000	0.2857	29.2213	49.1981	34.6336	18.5371
0.2343	0.26	3000	0.2730	29.5067	49.117	34.9795	18.5537
0.2292	0.35	4000	0.2690	29.884	48.7025	34.8015	18.5516
0.2243	0.44	5000	0.2647	29.9577	48.8466	34.7218	18.5477
0.2112	0.52	6000	0.2636	30.3115	48.3871	34.4895	18.5477
0.2118	0.61	7000	0.2555	30.6364	48.3961	34.7455	18.5413
0.205	0.7	8000	0.2508	31.0881	47.468	34.0759	18.5269
0.2049	0.78	9000	0.2471	31.1481	47.5942	34.4133	18.5503
0.2005	0.87	10000	0.2468	30.9375	47.6392	34.281	18.5405
0.1999	0.96	11000	0.2431	30.9692	47.7023	34.4183	18.5405
0.161	1.04	12000	0.2491	31.2337	47.3238	34.1878	18.5298
0.1601	1.13	13000	0.2496	31.4422	47.3689	34.1657	18.5371
0.1606	1.22	14000	0.2459	31.4582	47.3329	34.2386	18.5405
0.1594	1.31	15000	0.2466	31.386	47.1166	34.2912	18.5375
0.1617	1.39	16000	0.2412	31.6546	46.8373	34.0149	18.5294
0.1582	1.48	17000	0.2461	31.2924	47.4139	34.2573	18.5503
0.1572	1.57	18000	0.2425	31.1484	47.45	34.3675	18.5499
0.1565	1.65	19000	0.2424	31.6967	46.9724	34.1047	18.5388
0.1585	1.74	20000	0.2382	31.9026	47.0175	34.281	18.558
0.1522	1.83	21000	0.2365	32.1619	46.5219	33.9369	18.5311
0.156	1.92	22000	0.2381	31.7762	46.7922	33.9572	18.5401
0.1538	2.0	23000	0.2402	31.8785	46.8012	34.2319	18.5516
0.1083	2.09	24000	0.2654	31.9905	46.603	34.0098	18.5384
0.1086	2.18	25000	0.2618	31.6257	46.9995	34.2607	18.5409
0.1092	2.26	26000	0.2658	31.4886	47.1436	34.337	18.5422
0.1086	2.35	27000	0.2666	31.8448	46.6751	34.1217	18.5375
0.1098	2.44	28000	0.2659	31.709	46.8913	34.1946	18.5452
0.1117	2.52	29000	0.2649	31.8114	46.8913	34.1708	18.5431
0.1094	2.61	30000	0.2656	31.6955	46.8643	34.1606	18.5375
0.1077	2.7	31000	0.2637	31.5495	46.8823	34.0064	18.5448
0.1088	2.79	32000	0.2669	32.0837	46.612	33.9504	18.5413
0.1087	2.87	33000	0.2646	31.5549	47.0806	34.2149	18.5286
0.1077	2.96	34000	0.2630	32.1129	46.4318	33.9403	18.5452
0.0652	3.05	35000	0.3360	31.3861	47.1977	34.1149	18.5396
0.0662	3.13	36000	0.3401	31.2372	47.3869	34.203	18.552
0.0666	3.22	37000	0.3389	31.3462	47.2968	34.1759	18.5469
0.0648	3.31	38000	0.3339	30.835	47.6753	34.381	18.552
0.0654	3.4	39000	0.3395	31.0958	47.7203	34.4692	18.5524
0.0663	3.48	40000	0.3318	31.126	47.5942	34.4539	18.5499
0.0648	3.57	41000	0.3397	31.0295	47.5852	34.3539	18.5477
0.0635	3.66	42000	0.3414	31.1287	47.5491	34.4285	18.5494
0.0656	3.74	43000	0.3394	30.9225	47.6392	34.4285	18.5563
0.0625	3.83	44000	0.3420	31.2435	47.2968	34.1674	18.5439
0.0636	3.92	45000	0.3448	31.0688	47.6843	34.3743	18.5439
0.0586	4.0	46000	0.3675	31.2353	47.441	34.2963	18.549
0.0298	4.09	47000	0.4566	30.698	47.8555	34.4319	18.5512
0.0301	4.18	48000	0.4724	30.7773	47.8374	34.3861	18.5507
0.0311	4.27	49000	0.4640	31.0878	47.6212	34.3861	18.5503
0.03	4.35	50000	0.4654	30.8319	47.8915	34.459	18.5529
0.0302	4.44	51000	0.4665	30.9236	47.9276	34.4997	18.552
0.029	4.53	52000	0.4757	30.8307	47.9456	34.4997	18.5482
0.0301	4.61	53000	0.4672	30.7983	47.9456	34.5218	18.5473
0.0294	4.7	54000	0.4715	30.8924	47.7564	34.4353	18.5529
0.0288	4.79	55000	0.4752	30.7372	47.7924	34.4675	18.5524
0.0289	4.88	56000	0.4744	30.8554	47.8555	34.459	18.5516
0.0288	4.96	57000	0.4744	30.8745	47.8194	34.4895	18.5499
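
In the table above, validation loss reaches its minimum (0.2365) at step 21000 and then climbs to 0.4744 by step 57000 while training loss keeps falling, which is consistent with overfitting in the later epochs. The headline metrics at the top of the card correspond to the last row, not to the best one. The sketch below shows one way to locate the best checkpoint per metric from such a log; only five rows are copied in to keep it short, and the helper name best_row is an illustrative assumption.

```python
import csv
import io

# A few rows copied from the training log (tab-separated).
# Paste the full table here to analyse every checkpoint.
LOG = """Training Loss\tEpoch\tStep\tValidation Loss\tBleu\tWer\tCer\tGen Len
0.2696\t0.09\t1000\t0.3027\t27.8571\t49.5134\t34.4149\t18.5
0.1522\t1.83\t21000\t0.2365\t32.1619\t46.5219\t33.9369\t18.5311
0.1077\t2.96\t34000\t0.2630\t32.1129\t46.4318\t33.9403\t18.5452
0.0586\t4.0\t46000\t0.3675\t31.2353\t47.441\t34.2963\t18.549
0.0288\t4.96\t57000\t0.4744\t30.8745\t47.8194\t34.4895\t18.5499
"""

rows = list(csv.DictReader(io.StringIO(LOG), delimiter="\t"))


def best_row(rows, column, higher_is_better):
    """Return the log row with the best value in the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: float(row[column]))


best_by_loss = best_row(rows, "Validation Loss", higher_is_better=False)
best_by_bleu = best_row(rows, "Bleu", higher_is_better=True)
best_by_wer = best_row(rows, "Wer", higher_is_better=False)

for name, row in [("loss", best_by_loss), ("bleu", best_by_bleu), ("wer", best_by_wer)]:
    print(f"best by {name}: step {row['Step']} (epoch {row['Epoch']})")
```

Whether an intermediate checkpoint was saved and can be recovered is not stated in this card.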

Framework versions

Transformers 4.30.0.dev0
PyTorch 1.13.0+cu117
Datasets 2.12.0
Tokenizers 0.11.0
