byt5-base-indocollex-informal-to-formal-wordformation

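This model is a fine-tuned version of google/byt5-base. The card does not specify the training dataset; the model name suggests IndoCollex informal-to-formal word-formation data, but this is not confirmed here. The results below are on the evaluation set and match the last logged checkpoint (step 2550):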

Loss: 0.1413
Cer: 0.1978
Wer: 0.4524
Word Acc: 0.5476
Gen Len: 7.6457
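
Word Acc is the complement of Wer: 0.4524 + 0.5476 = 1, and the same relation holds in every row of the training log. The sketch below shows one common way to compute such metrics with a plain Levenshtein distance. The card does not say which metric implementation was used, so the function names and the sample word pairs here are illustrative assumptions, not data from the evaluation set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h from the hypothesis
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit):
    """Corpus-level error rate: total edits / total reference units."""
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(edit_distance(split(r), split(h)) for r, h in zip(refs, hyps))
    total = sum(len(split(r)) for r in refs)
    return edits / total


# Hypothetical reference/prediction pairs, for illustration only.
refs = ["tidak", "sudah", "begitu"]
hyps = ["tidak", "sudah", "begini"]

cer = error_rate(refs, hyps, "char")   # 2 character edits over 16 characters
wer = error_rate(refs, hyps, "word")   # 1 wrong word out of 3
word_acc = 1.0 - wer

# A prediction with more words than the reference pushes WER above 1,
# which makes 1 - WER negative.
runaway_wer = error_rate(["ya"], ["ya ya ya"], "word")
```

Under this definition an error rate is not bounded by 1, which is consistent with the negative Word Acc values in the first rows of the training log, where Gen Len sits at 19.0.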

Model description

More information needed

Intended uses & limitations

More information needed
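
Until this section is filled in, the following is a minimal inference sketch using the standard Transformers seq2seq API. It assumes the checkpoint is published under the repository ID shown on the model page and that the model takes a raw informal word with no task prefix; the card confirms neither the input format nor any generation settings, so `formalize` and `max_new_tokens=32` are illustrative choices.

```python
MODEL_ID = "syafiqfaray/byt5-base-indocollex-informal-to-formal-wordformation"


def formalize(words, max_new_tokens=32):
    """Return the model's output string for each input word.

    Assumption: inputs are plain words without a task prefix.
    """
    # Imported lazily so this module loads even without torch/transformers.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    model.eval()

    # ByT5 works on UTF-8 bytes, so padding aligns words of different length.
    batch = tokenizer(list(words), return_tensors="pt", padding=True)
    with torch.no_grad():
        output_ids = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `formalize(["gak"])` with a hypothetical informal word downloads the checkpoint on first use. No expected output is shown because none has been verified against the model.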

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
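
These settings can be combined with the epoch and step columns of the training log to estimate the run's scale. The sketch below is a back-of-the-envelope calculation, not a record of the actual run: it assumes a single device, no gradient accumulation, and no warmup, none of which the card states.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 100

# (epoch, step) pairs copied from the training log in this card.
logged = [(0.54, 50), (10.75, 1000), (27.42, 2550)]

# Optimizer steps per epoch, averaged over the sampled rows.
steps_per_epoch = round(sum(step / epoch for epoch, step in logged) / len(logged))

# Range of training-set sizes consistent with that many batches of 16,
# assuming the last (possibly partial) batch is kept.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE

# Steps the linear schedule would span if all 100 epochs were run.
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step):
    """Linear decay from LEARNING_RATE to 0 over total_steps (no warmup assumed)."""
    return LEARNING_RATE * max(0.0, (total_steps - step) / total_steps)


lr_at_last_logged_step = linear_lr(2550)
```

Under these assumptions the run takes about 93 steps per epoch, which corresponds to roughly 1,473 to 1,488 training examples. The log in this card ends at step 2550 (epoch 27.42) rather than epoch 100, so the learning rate at that point would still be about 73% of its initial value; the card does not say why the log stops there.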

Training results

Training Loss	Epoch	Step	Validation Loss	Cer	Wer	Word Acc	Gen Len
No log	0.54	50	16.1894	2.1868	2.2905	-1.2905	19.0
No log	1.08	100	13.7479	2.1248	1.9333	-0.9333	19.0
No log	1.61	150	11.6231	2.1095	1.4238	-0.4238	18.7486
No log	2.15	200	8.9106	1.056	0.9857	0.0143	10.6171
No log	2.69	250	4.6844	0.8523	0.9762	0.0238	9.36
No log	3.23	300	4.1175	0.5756	0.9714	0.0286	7.4114
No log	3.76	350	3.3688	0.5951	0.9714	0.0286	7.8
No log	4.3	400	2.2287	0.6112	0.9857	0.0143	6.7543
No log	4.84	450	1.5164	0.6095	0.9571	0.0429	7.8857
8.4834	5.38	500	1.0363	0.5976	0.9476	0.0524	7.8229
8.4834	5.91	550	0.6893	0.5976	0.9476	0.0524	7.7943
8.4834	6.45	600	0.5438	0.5866	0.9381	0.0619	7.9943
8.4834	6.99	650	0.4720	0.5806	0.9333	0.0667	8.0057
8.4834	7.53	700	0.4305	0.5764	0.9333	0.0667	8.0057
8.4834	8.06	750	0.3931	0.5654	0.9333	0.0667	8.2971
8.4834	8.6	800	0.3450	0.4576	0.9952	0.0048	7.7086
8.4834	9.14	850	0.2773	0.3226	0.8238	0.1762	7.8743
8.4834	9.68	900	0.2184	0.2368	0.7286	0.2714	7.2171
8.4834	10.22	950	0.1992	0.2165	0.6333	0.3667	7.4343
0.7362	10.75	1000	0.1887	0.2097	0.5714	0.4286	7.5829
0.7362	11.29	1050	0.1815	0.2216	0.5905	0.4095	7.6171
0.7362	11.83	1100	0.1688	0.2046	0.5762	0.4238	7.4629
0.7362	12.37	1150	0.1679	0.2012	0.5286	0.4714	7.7143
0.7362	12.9	1200	0.1579	0.1952	0.5333	0.4667	7.5257
0.7362	13.44	1250	0.1531	0.1969	0.5095	0.4905	7.5714
0.7362	13.98	1300	0.1484	0.1935	0.4952	0.5048	7.5543
0.7362	14.52	1350	0.1481	0.1969	0.4952	0.5048	7.5886
0.7362	15.05	1400	0.1417	0.191	0.481	0.519	7.5829
0.7362	15.59	1450	0.1429	0.1876	0.4762	0.5238	7.5829
0.195	16.13	1500	0.1407	0.1834	0.481	0.519	7.48
0.195	16.67	1550	0.1409	0.1995	0.481	0.519	7.7086
0.195	17.2	1600	0.1432	0.1817	0.4762	0.5238	7.4857
0.195	17.74	1650	0.1439	0.1885	0.4762	0.5238	7.5429
0.195	18.28	1700	0.1385	0.1766	0.4476	0.5524	7.5143
0.195	18.82	1750	0.1357	0.1834	0.4762	0.5238	7.4971
0.195	19.35	1800	0.1349	0.1935	0.4714	0.5286	7.4686
0.195	19.89	1850	0.1355	0.1842	0.4286	0.5714	7.5371
0.195	20.43	1900	0.1343	0.1902	0.4619	0.5381	7.5714
0.195	20.97	1950	0.1348	0.1808	0.4619	0.5381	7.4229
0.1287	21.51	2000	0.1341	0.1817	0.4524	0.5476	7.4571
0.1287	22.04	2050	0.1324	0.1868	0.4476	0.5524	7.5371
0.1287	22.58	2100	0.1329	0.1859	0.4571	0.5429	7.4571
0.1287	23.12	2150	0.1367	0.1868	0.4476	0.5524	7.56
0.1287	23.66	2200	0.1389	0.1919	0.4667	0.5333	7.48
0.1287	24.19	2250	0.1385	0.18	0.4333	0.5667	7.5029
0.1287	24.73	2300	0.1429	0.1944	0.4905	0.5095	7.4171
0.1287	25.27	2350	0.1414	0.1961	0.4667	0.5333	7.6057
0.1287	25.81	2400	0.1419	0.1876	0.4333	0.5667	7.5371
0.1287	26.34	2450	0.1433	0.1927	0.4667	0.5333	7.5886
0.0977	26.88	2500	0.1433	0.1927	0.4571	0.5429	7.5486
0.0977	27.42	2550	0.1413	0.1978	0.4524	0.5476	7.6457
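
The headline results at the top of the card correspond to the last row (step 2550), which is not the best row on any metric: the lowest validation loss is 0.1324 at step 2050, the lowest Wer is 0.4286 at step 1850, and the lowest Cer is 0.1766 at step 1700. The sketch below shows how such a selection could be made from the log. The `Row` type and `best_by` helper are illustrative names, and only a handful of rows are copied in to keep the example short.

```python
from typing import NamedTuple


class Row(NamedTuple):
    step: int
    val_loss: float
    cer: float
    wer: float


# A subset of rows copied from the training results table above.
rows = [
    Row(1700, 0.1385, 0.1766, 0.4476),
    Row(1850, 0.1355, 0.1842, 0.4286),
    Row(2050, 0.1324, 0.1868, 0.4476),
    Row(2250, 0.1385, 0.1800, 0.4333),
    Row(2550, 0.1413, 0.1978, 0.4524),
]


def best_by(metric):
    """Return the row with the lowest value of the given metric."""
    return min(rows, key=lambda row: getattr(row, metric))


best_loss = best_by("val_loss")
best_wer = best_by("wer")
best_cer = best_by("cer")
```

Which checkpoint the published weights correspond to is not stated in the card, so this only illustrates how the log could be used to choose one.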

Framework versions

Transformers 4.33.0
PyTorch 2.0.0
Datasets 2.1.0
Tokenizers 0.13.3

syafiqfaray/byt5-base-indocollex-informal-to-formal-wordformation
