whisper-med_15k

This model was trained from scratch on five datasets. It achieves the following results on the evaluation set:

Cer: 6.2657
Cer Mecab: 6.5093
Cer Ortho: 6.2657
Loss: 0.1532
Wer: 10.1273
Wer Ortho: 10.1273
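
The card does not say how these error rates were computed. As a rough illustration, assuming they are percentages of Levenshtein edit distance over reference length (the usual definition of WER and CER), a minimal sketch follows. The function names are hypothetical, the whitespace split stands in for whatever tokenizer was actually used, and the Mecab and Ortho variants are not reproduced here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of units."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance as a percentage of the reference length."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    # Assumption: words are separated by whitespace.
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    # Assumption: every character, including spaces, counts as one unit.
    return error_rate(list(reference), list(hypothesis))
```

Under this definition a rate can exceed 100 when the hypothesis is much longer than the reference, which would be consistent with the values above 300 reported at step 300 in the training results.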

Model description

AdaLoRA test run.

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.000125
train_batch_size: 4
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 8
optimizer: Adafactor
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.01
training_steps: 10000
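
The derived values in this list can be checked with a little arithmetic. The sketch below assumes a single device (the card does not state the device count), warmup steps rounded up from the ratio, and a linear schedule that ramps from zero to the peak rate and then decays linearly to zero at the final step. How Adafactor applied this rate internally is not stated, so the function only describes the nominal schedule.

```python
import math

train_batch_size = 4
gradient_accumulation_steps = 2
num_devices = 1  # assumption: not stated in the card
peak_lr = 0.000125
warmup_ratio = 0.01
training_steps = 10000

# Samples contributing to each optimizer update
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Number of steps spent ramping the learning rate up
warmup_steps = math.ceil(training_steps * warmup_ratio)


def lr_at(step):
    """Nominal learning rate at a given optimizer step (linear warmup, linear decay)."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, training_steps - step)
    return peak_lr * remaining / (training_steps - warmup_steps)
```

With these values the effective batch size is 8, matching total_train_batch_size above, and the warmup covers the first 100 steps.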

Training results

Training Loss	Epoch	Step	Cer	Cer Mecab	Cer Ortho	Validation Loss	Wer	Wer Ortho
4.4492	0.03	300	306.2865	306.2865	306.2865	4.4784	442.4364	442.4364
1.0895	0.06	600	39.8357	41.7989	39.8427	0.9371	51.6909	51.5818
0.8748	0.09	900	33.7580	34.7327	33.7719	0.7186	47.4	47.4182
0.73	0.12	1200	27.1651	28.9265	27.1651	0.6159	37.2364	37.2364
0.6601	0.15	1500	20.8995	21.8950	21.0039	0.5812	31.2364	31.2364
0.606	0.18	1800	26.0164	27.2626	26.0164	0.5279	35.7273	35.7273
0.5825	0.21	2100	19.9109	20.6419	19.9039	0.5185	29.3455	29.3455
0.5231	0.24	2400	18.9710	19.9248	19.0128	0.4767	28.2364	28.2727
0.5058	0.27	2700	23.7121	25.1880	23.7190	0.4539	32.8727	32.8909
0.4752	0.3	3000	17.0217	18.2818	17.0217	0.4025	23.5091	23.5091
0.4351	0.33	3300	29.5879	30.1657	29.5879	0.4177	42.2364	42.2364
0.4392	0.36	3600	16.1933	16.7502	16.2002	0.3614	24.3636	24.3455
0.4123	0.39	3900	14.2648	15.0585	14.2648	0.3699	22.1273	22.1091
0.3981	0.42	4200	13.4851	14.0769	13.5060	0.3443	20.6727	20.7091
0.3985	0.45	4500	12.8168	13.2414	12.8168	0.3330	19.4000	19.4000
0.3521	0.48	4800	12.6636	13.2832	12.6636	0.3233	19.0545	19.0545
0.3453	0.51	5100	10.7212	11.3200	10.7212	0.2926	17.0909	17.0909
0.3026	0.54	5400	16.7850	18.4280	16.7920	0.2860	17.1818	17.1818
0.3408	0.57	5700	11.2434	11.7516	11.2434	0.2526	17.5636	17.5636
0.3101	0.6	6000	10.8605	11.4105	10.8674	0.2464	17.0	17.0182
0.2953	0.63	6300	10.5333	10.9997	10.5333	0.2389	16.1091	16.1091
0.2804	0.66	6600	10.9649	11.3478	10.9719	0.2305	16.6909	16.6909
0.2611	0.69	6900	9.9206	10.3523	9.9206	0.2216	15.5091	15.5091
0.2429	0.72	7200	8.7928	9.3498	8.7928	0.2070	13.5091	13.5091
0.2467	0.75	7500	8.1036	8.5352	8.1036	0.2019	12.8182	12.8182
0.253	0.78	7800	8.4099	8.8067	8.4099	0.1979	13.1455	13.1455
0.2407	0.81	8100	7.4283	7.6859	7.4283	0.1825	11.6000	11.6000
0.2206	0.84	8400	8.9042	9.1618	8.9042	0.1779	13.4727	13.4727
0.2123	0.87	8700	7.4909	7.7694	7.4909	0.1769	11.7273	11.7273
0.1976	0.9	9000	9.1131	9.4055	9.1131	0.1665	13.9273	13.9273
0.1757	1.0259	9300	6.6903	6.9618	6.6903	0.1590	10.5818	10.5818
0.1406	1.0559	9600	7.4561	7.7068	7.4561	0.1544	11.6545	11.6545
0.1422	1.0859	9900	6.2657	6.5093	6.2657	0.1532	10.1273	10.1273
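
One way to read the table is to pick the checkpoint with the lowest WER and compare it with the one that has the lowest validation loss. The sketch below uses the last five rows copied from the table; the tuple layout and variable names are illustrative only.

```python
# (step, validation_loss, cer, wer), copied from the last five rows of the table
rows = [
    (8700, 0.1769, 7.4909, 11.7273),
    (9000, 0.1665, 9.1131, 13.9273),
    (9300, 0.1590, 6.6903, 10.5818),
    (9600, 0.1544, 7.4561, 11.6545),
    (9900, 0.1532, 6.2657, 10.1273),
]

best_by_wer = min(rows, key=lambda r: r[3])
best_by_loss = min(rows, key=lambda r: r[1])

# Steps where WER rose even though validation loss fell
regressions = [
    curr[0]
    for prev, curr in zip(rows, rows[1:])
    if curr[3] > prev[3] and curr[1] < prev[1]
]
```

On these rows both criteria select step 9900, which matches the headline results, while steps 9000 and 9600 show WER rising despite a lower validation loss.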

Framework versions

Transformers 4.42.0.dev0
PyTorch 2.3.1+cu121
Datasets 2.19.2
Tokenizers 0.19.1
