whisper-large-v3-chichewa-variant-b-normalized-transcript

This model is a fine-tuned version of openai/whisper-large-v3 on an unspecified dataset (the card records it as None). Its results on the evaluation set are reported per checkpoint in the table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 7.5e-06
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
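lr_scheduler_warmup_steps: 0.05 (a fractional value, so presumably a warmup ratio of 5% of the training steps rather than a step count)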
training_steps: 3000
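
The relationships between these hyperparameters can be checked with a short sketch. It uses only the values listed above and makes two assumptions that the card does not state: that training ran on a single device (consistent with 8 x 4 = 32), and that the fractional warmup value is a ratio of the total training steps. The helper names are illustrative, and the schedule is the standard linear-warmup, half-cosine-decay shape rather than a copy of the training code.

```python
import math

# Values copied from the hyperparameter list above.
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
LEARNING_RATE = 7.5e-06
TRAINING_STEPS = 3000
WARMUP = 0.05  # assumption: a value below 1 is a fraction of TRAINING_STEPS


def effective_batch_size(per_device: int, accum: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum * num_devices


def warmup_steps(warmup: float, total_steps: int) -> int:
    """Interpret a fractional warmup value as a ratio, otherwise as a step count."""
    if warmup < 1:
        return round(warmup * total_steps)
    return int(warmup)


def cosine_lr(step: int, peak_lr: float, total_steps: int, n_warmup: int) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    if step < n_warmup:
        return peak_lr * step / max(1, n_warmup)
    progress = (step - n_warmup) / max(1, total_steps - n_warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    n_warmup = warmup_steps(WARMUP, TRAINING_STEPS)
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("warmup steps:", n_warmup)
    for step in (0, n_warmup, TRAINING_STEPS // 2, TRAINING_STEPS):
        print(step, cosine_lr(step, LEARNING_RATE, TRAINING_STEPS, n_warmup))
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 32, and the learning rate peaks at 7.5e-06 after 150 steps before decaying toward zero at step 3000.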

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
6.1867	0.9259	100	1.4766	83.4464	37.9517
4.3043	1.8519	200	1.1411	75.8189	36.3040
3.2145	2.7778	300	1.0356	62.7866	29.1714
2.3829	3.7037	400	1.0172	61.5115	28.0164
1.7284	4.6296	500	1.0397	59.7801	27.8698
1.2327	5.5556	600	1.0958	60.7745	28.9654
0.7728	6.4815	700	1.1447	58.3645	26.6423
0.5727	7.4074	800	1.1959	57.0192	26.0145
0.3477	8.3333	900	1.2624	58.1072	26.7461
0.2305	9.2593	1000	1.3064	57.1479	25.9058
0.1460	10.1852	1100	1.3492	57.2649	25.7987
0.1251	11.1111	1200	1.4039	57.3701	25.8580
0.1346	12.0370	1300	1.4000	56.7735	25.5235
0.0917	12.9630	1400	1.4034	56.6097	25.8201
0.0915	13.8889	1500	1.4139	56.4109	25.3736
0.0489	14.8148	1600	1.4882	56.4810	25.7773
0.0301	15.7407	1700	1.5243	56.2588	25.5400
0.0265	16.6667	1800	1.5395	56.1886	25.2665
0.0177	17.5926	1900	1.5489	54.9719	24.4625
0.0142	18.5185	2000	1.5950	55.5218	24.9469
0.0073	19.4444	2100	1.6268	55.7674	25.1298
0.0065	20.3704	2200	1.6454	56.0014	24.8892
0.0052	21.2963	2300	1.6621	55.0304	24.8035
0.0031	22.2222	2400	1.6772	56.0482	25.0424
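
To compare checkpoints, the rows above can be scanned for the best value of each metric. The sketch below copies the epoch, step, validation loss, WER, and CER columns from the table. The helper names are illustrative, and the steps-per-epoch figure is inferred from the Epoch and Step columns rather than stated on the card.

```python
# Rows copied from the table above: (epoch, step, validation loss, WER, CER).
ROWS = [
    (0.9259, 100, 1.4766, 83.4464, 37.9517),
    (1.8519, 200, 1.1411, 75.8189, 36.3040),
    (2.7778, 300, 1.0356, 62.7866, 29.1714),
    (3.7037, 400, 1.0172, 61.5115, 28.0164),
    (4.6296, 500, 1.0397, 59.7801, 27.8698),
    (5.5556, 600, 1.0958, 60.7745, 28.9654),
    (6.4815, 700, 1.1447, 58.3645, 26.6423),
    (7.4074, 800, 1.1959, 57.0192, 26.0145),
    (8.3333, 900, 1.2624, 58.1072, 26.7461),
    (9.2593, 1000, 1.3064, 57.1479, 25.9058),
    (10.1852, 1100, 1.3492, 57.2649, 25.7987),
    (11.1111, 1200, 1.4039, 57.3701, 25.8580),
    (12.0370, 1300, 1.4000, 56.7735, 25.5235),
    (12.9630, 1400, 1.4034, 56.6097, 25.8201),
    (13.8889, 1500, 1.4139, 56.4109, 25.3736),
    (14.8148, 1600, 1.4882, 56.4810, 25.7773),
    (15.7407, 1700, 1.5243, 56.2588, 25.5400),
    (16.6667, 1800, 1.5395, 56.1886, 25.2665),
    (17.5926, 1900, 1.5489, 54.9719, 24.4625),
    (18.5185, 2000, 1.5950, 55.5218, 24.9469),
    (19.4444, 2100, 1.6268, 55.7674, 25.1298),
    (20.3704, 2200, 1.6454, 56.0014, 24.8892),
    (21.2963, 2300, 1.6621, 55.0304, 24.8035),
    (22.2222, 2400, 1.6772, 56.0482, 25.0424),
]

# Column positions inside each row tuple.
EPOCH, STEP, VAL_LOSS, WER, CER = range(5)


def best_step(rows, column):
    """Return the step whose value in `column` is lowest (lower is better)."""
    return min(rows, key=lambda row: row[column])[STEP]


def steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    last = rows[-1]
    return round(last[STEP] / last[EPOCH])


if __name__ == "__main__":
    print("lowest validation loss at step", best_step(ROWS, VAL_LOSS))
    print("lowest WER at step", best_step(ROWS, WER))
    print("lowest CER at step", best_step(ROWS, CER))
    print("approx. optimizer steps per epoch:", steps_per_epoch(ROWS))
```

On these rows the validation loss is lowest at step 400, while WER and CER are both lowest at step 1900. The loss keeps rising after step 400 even as the error rates improve, so the choice of checkpoint depends on which metric is used for selection.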

Safetensors
Model size: 2B params
Tensor type: F32
