Automatic Speech Recognition
Transformers
TensorBoard
Safetensors
Irish
English
whisper
Generated from Trainer
Eval Results
Inference Endpoints

Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. It achieves the following results on the evaluation set:

  • Loss: 1.3818
  • Bleu: 33.79
  • Chrf: 51.67
  • Wer: 61.6839
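
For inference, the checkpoint can be loaded with the Transformers `pipeline` API. A minimal usage sketch, assuming the checkpoint id from this card; the audio path `sample_ga.wav` is a placeholder for a 16 kHz Irish-language recording:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint named on this card.
pipe = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-medium-ga2en-v5.3.1-10k-r",
)

# Whisper produces a translation rather than a transcript when
# task="translate"; the audio path below is a placeholder.
result = pipe("sample_ga.wav", generate_kwargs={"task": "translate"})
print(result["text"])  # English translation of the Irish speech
```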

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 10000
  • mixed_precision_training: Native AMP
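
As a rough sketch, these settings correspond to the following Transformers `Seq2SeqTrainingArguments`; `output_dir` is a placeholder, and the original training script may have set additional options not reported above:

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction: only values reported on this card are
# grounded; everything else is left at Trainer defaults (e.g. Adam
# betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults).
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-medium-ga2en",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=10000,
    fp16=True,  # "Native AMP" mixed-precision training
)
```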

Training results

Training Loss Epoch Step Bleu Chrf Validation Loss Wer
2.4382 0.0109 100 3.07 16.85 2.1114 171.0491
2.6151 0.0219 200 6.25 23.02 2.0207 126.9698
2.5699 0.0328 300 5.71 24.03 1.8660 155.5606
2.3084 0.0438 400 9.87 28.45 1.8084 129.0860
2.3327 0.0547 500 12.01 31.92 1.7823 102.7915
2.1495 0.0657 600 13.97 32.4 1.7238 98.6042
2.2164 0.0766 700 11.21 33.19 1.6538 146.0153
2.0071 0.0876 800 14.34 35.72 1.7038 96.9383
1.8334 0.0985 900 16.51 37.23 1.6329 96.8032
1.8359 0.1095 1000 17.87 35.94 1.6637 84.4665
1.7703 0.1204 1100 19.54 39.02 1.5626 79.7839
1.5805 0.1314 1200 20.19 40.4 1.5618 77.8028
1.4545 0.1423 1300 13.88 35.53 1.5599 112.5619
1.5177 0.1533 1400 18.79 40.11 1.4880 84.6916
1.6335 0.1642 1500 16.41 38.64 1.4996 96.9833
1.3809 0.1752 1600 18.3 40.17 1.4739 101.8910
1.2694 0.1861 1700 22.53 43.15 1.4498 76.9923
1.2321 0.1970 1800 19.92 42.59 1.4163 84.6015
1.1969 0.2080 1900 21.63 44.92 1.4137 85.3670
1.2023 0.2189 2000 20.42 41.57 1.3530 82.8906
1.1676 0.2299 2100 22.82 44.23 1.3723 78.1180
1.0332 0.2408 2200 26.73 44.75 1.3641 70.2386
0.8589 0.2518 2300 26.94 46.89 1.3344 72.7600
0.9829 0.2627 2400 28.15 47.21 1.3181 69.1130
0.8228 0.2737 2500 26.98 47.41 1.3049 74.0207
0.7667 0.2846 2600 30.0 49.42 1.2698 65.1058
0.8749 0.2956 2700 27.91 47.67 1.2878 66.9518
0.7504 0.3065 2800 32.03 50.35 1.2670 63.6650
0.7069 0.3175 2900 30.7 49.53 1.2771 64.4304
0.7199 0.3284 3000 30.21 48.93 1.2658 65.5561
0.6207 0.3394 3100 30.82 49.11 1.2687 66.0063
0.5995 0.3503 3200 31.99 50.94 1.2207 62.9446
0.6294 0.3612 3300 31.05 50.85 1.2422 64.7006
0.4612 0.3722 3400 33.1 51.82 1.2203 61.9090
0.5138 0.3831 3500 32.08 51.86 1.2007 63.0797
0.5059 0.3941 3600 31.8 51.19 1.2130 63.9352
0.417 0.4050 3700 32.45 51.41 1.1975 62.2692
0.2958 0.4160 3800 29.29 51.39 1.2046 62.7645
0.393 0.4269 3900 28.95 51.45 1.1968 63.1697
0.3858 0.4379 4000 29.54 51.58 1.1929 62.4043
0.5416 0.4488 4100 27.29 43.94 1.3522 67.9424
0.6644 0.4598 4200 23.16 44.45 1.4191 77.3976
0.5246 0.4707 4300 22.26 44.91 1.4221 77.2625
0.614 0.4817 4400 26.9 46.15 1.3956 70.4638
0.5973 0.4926 4500 25.55 45.51 1.4152 76.7222
0.544 0.5036 4600 23.54 47.87 1.4091 79.1085
0.5975 0.5145 4700 21.85 42.69 1.4644 78.5682
0.4675 0.5255 4800 22.93 43.69 1.4598 76.9023
0.7959 0.5364 4900 24.91 44.98 1.3884 74.5610
0.5936 0.5473 5000 26.91 44.88 1.4235 69.0680
0.4631 0.5583 5100 25.77 45.81 1.4002 74.0207
0.5188 0.5692 5200 28.37 45.48 1.4405 66.2765
0.4675 0.5802 5300 21.1 43.11 1.4045 92.1207
0.4214 0.5911 5400 25.62 44.82 1.4250 72.2197
0.4592 0.6021 5500 27.24 46.44 1.4107 70.0585
0.4809 0.6130 5600 27.93 47.42 1.3896 69.5182
0.4364 0.6240 5700 25.84 47.47 1.3808 77.6227
0.3333 0.6349 5800 26.46 47.08 1.4203 72.4899
0.3345 0.6459 5900 23.1 44.6 1.4763 81.2247
0.3368 0.6568 6000 24.55 45.76 1.4182 80.5493
0.3061 0.6678 6100 23.1 45.97 1.4218 81.3597
0.324 0.6787 6200 28.26 47.06 1.4453 67.5822
0.2667 0.6897 6300 27.87 46.14 1.4494 69.0230
0.2845 0.7006 6400 26.39 46.72 1.4448 71.4543
0.3125 0.7115 6500 27.81 46.45 1.4643 70.0135
0.264 0.7225 6600 26.27 47.75 1.4244 72.7600
0.2426 0.7334 6700 25.84 46.68 1.4081 76.4070
0.2174 0.7444 6800 30.67 47.92 1.4036 65.8262
0.2265 0.7553 6900 28.11 49.12 1.4174 71.2292
0.2016 0.7663 7000 30.43 49.47 1.4341 65.9163
0.1865 0.7772 7100 32.05 49.5 1.3690 63.1697
0.2148 0.7882 7200 32.29 49.91 1.3603 63.8901
0.2126 0.7991 7300 32.07 49.31 1.4046 63.6650
0.1594 0.8101 7400 29.94 47.48 1.4122 65.5110
0.1295 0.8210 7500 30.14 49.79 1.4243 65.7812
0.1378 0.8320 7600 31.23 49.42 1.4334 65.9613
0.1701 0.8429 7700 31.04 49.95 1.4149 65.6461
0.1102 0.8539 7800 31.37 50.2 1.4082 63.7100
0.1267 0.8648 7900 32.86 50.83 1.3642 60.8285
0.1384 0.8758 8000 33.47 49.61 1.3860 59.8829
0.1128 0.8867 8100 32.78 50.04 1.3840 61.8190
0.1197 0.8976 8200 33.69 50.94 1.3641 61.8190
0.1181 0.9086 8300 32.0 49.65 1.3913 63.5299
0.0866 0.9195 8400 30.39 48.48 1.4171 68.0324
0.0784 0.9305 8500 32.27 49.32 1.3850 63.3949
0.092 0.9414 8600 33.78 51.13 1.3880 61.2787
0.0685 0.9524 8700 34.33 51.23 1.3876 61.1887
0.0783 0.9633 8800 33.4 48.9 1.4010 62.5844
0.0735 0.9743 8900 33.72 49.01 1.4035 61.5038
0.0875 0.9852 9000 30.44 49.06 1.4064 67.5371
0.0822 0.9962 9100 34.64 51.51 1.3803 60.5133
0.041 1.0071 9200 34.66 52.06 1.3678 59.4327
0.0351 1.0181 9300 33.88 51.16 1.3739 61.3688
0.0368 1.0290 9400 35.2 51.73 1.3846 60.4232
0.035 1.0400 9500 34.23 51.32 1.3753 60.8735
0.0277 1.0509 9600 35.0 52.59 1.3788 60.0180
0.0247 1.0619 9700 34.69 51.7 1.3914 60.2882
0.0321 1.0728 9800 34.63 51.91 1.3804 60.6033
0.0286 1.0837 9900 33.92 51.64 1.3795 61.8640
0.0239 1.0947 10000 33.79 51.67 1.3818 61.6839
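
The Bleu, Chrf, and Wer columns report BLEU, chrF, and word error rate. Assuming the standard sacreBLEU, chrF, and WER implementations (an assumption, not stated on the card), they can be computed with the Hugging Face `evaluate` library; the hypothesis and reference strings below are placeholders:

```python
import evaluate

# Load the three metrics reported in the table above.
bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
wer = evaluate.load("wer")

predictions = ["a placeholder English hypothesis"]
references = [["a placeholder English reference"]]  # sacreBLEU/chrF take lists of references

print(bleu.compute(predictions=predictions, references=references)["score"])
print(chrf.compute(predictions=predictions, references=references)["score"])
# WER takes one plain reference string per prediction.
print(wer.compute(predictions=predictions, references=[r[0] for r in references]))
```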

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

  • 764M params
  • Tensor type: F32 (Safetensors)

Finetuned from

  • openai/whisper-medium

Datasets used to train ymoslem/whisper-medium-ga2en-v5.3.1-10k-r

  • IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
    self-reported
    33.790
  • Wer on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
    self-reported
    61.684