Whisper Large V3 GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-large-v3 on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set:

Loss: 1.0552
Bleu: 11.86
Chrf: 28.37
Wer: 127.1049
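
A WER above 100 is possible because inserted words count as errors, so a hypothesis much longer than its reference can accumulate more edits than the reference has words. The sketch below illustrates this with a plain word-level edit distance. It is an illustration only: the function name `word_error_rate` is hypothetical, and the card does not state which library or text normalization produced the reported score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Two reference words, three extra words inserted by the hypothesis.
    print(word_error_rate("hello world", "hello hello hello big world"))
```

Repeated or hallucinated output, a common failure mode of sequence-to-sequence speech models, inflates WER in exactly this way while BLEU and chrF stay bounded.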

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.03
training_steps: 8000
mixed_precision_training: Native AMP
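
To make the schedule concrete, the following sketch computes the learning rate at a given step from the values listed above (`learning_rate`, `lr_scheduler_warmup_ratio`, `training_steps`). It assumes the usual behavior of a linear schedule with warmup: a linear ramp from 0 to the peak rate, then a linear decay to 0. The helper names `warmup_steps` and `linear_lr` are hypothetical and are not taken from the training script.

```python
import math

BASE_LR = 1e-4        # learning_rate
TOTAL_STEPS = 8000    # training_steps
WARMUP_RATIO = 0.03   # lr_scheduler_warmup_ratio


def warmup_steps(total_steps: int = TOTAL_STEPS, ratio: float = WARMUP_RATIO) -> int:
    """Number of warmup steps implied by the warmup ratio (rounded up)."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    warmup = warmup_steps(total_steps)
    if step < warmup:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup))


if __name__ == "__main__":
    for step in (0, 120, 240, 4000, 8000):
        print(step, linear_lr(step))
```

Under these assumptions the peak rate of 0.0001 is reached after 240 steps, between the second and third evaluation points in the table below.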

Training results

Training Loss	Epoch	Step	Bleu	Chrf	Validation Loss	Wer
2.5918	0.0138	100	0.61	8.48	2.1791	238.2260
2.476	0.0276	200	0.63	10.43	2.1702	275.7317
2.2358	0.0414	300	4.76	19.98	2.0420	120.0810
2.1778	0.0552	400	2.78	12.85	1.9506	86.8528
1.9779	0.0690	500	4.53	18.47	1.8609	137.1905
1.9435	0.0828	600	6.67	22.37	1.7726	82.4403
1.7928	0.0966	700	4.54	17.32	1.7445	133.8586
1.9004	0.1103	800	1.58	12.65	1.7290	195.2724
1.7856	0.1241	900	4.84	17.5	1.6990	83.9262
1.6783	0.1379	1000	8.46	24.24	1.6329	113.5074
1.6095	0.1517	1100	7.35	20.22	1.6083	102.5214
1.6328	0.1655	1200	11.46	25.29	1.5267	76.5871
1.6093	0.1793	1300	6.51	17.77	1.4947	112.4719
1.5776	0.1931	1400	6.21	19.86	1.4952	90.6348
1.4767	0.2069	1500	4.86	19.57	1.4515	145.1148
1.3447	0.2207	1600	6.77	19.96	1.3974	90.5448
1.3273	0.2345	1700	4.77	16.31	1.4323	152.1837
1.4253	0.2483	1800	3.95	15.66	1.3598	173.2553
1.3505	0.2621	1900	11.25	23.4	1.3517	80.3692
1.2593	0.2759	2000	12.71	26.55	1.3236	77.5777
1.2483	0.2897	2100	17.88	32.0	1.2825	73.3003
1.161	0.3034	2200	10.08	20.69	1.2567	115.8937
1.1597	0.3172	2300	8.61	19.54	1.2581	93.8766
1.0937	0.3310	2400	12.37	25.67	1.2577	99.0095
1.0606	0.3448	2500	6.46	23.47	1.2228	172.9401
1.039	0.3586	2600	9.55	21.56	1.2186	89.7794
1.0193	0.3724	2700	3.08	17.58	1.1844	281.8100
1.1153	0.3862	2800	2.69	18.38	1.1693	350.2927
1.012	0.4	2900	3.56	14.74	1.1233	194.9122
0.8936	0.4138	3000	5.21	17.38	1.1161	158.3521
0.8893	0.4276	3100	11.52	25.02	1.1119	80.9095
0.9491	0.4414	3200	5.93	20.91	1.1213	174.0207
0.9233	0.4552	3300	5.54	20.95	1.0656	186.2224
0.8915	0.4690	3400	7.26	23.99	1.0736	155.6506
0.8296	0.4828	3500	6.74	21.46	1.0461	146.1054
0.8163	0.4966	3600	11.35	24.11	1.0706	101.8010
0.8115	0.5103	3700	12.84	26.92	1.0199	115.8487
0.8245	0.5241	3800	12.47	24.29	1.0163	101.9361
0.7988	0.5379	3900	15.29	28.54	0.9891	92.7960
0.769	0.5517	4000	15.23	28.15	0.9885	92.7060
0.9048	0.5655	4100	11.58	25.38	1.1588	84.6466
1.015	0.5793	4200	8.93	18.79	1.1907	86.6276
0.9254	0.5931	4300	7.96	20.76	1.1832	80.2792
0.9458	0.6069	4400	12.03	25.59	1.1789	82.6204
0.9783	0.6207	4500	7.62	20.23	1.1607	100.8555
0.9935	0.6345	4600	8.89	21.49	1.2477	81.7650
0.9747	0.6483	4700	14.51	28.26	1.1994	76.5421
0.9794	0.6621	4800	16.11	27.49	1.1219	81.1796
0.8919	0.6759	4900	5.19	19.48	1.1540	139.9820
0.8333	0.6897	5000	9.38	20.8	1.1388	84.6015
0.9083	0.7034	5100	6.71	22.08	1.1244	176.0018
0.8039	0.7172	5200	11.42	21.77	1.1072	107.2040
0.8064	0.7310	5300	8.89	17.34	1.0705	122.8276
0.8319	0.7448	5400	7.64	24.95	1.0968	170.0585
0.7984	0.7586	5500	10.44	24.66	1.1110	79.2886
0.7288	0.7724	5600	10.4	23.09	1.0820	82.5754
0.8128	0.7862	5700	12.13	25.86	1.1287	96.9833
0.7016	0.8	5800	4.84	21.49	1.0698	207.7893
0.7456	0.8138	5900	5.53	22.33	1.0809	204.9077
0.7575	0.8276	6000	6.24	27.03	1.0611	196.4430
0.6076	0.8414	6100	7.93	22.14	1.0868	134.7591
0.6913	0.8552	6200	8.25	19.46	1.0786	84.1963
0.6251	0.8690	6300	8.69	21.0	1.0372	83.4309
0.6357	0.8828	6400	13.83	25.16	1.0408	83.2508
0.666	0.8966	6500	9.45	21.12	1.0528	101.8910
0.6397	0.9103	6600	8.21	20.5	1.0394	118.1450
0.6475	0.9241	6700	4.72	20.26	1.0438	191.9856
0.642	0.9379	6800	4.84	21.12	1.0421	200.1801
0.6867	0.9517	6900	5.44	21.48	1.0231	214.3629
0.5254	0.9655	7000	9.96	24.2	1.0436	131.6074
0.599	0.9793	7100	16.23	30.07	1.0231	86.4475
0.6589	0.9931	7200	12.51	26.46	1.0365	107.5191
0.3222	1.0069	7300	9.22	24.16	1.0790	131.4723
0.3309	1.0207	7400	7.17	25.54	1.1012	166.2314
0.3402	1.0345	7500	14.56	28.4	1.0839	98.1990
0.3004	1.0483	7600	15.49	29.84	1.0615	104.0522
0.2561	1.0621	7700	11.72	28.66	1.0724	125.1688
0.3021	1.0759	7800	10.85	28.55	1.0592	130.3917
0.2932	1.0897	7900	11.62	28.17	1.0554	123.8631
0.2619	1.1034	8000	11.86	28.37	1.0552	127.1049

Framework versions

Transformers 4.41.2
PyTorch 2.1.2+git70dfd51
Datasets 2.20.0
Tokenizers 0.19.1
