mms-1b-bemgen-combined-model

This model is a fine-tuned version of facebook/mms-1b-all on the BEMGEN (BEM) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2477
  • WER: 0.3897

Model description

More information needed

Intended uses & limitations

More information needed
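
Pending fuller documentation, the sketch below shows one way to transcribe audio with this checkpoint (a minimal example, assuming a 16 kHz mono recording; the audio path is a placeholder):

```python
import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-bemgen-combined-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load an audio file and resample to the 16 kHz rate MMS expects.
# "audio.wav" is a placeholder path.
waveform, sample_rate = torchaudio.load("audio.wav")
if sample_rate != 16000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

# Run CTC inference and greedy-decode the most likely token ids.
inputs = processor(waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```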

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to TrainingArguments follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
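
As a reconstruction for illustration, the hyperparameters above map to `transformers.TrainingArguments` roughly as follows (`output_dir` is a placeholder, and any setting not listed above is left at its default):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-bemgen-combined-model",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",          # AdamW as implemented in PyTorch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                    # Native AMP mixed precision
)
```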

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|
| 6.8762        | 0.0516 | 100  | 0.9801          | 0.9386 |
| 0.5788        | 0.1031 | 200  | 0.3466          | 0.5014 |
| 0.4891        | 0.1547 | 300  | 0.3220          | 0.4820 |
| 0.4386        | 0.2063 | 400  | 0.3071          | 0.4802 |
| 0.4272        | 0.2579 | 500  | 0.3056          | 0.4988 |
| 0.3982        | 0.3094 | 600  | 0.2981          | 0.4626 |
| 0.425         | 0.3610 | 700  | 0.2977          | 0.4631 |
| 0.4036        | 0.4126 | 800  | 0.2897          | 0.4438 |
| 0.3903        | 0.4642 | 900  | 0.2878          | 0.4627 |
| 0.3758        | 0.5157 | 1000 | 0.2926          | 0.4523 |
| 0.3861        | 0.5673 | 1100 | 0.2807          | 0.4410 |
| 0.3763        | 0.6189 | 1200 | 0.2790          | 0.4331 |
| 0.3984        | 0.6704 | 1300 | 0.2803          | 0.4312 |
| 0.373         | 0.7220 | 1400 | 0.2802          | 0.4246 |
| 0.3848        | 0.7736 | 1500 | 0.2759          | 0.4752 |
| 0.4235        | 0.8252 | 1600 | 0.2738          | 0.4268 |
| 0.3704        | 0.8767 | 1700 | 0.2688          | 0.4219 |
| 0.3911        | 0.9283 | 1800 | 0.2653          | 0.4201 |
| 0.3954        | 0.9799 | 1900 | 0.2697          | 0.4482 |
| 0.352         | 1.0315 | 2000 | 0.2654          | 0.4154 |
| 0.3808        | 1.0830 | 2100 | 0.2631          | 0.4051 |
| 0.3681        | 1.1346 | 2200 | 0.2610          | 0.4219 |
| 0.3355        | 1.1862 | 2300 | 0.2608          | 0.4098 |
| 0.342         | 1.2378 | 2400 | 0.2602          | 0.4082 |
| 0.347         | 1.2893 | 2500 | 0.2628          | 0.4055 |
| 0.3409        | 1.3409 | 2600 | 0.2588          | 0.4129 |
| 0.3423        | 1.3925 | 2700 | 0.2617          | 0.4192 |
| 0.3341        | 1.4440 | 2800 | 0.2578          | 0.4055 |
| 0.3425        | 1.4956 | 2900 | 0.2580          | 0.3988 |
| 0.337         | 1.5472 | 3000 | 0.2568          | 0.4071 |
| 0.3412        | 1.5988 | 3100 | 0.2552          | 0.3993 |
| 0.3837        | 1.6503 | 3200 | 0.2622          | 0.4084 |
| 0.3372        | 1.7019 | 3300 | 0.2548          | 0.3991 |
| 0.3394        | 1.7535 | 3400 | 0.2535          | 0.4061 |
| 0.3542        | 1.8051 | 3500 | 0.2512          | 0.3927 |
| 0.3368        | 1.8566 | 3600 | 0.2580          | 0.4004 |
| 0.3807        | 1.9082 | 3700 | 0.2490          | 0.3975 |
| 0.3454        | 1.9598 | 3800 | 0.2514          | 0.4002 |
| 0.3456        | 2.0113 | 3900 | 0.2457          | 0.3931 |
| 0.3202        | 2.0629 | 4000 | 0.2466          | 0.3916 |
| 0.3233        | 2.1145 | 4100 | 0.2495          | 0.3975 |
| 0.3052        | 2.1661 | 4200 | 0.2478          | 0.3899 |
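
WER here is word error rate (lower is better; the final 0.3899 means roughly 39% of reference words are substituted, deleted, or inserted). The card does not state how the metric was computed; one common choice, sketched below with hypothetical transcripts, is the `evaluate` library's `wer` metric:

```python
import evaluate

# Hypothetical example transcripts, for illustration only.
predictions = ["this is a test"]
references = ["this is the test"]

wer_metric = evaluate.load("wer")
# 1 substitution over 4 reference words -> 0.25
print(wer_metric.compute(predictions=predictions, references=references))
```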

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0