mms-1b-bemgen-combined-model

This model is a fine-tuned version of facebook/mms-1b-all on the BEMGEN (BEM) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2477
  • WER: 0.3897

Model description

More information needed

Intended uses & limitations

More information needed
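
Pending fuller documentation, the sketch below shows one way to transcribe audio with this checkpoint (a minimal example, assuming a 16 kHz mono recording; the audio path is a placeholder):

```python
import torch
import torchaudio
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-bemgen-combined-model"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load an audio file and resample to the 16 kHz rate MMS expects.
# "audio.wav" is a placeholder path.
waveform, sample_rate = torchaudio.load("audio.wav")
if sample_rate != 16000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

# Run CTC inference and greedy-decode the most likely token ids.
inputs = processor(waveform.squeeze().numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```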

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them to TrainingArguments follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
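
As a reconstruction for illustration, the hyperparameters above map to `transformers.TrainingArguments` roughly as follows (`output_dir` is a placeholder, and any setting not listed above is left at its default):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mms-1b-bemgen-combined-model",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",          # AdamW as implemented in PyTorch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                    # Native AMP mixed precision
)
```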

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER    |
|:-------------:|:------:|:----:|:---------------:|:------:|
| 6.8762        | 0.0516 | 100  | 0.9801          | 0.9386 |
| 0.5788        | 0.1031 | 200  | 0.3466          | 0.5014 |
| 0.4891        | 0.1547 | 300  | 0.3220          | 0.4820 |
| 0.4386        | 0.2063 | 400  | 0.3071          | 0.4802 |
| 0.4272        | 0.2579 | 500  | 0.3056          | 0.4988 |
| 0.3982        | 0.3094 | 600  | 0.2981          | 0.4626 |
| 0.425         | 0.3610 | 700  | 0.2977          | 0.4631 |
| 0.4036        | 0.4126 | 800  | 0.2897          | 0.4438 |
| 0.3903        | 0.4642 | 900  | 0.2878          | 0.4627 |
| 0.3758        | 0.5157 | 1000 | 0.2926          | 0.4523 |
| 0.3861        | 0.5673 | 1100 | 0.2807          | 0.4410 |
| 0.3763        | 0.6189 | 1200 | 0.2790          | 0.4331 |
| 0.3984        | 0.6704 | 1300 | 0.2803          | 0.4312 |
| 0.373         | 0.7220 | 1400 | 0.2802          | 0.4246 |
| 0.3848        | 0.7736 | 1500 | 0.2759          | 0.4752 |
| 0.4235        | 0.8252 | 1600 | 0.2738          | 0.4268 |
| 0.3704        | 0.8767 | 1700 | 0.2688          | 0.4219 |
| 0.3911        | 0.9283 | 1800 | 0.2653          | 0.4201 |
| 0.3954        | 0.9799 | 1900 | 0.2697          | 0.4482 |
| 0.352         | 1.0315 | 2000 | 0.2654          | 0.4154 |
| 0.3808        | 1.0830 | 2100 | 0.2631          | 0.4051 |
| 0.3681        | 1.1346 | 2200 | 0.2610          | 0.4219 |
| 0.3355        | 1.1862 | 2300 | 0.2608          | 0.4098 |
| 0.342         | 1.2378 | 2400 | 0.2602          | 0.4082 |
| 0.347         | 1.2893 | 2500 | 0.2628          | 0.4055 |
| 0.3409        | 1.3409 | 2600 | 0.2588          | 0.4129 |
| 0.3423        | 1.3925 | 2700 | 0.2617          | 0.4192 |
| 0.3341        | 1.4440 | 2800 | 0.2578          | 0.4055 |
| 0.3425        | 1.4956 | 2900 | 0.2580          | 0.3988 |
| 0.337         | 1.5472 | 3000 | 0.2568          | 0.4071 |
| 0.3412        | 1.5988 | 3100 | 0.2552          | 0.3993 |
| 0.3837        | 1.6503 | 3200 | 0.2622          | 0.4084 |
| 0.3372        | 1.7019 | 3300 | 0.2548          | 0.3991 |
| 0.3394        | 1.7535 | 3400 | 0.2535          | 0.4061 |
| 0.3542        | 1.8051 | 3500 | 0.2512          | 0.3927 |
| 0.3368        | 1.8566 | 3600 | 0.2580          | 0.4004 |
| 0.3807        | 1.9082 | 3700 | 0.2490          | 0.3975 |
| 0.3454        | 1.9598 | 3800 | 0.2514          | 0.4002 |
| 0.3456        | 2.0113 | 3900 | 0.2457          | 0.3931 |
| 0.3202        | 2.0629 | 4000 | 0.2466          | 0.3916 |
| 0.3233        | 2.1145 | 4100 | 0.2495          | 0.3975 |
| 0.3052        | 2.1661 | 4200 | 0.2478          | 0.3899 |
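
WER here is word error rate (lower is better; the final 0.3899 means roughly 39% of reference words are substituted, deleted, or inserted). The card does not state how the metric was computed; one common choice, sketched below with hypothetical transcripts, is the `evaluate` library's `wer` metric:

```python
import evaluate

# Hypothetical example transcripts, for illustration only.
predictions = ["this is a test"]
references = ["this is the test"]

wer_metric = evaluate.load("wer")
# 1 substitution over 4 reference words -> 0.25
print(wer_metric.compute(predictions=predictions, references=references))
```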

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0