---
language:
  - bn
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_16_1
metrics:
  - wer
model-index:
  - name: Whisper Base Bn - Raiyan Ahmed
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 16.1
          type: mozilla-foundation/common_voice_16_1
          config: bn
          split: None
          args: 'config: bn, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 33.449797070760546
---

Whisper Base Bn - Raiyan Ahmed

This model is a fine-tuned version of openai/whisper-base for Bengali automatic speech recognition, trained on the Common Voice 16.1 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1074
  • Wer: 33.4498
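
As a quick illustration of how a fine-tuned Whisper checkpoint like this one can be used for inference, here is a minimal sketch with the transformers ASR pipeline. The model id and audio file name are placeholders, not values taken from this card.

```python
# Minimal inference sketch using the transformers ASR pipeline.
# MODEL_ID is a placeholder for this repository's Hub id.
from transformers import pipeline

MODEL_ID = "<this-repo-id>"  # placeholder; replace with the actual model id

asr = pipeline(
    "automatic-speech-recognition",
    model=MODEL_ID,
    generate_kwargs={"language": "bengali", "task": "transcribe"},
)

# "sample.wav" is an example path to a Bengali speech recording.
print(asr("sample.wav")["text"])
```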

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 26
  • eval_batch_size: 46
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 10000
  • mixed_precision_training: Native AMP
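
For reference, the list above corresponds roughly to the following Seq2SeqTrainingArguments configuration. This is a sketch only: output_dir is a placeholder, and the Adam betas/epsilon shown in the list are the Transformers defaults.

```python
# Illustrative sketch of the training configuration implied by the list above.
# output_dir is a placeholder; Adam betas/epsilon are the library defaults.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-base-bn",      # placeholder output directory
    learning_rate=3e-5,
    per_device_train_batch_size=26,
    per_device_eval_batch_size=46,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=10000,
    fp16=True,                           # native AMP mixed precision
)
```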

Training results

| Training Loss | Epoch  | Step  | Validation Loss | Wer     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.2369        | 0.6365 | 1000  | 0.2433          | 62.1881 |
| 0.1242        | 1.2731 | 2000  | 0.1734          | 49.4369 |
| 0.1022        | 1.9096 | 3000  | 0.1197          | 39.0531 |
| 0.046         | 2.5461 | 4000  | 0.1067          | 34.5497 |
| 0.0702        | 2.6247 | 5000  | 0.1210          | 38.4777 |
| 0.1028        | 1.5748 | 6000  | 0.1484          | 44.2750 |
| 0.0772        | 1.8373 | 7000  | 0.1323          | 40.2388 |
| 0.0648        | 2.0997 | 8000  | 0.1205          | 39.1165 |
| 0.0367        | 2.3622 | 9000  | 0.1154          | 35.6332 |
| 0.0249        | 2.6247 | 10000 | 0.1074          | 33.4498 |
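
The Wer column reports word error rate as a percentage. A hedged sketch of how such a figure can be computed with the evaluate library follows; the prediction and reference strings are dummy examples.

```python
# Sketch: computing WER (as a percentage) with the evaluate library.
# The prediction/reference strings below are dummy examples.
import evaluate

wer_metric = evaluate.load("wer")
wer = 100 * wer_metric.compute(
    predictions=["model transcript here"],
    references=["reference transcript here"],
)
print(f"WER: {wer:.4f}")
```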

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1