metadata
language:
  - multilingual
license: apache-2.0
base_model: serge-wilson/whisper-small-wolof
tags:
  - generated_from_trainer
datasets:
  - audiofolder
metrics:
  - wer
model-index:
  - name: Whisper Wolof Lengo AI V5
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: audiofolder
          type: audiofolder
          config: default
          split: None
          args: default
        metrics:
          - name: Wer
            type: wer
            value: 36.047170881052274

Whisper Wolof Lengo AI V5

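This model is a fine-tuned version of serge-wilson/whisper-small-wolof. The card lists the training data as audiofolder, which is the name of the generic Datasets loader for local audio files, so the underlying corpus is not identified here. At the final training step (1990), the model achieves the following results on the evaluation set:

  • Loss: 0.3569
  • WER: 36.0472
  • CER: 22.5967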
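
The word error rate (WER) and character error rate (CER) reported above are edit-distance metrics: the number of substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The following is a minimal pure-Python sketch of that idea, not the exact metric code used for this card. The function names and the two sample sentences are illustrative assumptions, and any text normalization applied during the real evaluation is not known.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units."""
    split = (lambda s: s.split()) if unit == "word" else list
    edits = sum(edit_distance(split(r), split(h))
                for r, h in zip(references, hypotheses))
    total = sum(len(split(r)) for r in references)
    return 100.0 * edits / total


# Illustrative strings only; they are not taken from the evaluation set.
refs = ["nanga def", "mangi fi rekk"]
hyps = ["nanga def", "mangi fi"]
print(error_rate(refs, hyps, unit="word"))  # 20.0 (1 missing word out of 5)
print(error_rate(refs, hyps, unit="char"))  # 5 missing characters out of 22
```

Both rates can exceed 100 when the hypothesis contains many insertions, because the denominator counts reference units only.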

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-05
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • training_steps: 1990
  • mixed_precision_training: Native AMP
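
To make the schedule concrete, the sketch below computes the learning rate implied by the values listed above: a linear warmup over 50 steps to the peak of 0.0005, then a linear decay to zero at step 1990. It follows the usual linear-with-warmup formula and is an illustration rather than the training script; the function name `linear_warmup_lr` is hypothetical.

```python
def linear_warmup_lr(step, peak_lr=5e-4, warmup_steps=50, total_steps=1990):
    """Learning rate at a given optimizer step for a linear schedule with warmup."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return peak_lr * max(0.0, remaining)


for step in (0, 25, 50, 1020, 1990):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the rate peaks at step 50 and has fallen to half the peak by step 1020, the midpoint of the decay phase.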

Training results

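| Training Loss | Epoch | Step | Validation Loss | WER     | CER     |
|---------------|-------|------|-----------------|---------|---------|
| 1.2672        | 1.0   | 208  | 1.2009          | 85.3838 | 63.2065 |
| 0.875         | 2.0   | 416  | 0.8801          | 95.6117 | 69.2841 |
| 0.5964        | 3.0   | 624  | 0.6979          | 88.4681 | 63.1476 |
| 0.3953        | 4.0   | 832  | 0.6112          | 69.2255 | 57.6000 |
| 0.2465        | 5.0   | 1040 | 0.5015          | 55.4825 | 44.1314 |
| 0.161         | 6.0   | 1248 | 0.4401          | 53.7476 | 36.3715 |
| 0.0903        | 7.0   | 1456 | 0.4081          | 47.1822 | 31.0320 |
| 0.0553        | 8.0   | 1664 | 0.3751          | 44.7783 | 29.2044 |
| 0.024         | 9.0   | 1872 | 0.3604          | 38.7686 | 25.2606 |
| 0.011         | 9.57  | 1990 | 0.3569          | 36.0472 | 22.5967 |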
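
The step and epoch columns above can be cross-checked with a little arithmetic. The sketch below reproduces the final fractional epoch and estimates the training set size. The size estimate is an assumption: it holds only if training ran on a single device, without gradient accumulation, and kept the last partial batch, none of which the card states.

```python
steps_per_epoch = 208    # step count at epoch 1.0 in the results table
train_batch_size = 8     # from the hyperparameter list
training_steps = 1990    # from the hyperparameter list

# Fraction of epochs completed when training stopped.
epochs_completed = training_steps / steps_per_epoch

# Range of training set sizes that would give exactly 208 batches of 8
# (assumes one device, no gradient accumulation, last partial batch kept).
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

print(round(epochs_completed, 2))  # 9.57, matching the last table row
print(min_examples, max_examples)  # 1657 1664
```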

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.2+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2