metadata

language:
  - ar
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - Arbi-Houssem/Tunisian_dataset_STT-TTS
metrics:
  - wer
model-index:
  - name: Whisper Tunisien
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Tunisian_dataset_STT-TTS
          type: Arbi-Houssem/Tunisian_dataset_STT-TTS
          args: 'config: ar, split: test'
        metrics:
          - name: Wer
            type: wer
            value: 111.1901681759379

Whisper Tunisien

This model is a fine-tuned version of openai/whisper-small on the Arbi-Houssem/Tunisian_dataset_STT-TTS dataset. On the evaluation set, the final checkpoint (step 4000) achieves the following results:

Loss: 3.9898
WER: 111.1902 (word error rate, in percent)
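
A word error rate above 100 can look like a mistake, so here is a minimal sketch of how the metric is usually defined: the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. Because insertions count as errors, a hypothesis that is longer than its reference can score above 100. The function name `word_error_rate` and the English toy strings are illustrative assumptions; they are not taken from the evaluation script or the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (S + D + I) / N * 100, with N = reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Toy example (hypothetical strings): 3 inserted words against a 2-word reference.
print(word_error_rate("hello world", "hello world"))                 # 0.0
print(word_error_rate("hello world", "well hello there big world"))  # 150.0
```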

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 12
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 4000
mixed_precision_training: Native AMP
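
To make the schedule concrete, the sketch below computes the learning rate at a given step from the values listed above (peak 1e-05, 500 warmup steps, 4000 training steps). It assumes the linear scheduler ramps up linearly from zero during warmup and then decays linearly to zero at the last step, which is the usual behaviour of a linear schedule with warmup; the function name `linear_schedule_lr` is hypothetical and the code is not extracted from the training script.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-5,     # learning_rate from the list above
    warmup_steps: int = 500,   # lr_scheduler_warmup_steps
    total_steps: int = 4000,   # training_steps
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay (assumed shape)."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few steps, including the evaluation points near the ends.
for s in (0, 250, 500, 2250, 4000):
    print(s, linear_schedule_lr(s))
```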

Training results

Training Loss	Epoch	Step	Validation Loss	WER (%)
0.2639	5.7471	500	2.8790	103.3635
0.0292	11.4943	1000	3.4175	131.9534
0.0045	17.2414	1500	3.7135	106.2743
0.0021	22.9885	2000	3.7732	119.7930
0.0012	28.7356	2500	3.8911	124.9677
0.0004	34.4828	3000	3.9580	130.2717
0.0003	40.2299	3500	3.9781	108.7969
0.0003	45.9770	4000	3.9898	111.1902
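
The table can also be read programmatically. The sketch below copies the rows above into a list, picks the checkpoint with the lowest WER, and estimates the number of optimizer steps per epoch from the Step and Epoch columns. The upper bound on training examples per epoch additionally assumes a single device and no gradient accumulation, neither of which this card states; `EvalRow` and the variable names are illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Rows copied verbatim from the training results table.
RESULTS = [
    EvalRow(0.2639, 5.7471, 500, 2.8790, 103.3635),
    EvalRow(0.0292, 11.4943, 1000, 3.4175, 131.9534),
    EvalRow(0.0045, 17.2414, 1500, 3.7135, 106.2743),
    EvalRow(0.0021, 22.9885, 2000, 3.7732, 119.7930),
    EvalRow(0.0012, 28.7356, 2500, 3.8911, 124.9677),
    EvalRow(0.0004, 34.4828, 3000, 3.9580, 130.2717),
    EvalRow(0.0003, 40.2299, 3500, 3.9781, 108.7969),
    EvalRow(0.0003, 45.9770, 4000, 3.9898, 111.1902),
]

# Checkpoint with the lowest evaluation WER.
best = min(RESULTS, key=lambda row: row.wer)
print(f"lowest WER: {best.wer} at step {best.step}")

# Optimizer steps per epoch, inferred from the Step / Epoch ratio of the last row.
steps_per_epoch = round(RESULTS[-1].step / RESULTS[-1].epoch)
print(f"steps per epoch: {steps_per_epoch}")

# Upper bound on training examples per epoch.
# Assumption: train_batch_size=12 on one device, no gradient accumulation.
max_examples_per_epoch = steps_per_epoch * 12
print(f"at most {max_examples_per_epoch} training examples per epoch")
```

Under these assumptions, the lowest WER and the lowest validation loss both occur at the first evaluation (step 500), while the training loss keeps falling afterwards.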

Framework versions

Transformers 4.42.0.dev0
PyTorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1