metadata

license: apache-2.0
tags:
  - whisper-event
  - generated_from_trainer
datasets:
  - google/fleurs
metrics:
  - wer
model-index:
  - name: Whisper small Luxembourgish
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: google/fleurs lb_lu
          type: google/fleurs
          config: lb_lu
          split: test
        metrics:
          - name: Wer
            type: wer
            value: 39.49904580152672

Whisper small Luxembourgish

This model is a fine-tuned version of bofenghuang/whisper-small-cv11-german-punct on the lb_lu (Luxembourgish) configuration of the google/fleurs dataset. At the final checkpoint (step 5000) it achieves the following results on the evaluation set:

Loss: 1.1857
WER: 39.4990
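
For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model. The function name `word_error_rate` and the sample sentences are hypothetical, and the result is scaled to a percentage on the assumption that the figure reported above is one. The card does not say which implementation or text normalization produced the reported score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate in percent for one reference/hypothesis pair.

    Illustrative only: whitespace tokenization, no text normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current

    return 100.0 * previous[-1] / len(ref_words)


# One substitution ("b" -> "x") and one deletion ("d") over four reference words.
print(word_error_rate("a b c d", "a x c"))
```

A corpus-level score is usually obtained by summing the edit counts and reference lengths over all utterances before dividing, rather than averaging per-utterance rates.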

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

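The model was fine-tuned on the lb_lu configuration of google/fleurs and evaluated on its test split. More information needed.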

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 64
total_eval_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 5000
mixed_precision_training: Native AMP
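
The totals in this list follow from the per-device values, and the learning rate at any step follows from the linear schedule with warmup. The sketch below reproduces that arithmetic in plain Python under the usual definition of a linear warmup followed by a linear decay to zero. The helper name `linear_schedule_lr` is hypothetical, and this is an illustration of the settings, not the training script itself.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1e-05
train_batch_size = 8          # per device
eval_batch_size = 16          # per device
num_devices = 4
gradient_accumulation_steps = 2
warmup_steps = 500
training_steps = 5000

# Examples consumed per optimizer step across all devices.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices


def linear_schedule_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return learning_rate * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, training_steps - step)
    return learning_rate * remaining / (training_steps - warmup_steps)


print(total_train_batch_size, total_eval_batch_size)
for step in (0, 250, 500, 2750, 5000):
    print(step, linear_schedule_lr(step))
```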

Training results

Training Loss	Epoch	Step	Validation Loss	WER
0.0618	38.46	500	1.0104	43.2968
0.0055	76.92	1000	1.0684	40.1288
0.0024	115.38	1500	1.1056	40.9447
0.0014	153.85	2000	1.1280	39.7615
0.0013	192.31	2500	1.1415	39.9857
0.0008	230.77	3000	1.1573	39.7996
0.0006	269.23	3500	1.1682	40.0095
0.0006	307.69	4000	1.1769	39.7233
0.0005	346.15	4500	1.1826	39.5134
0.0004	384.62	5000	1.1857	39.4990
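
A few things can be read directly off this table. The sketch below loads the rows into Python and derives them; the variable names are hypothetical and the numbers are copied from the table. The step-to-epoch ratio works out to about 13 optimizer steps per epoch, so the 5000 steps correspond to roughly 385 passes over the training data. Training loss falls to 0.0004 while validation loss rises from 1.0104 to 1.1857, which may indicate overfitting, even though WER is lowest at the final checkpoint.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the table above.
rows = [
    (0.0618, 38.46, 500, 1.0104, 43.2968),
    (0.0055, 76.92, 1000, 1.0684, 40.1288),
    (0.0024, 115.38, 1500, 1.1056, 40.9447),
    (0.0014, 153.85, 2000, 1.1280, 39.7615),
    (0.0013, 192.31, 2500, 1.1415, 39.9857),
    (0.0008, 230.77, 3000, 1.1573, 39.7996),
    (0.0006, 269.23, 3500, 1.1682, 40.0095),
    (0.0006, 307.69, 4000, 1.1769, 39.7233),
    (0.0005, 346.15, 4500, 1.1826, 39.5134),
    (0.0004, 384.62, 5000, 1.1857, 39.4990),
]

# Checkpoint with the lowest word error rate.
best_wer_row = min(rows, key=lambda row: row[4])
# Checkpoint with the lowest validation loss.
best_loss_row = min(rows, key=lambda row: row[3])
# Optimizer steps per epoch, estimated from the last logged row.
final_epoch, final_step = rows[-1][1], rows[-1][2]
steps_per_epoch = round(final_step / final_epoch)

print("lowest WER at step", best_wer_row[2], "with WER", best_wer_row[4])
print("lowest validation loss at step", best_loss_row[2])
print("approximate optimizer steps per epoch:", steps_per_epoch)
```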

Framework versions

Transformers 4.25.1
Pytorch 1.13.0+cu117
Datasets 2.7.1
Tokenizers 0.13.2