---
library_name: transformers
license: apache-2.0
base_model: openai/whisper-small
tags:
  - generated_from_trainer
datasets:
  - common_voice_11_0
metrics:
  - wer
model-index:
  - name: whisper-small-chinese-tw-minnan-hanzi
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_11_0
          type: common_voice_11_0
          config: nan-tw
          split: test
          args: nan-tw
        metrics:
          - name: Wer
            type: wer
            value: 86.71399594320486
---

# whisper-small-chinese-tw-minnan-hanzi

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the common_voice_11_0 dataset. It achieves the following results on the evaluation set:

- Loss: 1.2235
- Wer: 86.7140
- Cer: 62.5226
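
A minimal usage sketch with the transformers `pipeline` API. The hub id `TSukiLen/whisper-small-chinese-tw-minnan-hanzi` is inferred from the model name above, not stated in this card; adjust it if the checkpoint lives under a different namespace:

```python
from transformers import pipeline

# Hypothetical hub id inferred from the model name; adjust as needed.
asr = pipeline(
    "automatic-speech-recognition",
    model="TSukiLen/whisper-small-chinese-tw-minnan-hanzi",
)

# Whisper expects 16 kHz audio; for file-path inputs the pipeline
# decodes and resamples automatically (ffmpeg required).
result = asr("sample.wav")
print(result["text"])
```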

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
- mixed_precision_training: Native AMP
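
As a rough sketch, these settings map onto transformers' `Seq2SeqTrainingArguments` as follows. The `output_dir` value is a placeholder, not taken from the card:

```python
from transformers import Seq2SeqTrainingArguments

# Mirrors the hyperparameter list above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-chinese-tw-minnan-hanzi",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",        # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
    fp16=True,                  # "Native AMP" mixed-precision training
)
```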

### Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer     | Cer     |
|:-------------:|:-------:|:----:|:---------------:|:-------:|:-------:|
| 0.0802        | 3.6364  | 1000 | 1.1582          | 91.0751 | 80.0120 |
| 0.0014        | 7.2727  | 2000 | 1.1876          | 85.8012 | 61.7399 |
| 0.0002        | 10.9091 | 3000 | 1.1944          | 85.4970 | 61.7098 |
| 0.0002        | 14.5455 | 4000 | 1.2139          | 85.8012 | 61.8603 |
| 0.0001        | 18.1818 | 5000 | 1.2235          | 86.7140 | 62.5226 |
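
For reference, a minimal sketch of computing the reported metrics with the `evaluate` library, whose `wer` and `cer` metrics implement the standard edit-distance definitions. The strings are made-up placeholders, and results are scaled by 100 to match the card's reporting:

```python
import evaluate

wer = evaluate.load("wer")
cer = evaluate.load("cer")

# Made-up placeholder transcripts, not data from this card.
predictions = ["今仔日 天氣 真好"]
references = ["今仔日 天氣 真 好"]

print("WER:", 100 * wer.compute(predictions=predictions, references=references))
print("CER:", 100 * cer.compute(predictions=predictions, references=references))
```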

### Framework versions

- Transformers 4.46.3
- Pytorch 2.4.0+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3