metadata
language:
  - pt
license: apache-2.0
tags:
  - whisper-event
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_11_0
metrics:
  - wer
model-index:
  - name: Whisper Large Portuguese
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: mozilla-foundation/common_voice_11_0 pt
          type: mozilla-foundation/common_voice_11_0
          config: pt
          split: test
          args: pt
        metrics:
          - name: WER
            type: wer
            value: 4.816664144852979
          - name: CER
            type: cer
            value: 1.6052355927195898

Whisper Large Portuguese

This model is a fine-tuned version of openai/whisper-large-v2 for Portuguese, trained on the train and validation splits of Common Voice 11. Not all of the validation split was used for training: I held out 1k samples from it for evaluation during fine-tuning. When using this model, make sure that your speech input is sampled at 16 kHz.

Usage


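from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline.
transcriber = pipeline(
  "automatic-speech-recognition",
  model="jonatasgrosman/whisper-large-pt-cv11"
)

# Force Portuguese transcription instead of language detection or translation.
transcriber.model.config.forced_decoder_ids = (
  transcriber.tokenizer.get_decoder_prompt_ids(
    language="pt",
    task="transcribe"
  )
)

# The pipeline returns a dict; the transcribed text is under the "text" key.
transcription = transcriber("path/to/my_audio.wav")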

Evaluation

Common Voice 11

| Model | CER (%) | WER (%) |
| --- | --- | --- |
| jonatasgrosman/whisper-large-pt-cv11 | 2.52 | 9.56 |
| jonatasgrosman/whisper-large-pt-cv11 + text normalization | 1.60 | 4.82 |
| openai/whisper-large-v2 | 4.32 | 13.92 |
| openai/whisper-large-v2 + text normalization | 2.84 | 7.02 |
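For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how CER and WER are commonly defined: edit distance over characters or words, divided by the reference length. It is not the evaluation script behind the numbers in this section, and `simple_normalize` is a hypothetical stand-in (lowercasing and ASCII punctuation removal), not the text normalization used for the reported results.

```python
import string


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for one pair (reference must be non-empty)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent for one pair (reference must be non-empty)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def simple_normalize(text):
    """Hypothetical normalizer: lowercase, drop ASCII punctuation, collapse spaces."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())
```

Comparing `wer("Olá, mundo!", "olá mundo")` with the same call on `simple_normalize`d inputs illustrates why normalization lowers the scores: casing and punctuation differences stop counting as errors. Corpus-level scores are usually computed from total edits over total reference length rather than by averaging per-sample values.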

Fleurs

| Model | CER (%) | WER (%) |
| --- | --- | --- |
| jonatasgrosman/whisper-large-pt-cv11 | 4.88 | 12.08 |
| jonatasgrosman/whisper-large-pt-cv11 + text normalization | 5.46 | 8.57 |
| jonatasgrosman/whisper-large-pt-cv11 + text normalization + removal of samples with numbers | 3.36 | 6.05 |
| openai/whisper-large-v2 | 3.52 | 10.55 |
| openai/whisper-large-v2 + text normalization | 4.19 | 7.04 |
| openai/whisper-large-v2 + text normalization + removal of samples with numbers | 3.56 | 6.15 |
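The "removal of samples with numbers" rows could be reproduced along the lines of the sketch below. It assumes that a sample "with numbers" is one whose reference transcript contains at least one digit; the source does not spell out the exact rule, so treat this as one plausible reading. The names `has_number` and `drop_samples_with_numbers` are hypothetical.

```python
import re

_DIGIT = re.compile(r"\d")


def has_number(text):
    """Return True if the text contains at least one digit character."""
    return _DIGIT.search(text) is not None


def drop_samples_with_numbers(samples):
    """Keep only (reference, hypothesis) pairs whose reference has no digits.

    Assumption: filtering is decided on the reference text alone.
    """
    return [(ref, hyp) for ref, hyp in samples if not has_number(ref)]
```

For example, given the pairs `("tenho 2 gatos", "tenho dois gatos")` and `("bom dia", "bom dia")`, only the second pair would remain for scoring.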