metadata
language: fr
license: apache-2.0
library_name: transformers
tags:
  - automatic-speech-recognition
  - hf-asr-leaderboard
  - robust-speech-event
  - CTC
  - Wav2vec2
datasets:
  - common_voice
  - mozilla-foundation/common_voice_11_0
  - facebook/multilingual_librispeech
  - polinaeterna/voxpopuli
  - gigant/african_accented_french
metrics:
  - wer
base_model: facebook/wav2vec2-xls-r-1b
model-index:
  - name: Fine-tuned Wav2Vec2 XLS-R 1B model for ASR in French
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Common Voice 11.0
          type: mozilla-foundation/common_voice_11_0
          args: fr
        metrics:
          - type: wer
            value: 14.8
            name: Test WER
          - type: wer
            value: 12.61
            name: Test WER (+LM)
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Multilingual LibriSpeech (MLS)
          type: facebook/multilingual_librispeech
          args: french
        metrics:
          - type: wer
            value: 9.39
            name: Test WER
          - type: wer
            value: 8.06
            name: Test WER (+LM)
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: VoxPopuli
          type: polinaeterna/voxpopuli
          args: fr
        metrics:
          - type: wer
            value: 11.8
            name: Test WER
          - type: wer
            value: 9.94
            name: Test WER (+LM)
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: African Accented French
          type: gigant/african_accented_french
          args: fr
        metrics:
          - type: wer
            value: 22.98
            name: Test WER
          - type: wer
            value: 20.73
            name: Test WER (+LM)
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Robust Speech Event - Dev Data
          type: speech-recognition-community-v2/dev_data
          args: fr
        metrics:
          - type: wer
            value: 17.88
            name: Test WER
          - type: wer
            value: 14.01
            name: Test WER (+LM)

Fine-tuned Wav2Vec2 XLS-R 1B model for ASR in French


This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b for French, trained on the train and validation splits of Common Voice 11.0, Multilingual LibriSpeech, VoxPopuli, Multilingual TEDx, MediaSpeech, and African Accented French, using speech audio sampled at 16 kHz. When using the model, make sure that your speech input is also sampled at 16 kHz.
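The sample-rate requirement above can be sketched as a small preprocessing step. This is a minimal, dependency-free illustration and not the author's pipeline: `resample_linear` and `TARGET_SAMPLE_RATE` are hypothetical names, the input is assumed to be a mono waveform given as a sequence of floats, and linear interpolation is used only for simplicity. It applies no low-pass filter, so downsampling this way can introduce aliasing; for real use, a band-limited resampler such as `torchaudio.functional.resample` is likely the better choice.

```python
TARGET_SAMPLE_RATE = 16_000  # the rate the model card says the input must have


def resample_linear(samples, orig_rate, target_rate=TARGET_SAMPLE_RATE):
    """Resample a mono waveform (sequence of floats) by linear interpolation.

    Illustrative only: no anti-aliasing filter is applied.
    """
    if orig_rate <= 0 or target_rate <= 0:
        raise ValueError("sample rates must be positive")
    if orig_rate == target_rate or len(samples) < 2:
        return list(samples)

    # Keep the duration constant: output length scales with the rate ratio.
    n_out = max(1, round(len(samples) * target_rate / orig_rate))
    step = orig_rate / target_rate  # input samples advanced per output sample

    out = []
    last = len(samples) - 1
    for i in range(n_out):
        pos = min(i * step, last)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Weighted average of the two neighbouring input samples.
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

For example, one second of audio recorded at 48 kHz (48,000 samples) comes out as 16,000 samples, which can then be passed to the model's feature extractor with a sampling rate of 16,000.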

In general, we recommend using bofenghuang/asr-wav2vec2-ctc-french instead, because it is a smaller model and performs better.