---
license: apache-2.0
language: fr
library_name: transformers
thumbnail: null
tags:
- automatic-speech-recognition
- hf-asr-leaderboard
- robust-speech-event
- CTC
- Wav2vec2
datasets:
- common_voice
- mozilla-foundation/common_voice_11_0
- facebook/multilingual_librispeech
- polinaeterna/voxpopuli
- gigant/african_accented_french
metrics:
- wer
model-index:
- name: Fine-tuned Wav2Vec2 XLS-R 1B model for ASR in French
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 11.0
      type: mozilla-foundation/common_voice_11_0
      args: fr
    metrics:
    - name: Test WER
      type: wer
      value: 14.80
    - name: Test WER (+LM)
      type: wer
      value: 12.61
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Multilingual LibriSpeech (MLS)
      type: facebook/multilingual_librispeech
      args: french
    metrics:
    - name: Test WER
      type: wer
      value: 9.39
    - name: Test WER (+LM)
      type: wer
      value: 8.06
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: VoxPopuli
      type: polinaeterna/voxpopuli
      args: fr
    metrics:
    - name: Test WER
      type: wer
      value: 11.80
    - name: Test WER (+LM)
      type: wer
      value: 9.94
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: African Accented French
      type: gigant/african_accented_french
      args: fr
    metrics:
    - name: Test WER
      type: wer
      value: 22.98
    - name: Test WER (+LM)
      type: wer
      value: 20.73
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Robust Speech Event - Dev Data
      type: speech-recognition-community-v2/dev_data
      args: fr
    metrics:
    - name: Test WER
      type: wer
      value: 17.88
    - name: Test WER (+LM)
      type: wer
      value: 14.01
---

# Fine-tuned Wav2Vec2 XLS-R 1B model for ASR in French

![Model architecture](https://img.shields.io/badge/Model_Architecture-Wav2Vec2--CTC-lightgrey)
![Model size](https://img.shields.io/badge/Params-962M-lightgrey)
![Language](https://img.shields.io/badge/Language-French-lightgrey)

This model is a version of [facebook/wav2vec2-xls-r-1b](https://huggingface.co/facebook/wav2vec2-xls-r-1b) fine-tuned on French speech sampled at 16 kHz, using the train and validation splits of [Common Voice 11.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0), [Multilingual LibriSpeech](https://huggingface.co/datasets/facebook/multilingual_librispeech), [VoxPopuli](https://github.com/facebookresearch/voxpopuli), [Multilingual TEDx](http://www.openslr.org/100), [MediaSpeech](https://www.openslr.org/108), and [African Accented French](https://huggingface.co/datasets/gigant/african_accented_french). When using the model, make sure that your speech input is also sampled at 16 kHz.

*In general, we advise using [bofenghuang/asr-wav2vec2-ctc-french](https://huggingface.co/bofenghuang/asr-wav2vec2-ctc-french) instead, because it is smaller and performs better.*
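
The 16 kHz requirement above can be handled when the audio is loaded. The following is a minimal sketch, not the author's reference code: it assumes `torch`, `torchaudio`, and `transformers` are installed, it uses greedy CTC decoding without the language model, and the helper names `needs_resampling` and `transcribe` are hypothetical. `MODEL_ID` defaults to the recommended checkpoint named above; substitute the repository ID of the model you actually want to load.

```python
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz input

# Assumption: default to the checkpoint recommended in this card.
MODEL_ID = "bofenghuang/asr-wav2vec2-ctc-french"


def needs_resampling(sample_rate: int, target: int = TARGET_SAMPLE_RATE) -> bool:
    """Return True when audio at `sample_rate` must be resampled to `target`."""
    if sample_rate <= 0:
        raise ValueError(f"invalid sample rate: {sample_rate}")
    return sample_rate != target


def transcribe(path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with greedy CTC decoding (no language model)."""
    # Heavy dependencies are imported lazily so the helper above stays usable
    # without them.
    import torch
    import torchaudio
    from transformers import AutoModelForCTC, Wav2Vec2Processor

    # torchaudio.load returns a (channels, frames) tensor and the file's rate.
    waveform, sample_rate = torchaudio.load(path)
    waveform = waveform.mean(dim=0)  # downmix to mono

    if needs_resampling(sample_rate):
        waveform = torchaudio.functional.resample(
            waveform, orig_freq=sample_rate, new_freq=TARGET_SAMPLE_RATE
        )

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = AutoModelForCTC.from_pretrained(model_id)
    model.eval()

    inputs = processor(
        waveform.numpy(), sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: take the most likely token at each frame, then let the
    # tokenizer collapse repeats and remove CTC blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("sample.wav")`, where `sample.wav` is a placeholder path, returns the transcription as a string. The first call downloads the checkpoint, and the scores marked "+LM" in the metadata are not reproduced by this greedy decoding path.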