---
language:
  - cs
license: apache-2.0
tags:
  - automatic-speech-recognition
  - mozilla-foundation/common_voice_8_0
  - generated_from_trainer
  - cs
  - robust-speech-event
  - model_for_talk
datasets:
  - mozilla-foundation/common_voice_8_0
model-index:
  - name: sammy786/wav2vec2-xlsr-czech
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Common Voice 8
          type: mozilla-foundation/common_voice_8_0
          args: cs
        metrics:
          - name: Test WER
            type: wer
            value: 11.533
          - name: Test CER
            type: cer
            value: 2.61
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Robust Speech Event - Dev Data
          type: speech-recognition-community-v2/dev_data
          args: cs
        metrics:
          - name: Test WER
            type: wer
            value: 11.533
          - name: Test CER
            type: cer
            value: 2.611
---

sammy786/wav2vec2-xlsr-czech

This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on the mozilla-foundation/common_voice_8_0 (cs) dataset. It achieves the following results on the evaluation set (10 percent of the train split merged with the other and dev splits):

  • Loss: 9.7555
  • Wer: 18.4731

Model description

"facebook/wav2vec2-xls-r-1b" was finetuned.

Intended uses & limitations

More information needed

Training and evaluation data

Training data - Common Voice Czech train.tsv, dev.tsv, invalidated.tsv and other.tsv

Training procedure

To create the train dataset, all available splits were appended and a 90-10 train/evaluation split was applied (a sketch follows below).
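
The exact preprocessing script is not included in the card; the following is a rough sketch, using the Hugging Face datasets library, of how the splits listed above could be appended and divided 90-10. Reusing seed 13 for the split is an assumption based on the training seed reported below.

```python
# Sketch only: append the Common Voice 8.0 Czech splits and hold out
# 10 percent for evaluation. Reusing seed 13 for the split is an assumption.
from datasets import concatenate_datasets, load_dataset

splits = ["train", "validation", "invalidated", "other"]
parts = [
    load_dataset("mozilla-foundation/common_voice_8_0", "cs", split=s, use_auth_token=True)
    for s in splits
]
combined = concatenate_datasets(parts)

split = combined.train_test_split(test_size=0.1, seed=13)
train_ds, eval_ds = split["train"], split["test"]
```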

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 0.000045637994662983496
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 13
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 7
  • mixed_precision_training: Native AMP
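
Expressed as transformers TrainingArguments, the settings above could look roughly like the sketch below. The output directory is a placeholder, and the Adam betas and epsilon listed above are the library defaults, so they are not set explicitly.

```python
# Sketch of the reported hyperparameters as TrainingArguments; output_dir is a
# placeholder, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-xlsr-czech",      # placeholder
    learning_rate=0.000045637994662983496,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=13,
    gradient_accumulation_steps=4,           # 8 x 4 = 32 effective batch size
    lr_scheduler_type="cosine_with_restarts",
    warmup_steps=500,
    num_train_epochs=7,
    fp16=True,                               # Native AMP mixed precision
)
```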

Framework versions

  • Transformers 4.16.0.dev0
  • Pytorch 1.10.0+cu102
  • Datasets 1.17.1.dev0
  • Tokenizers 0.10.3

Evaluation Commands

  1. To evaluate on mozilla-foundation/common_voice_8_0 with the test split:
```bash
python eval.py --model_id sammy786/wav2vec2-xlsr-czech --dataset mozilla-foundation/common_voice_8_0 --config cs --split test
```
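
If eval.py is not available, a rough equivalent can be sketched with the datasets and jiwer libraries. The simple lowercasing normalisation below is an assumption and may not match the original evaluation script exactly.

```python
# Rough stand-in for the eval command above: compute corpus WER/CER on the
# Common Voice 8.0 Czech test split. Text normalisation here is simplified.
import torch
from datasets import Audio, load_dataset
from jiwer import cer, wer
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "sammy786/wav2vec2-xlsr-czech"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id).eval()

test = load_dataset("mozilla-foundation/common_voice_8_0", "cs", split="test", use_auth_token=True)
test = test.cast_column("audio", Audio(sampling_rate=16_000))

references, predictions = [], []
for sample in test:
    inputs = processor(sample["audio"]["array"], sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    pred_ids = torch.argmax(logits, dim=-1)
    predictions.append(processor.batch_decode(pred_ids)[0].lower())
    references.append(sample["sentence"].lower())

print("WER:", wer(references, predictions))
print("CER:", cer(references, predictions))
```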