---
language:
- el
license: apache-2.0
tags:
- whisper-event
- generated_from_trainer
- whisper-large
- mozilla-foundation/common_voice_11_0
- greek
datasets:
- mozilla-foundation/common_voice_11_0
- google/fleurs
metrics:
- wer
model-index:
- name: whisper-lg-el-intlv-xs-2
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: mozilla-foundation/common_voice_11_0 el
      type: mozilla-foundation/common_voice_11_0
      config: el
      split: test
    metrics:
    - name: Wer
      type: wer
      value: 9.50037147102526
---

# whisper-lg-el-intlv-xs-2

This model is a fine-tuned version of [farsipal/whisper-lg-el-intlv-xs](https://huggingface.co/farsipal/whisper-lg-el-intlv-xs) on the interleaved mozilla-foundation/common_voice_11_0 (el) and google/fleurs (el_gr) datasets.
It achieves the following results on the evaluation set:
- Loss: 0.2872
- Wer: 9.5004

## Model description

The model was trained for Greek speech transcription on two interleaved datasets, Common Voice 11.0 (el) and Google FLEURS (el_gr).

## Intended uses & limitations

The model is intended for automatic speech transcription in the Greek language; a minimal usage sketch follows.
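
A minimal inference sketch (not part of the original card), assuming the checkpoint is published on the Hub under the repository name `farsipal/whisper-lg-el-intlv-xs-2` and a recent `transformers` release that accepts `language`/`task` in `generate_kwargs`:

```python
# Hypothetical usage example for this checkpoint with the transformers ASR pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="farsipal/whisper-lg-el-intlv-xs-2",  # assumed Hub id, matching the model_index_name
    chunk_length_s=30,                          # Whisper operates on 30-second windows
)

# Force Greek transcription rather than language auto-detection or translation.
result = asr(
    "sample_greek_audio.wav",  # hypothetical input file
    generate_kwargs={"language": "greek", "task": "transcribe"},
)
print(result["text"])
```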

## Training and evaluation data

Training was performed on the two interleaved datasets described above. Evaluation was performed on the Common Voice 11.0 (el) test split only.
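
As a rough illustration (not the actual training script), the two corpora can be interleaved with the `datasets` library along the lines of the `--dataset_name`, `--dataset_config_name`, and `--train_split_name` arguments listed under "Training procedure" below; note that Common Voice 11.0 may require accepting the dataset terms on the Hub:

```python
# Sketch of building the interleaved training set; column names and configs
# mirror the CLI arguments below, but this is an assumption-laden illustration.
from datasets import Audio, interleave_datasets, load_dataset

cv = load_dataset("mozilla-foundation/common_voice_11_0", "el", split="train+validation")
fleurs = load_dataset("google/fleurs", "el_gr", split="train+validation")

# Harmonize the text column name and audio sampling rate before interleaving.
cv = cv.rename_column("sentence", "transcription")
cv = cv.cast_column("audio", Audio(sampling_rate=16_000))
fleurs = fleurs.cast_column("audio", Audio(sampling_rate=16_000))

keep = ["audio", "transcription"]
cv = cv.remove_columns([c for c in cv.column_names if c not in keep])
fleurs = fleurs.remove_columns([c for c in fleurs.column_names if c not in keep])

train = interleave_datasets([cv, fleurs])
```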

## Training procedure

The run was configured with the following arguments:
```
                --model_name_or_path   'farsipal/whisper-lg-el-intlv-xs' \
                --model_revision   main \
                --do_train   True \
                --do_eval   True \
                --use_auth_token   False \
                --freeze_feature_encoder   False \
                --freeze_encoder   False \
                --model_index_name   'whisper-lg-el-intlv-xs-2' \
                --dataset_name 'mozilla-foundation/common_voice_11_0,google/fleurs' \
                --dataset_config_name 'el,el_gr' \
                --train_split_name  'train+validation,train+validation' \
                --eval_split_name   'test,-' \
                --text_column_name  'sentence,transcription' \
                --audio_column_name 'audio,audio' \
                --streaming   False \
                --max_duration_in_seconds   30 \
                --do_lower_case   False \
                --do_remove_punctuation   False \
                --do_normalize_eval   True \
                --language   greek \
                --task transcribe \
                --shuffle_buffer_size   500 \
                --output_dir   './data/finetuningRuns/whisper-lg-el-intlv-xs-2' \
                --overwrite_output_dir   True \
                --per_device_train_batch_size   8 \
                --gradient_accumulation_steps  4 \
                --learning_rate   3.5e-6 \
                --dropout         0.15 \
                --attention_dropout 0.05 \
                --warmup_steps   500 \
                --max_steps   5000 \
                --eval_steps   1000 \
                --gradient_checkpointing   True \
                --cache_dir   '~/.cache' \
                --fp16   True \
                --evaluation_strategy   steps \
                --per_device_eval_batch_size   8 \
                --predict_with_generate   True \
                --generation_max_length   225 \
                --save_steps   1000 \
                --logging_steps   25 \
                --report_to   tensorboard \
                --load_best_model_at_end   True \
                --metric_for_best_model   wer \
                --greater_is_better   False \
                --push_to_hub   False  \
                --dataloader_num_workers 6
```

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3.5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.0813        | 2.49  | 1000 | 0.2147          | 10.8284 |
| 0.0379        | 4.98  | 2000 | 0.2439          | 10.0111 |
| 0.0195        | 7.46  | 3000 | 0.2767          | 9.8811  |
| 0.0126        | 9.95  | 4000 | 0.2872          | 9.5004  |
| 0.0103        | 12.44 | 5000 | 0.3021          | 9.6954  |


### Framework versions

- Transformers 4.26.0.dev0
- Pytorch 1.13.0+cu117
- Datasets 2.8.1.dev0
- Tokenizers 0.13.2