
Whisper Large v3 Turbo (Albanian Fine-Tuned) - v2

This is a fine-tuned version of the Whisper Large v3 Turbo model, optimized for Albanian speech-to-text transcription. It achieves a Word Error Rate (WER) of 6.98% on a held-out evaluation set.

Model Details

  • Base Model: openai/whisper-large-v3-turbo
  • Language: Albanian (sq)
  • Parameters: ~809M (safetensors, F32)
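
Usage

The model can be loaded through the transformers automatic-speech-recognition pipeline. Below is a minimal sketch: the audio file name is a placeholder, and the decoding options are illustrative rather than the author's exact inference setup.

```python
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned checkpoint into an ASR pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="Flutra/whisper-large-v3-turbo-sq-v2",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device=device,
)

# "sample_albanian.wav" is a placeholder path; forcing the language to
# Albanian ("sq") avoids Whisper's automatic language detection.
result = asr("sample_albanian.wav", generate_kwargs={"language": "sq"})
print(result["text"])
```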

Training Dataset

  • Source: Mozilla Common Voice version 19 (available on the Hugging Face Hub as Kushtrim/common_voice_19_sq)
  • Description: Audio clips of spoken Albanian, ranging from 5 to 30 seconds in length.
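
The dataset can be pulled directly from the Hub with the datasets library. A minimal sketch; the split layout of the repository is not documented here, so no split names are assumed:

```python
from datasets import load_dataset

# Download the Albanian Common Voice 19 dataset used for fine-tuning.
cv_sq = load_dataset("Kushtrim/common_voice_19_sq")
print(cv_sq)  # inspect the available splits and features
```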

Training Details

The model was fine-tuned on an NVIDIA A100 GPU (40GB) using the transformers library. Below are the key training arguments:

| Argument                    | Value   | Description                                               |
|-----------------------------|---------|-----------------------------------------------------------|
| per_device_train_batch_size | 8       | Training batch size per GPU                               |
| per_device_eval_batch_size  | 2       | Evaluation batch size per GPU                             |
| gradient_accumulation_steps | 1       | Steps to accumulate gradients (effective batch size = 8)  |
| num_train_epochs            | 3       | Number of training epochs                                 |
| learning_rate               | 1e-5    | Initial learning rate                                     |
| warmup_steps                | 300     | Number of learning-rate warmup steps                      |
| evaluation_strategy         | "steps" | Evaluate every eval_steps during training                 |
| eval_steps                  | 250     | Frequency of evaluation (every 250 steps)                 |
| fp16                        | True    | Use mixed-precision (16-bit float) training               |

  • Total Steps: ~3,540 (completed 3,500)
  • Hardware: NVIDIA A100 (40GB)
  • Libraries:
    • transformers==4.38.2
    • torch==2.2.1
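
For reference, the arguments from the table above can be expressed as a transformers Seq2SeqTrainingArguments object. This is a hedged reconstruction, not the author's actual training script; output_dir and anything not listed in the table are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-turbo-sq-v2",  # assumed, not stated in the card
    per_device_train_batch_size=8,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=1,   # effective batch size = 8
    num_train_epochs=3,
    learning_rate=1e-5,
    warmup_steps=300,
    evaluation_strategy="steps",     # valid for transformers==4.38.2
    eval_steps=250,
    fp16=True,                       # mixed-precision training
)
```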

Performance

| Step | Training Loss | Validation Loss | WER    |
|------|---------------|-----------------|--------|
| 250  | 0.4744        | 0.3991          | 34.03% |
| 500  | 0.3421        | 0.3426          | 30.42% |
| 750  | 0.2871        | 0.2808          | 26.09% |
| 1000 | 0.2401        | 0.2258          | 21.31% |
| 1250 | 0.1809        | 0.1998          | 19.15% |
| 1500 | 0.1142        | 0.1827          | 17.33% |
| 1750 | 0.1051        | 0.1611          | 15.19% |
| 2000 | 0.0930        | 0.1464          | 13.82% |
| 2250 | 0.0827        | 0.1313          | 11.79% |
| 2500 | 0.0420        | 0.1139          | 10.50% |
| 2750 | 0.0330        | 0.1124          | 9.58%  |
| 3000 | 0.0255        | 0.1006          | 8.38%  |
| 3250 | 0.0256        | 0.0905          | 7.48%  |
| 3500 | 0.0204        | 0.0889          | 6.98%  |

  • Final WER: 6.98% (at step 3500)
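
WER figures like these are typically computed with the wer metric from the Hugging Face evaluate library. A minimal sketch; the strings below are illustrative examples, not the actual evaluation data:

```python
import evaluate

# Word Error Rate: edit distance over words between hypothesis and reference.
wer_metric = evaluate.load("wer")

predictions = ["pershendetje si jeni"]       # model output (illustrative)
references = ["pershendetje si jeni sot"]    # ground-truth transcript (illustrative)

print(wer_metric.compute(predictions=predictions, references=references))
```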