Whisper Large v3 Turbo TR - Selim Çavaş

This model is a fine-tuned version of openai/whisper-large-v3-turbo on the Turkish subset of the Common Voice 17.0 dataset. It achieves the following results on the evaluation set (see the note after the list for how WER is computed):

  • Loss: 0.3123
  • WER: 18.9229
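
WER (word error rate) here follows the common convention of being reported ×100, so 18.9229 corresponds to roughly 19 word errors per 100 reference words. A minimal sketch of how such a score is typically computed with the evaluate library (the example strings are hypothetical):

import evaluate

wer_metric = evaluate.load("wer")

# Hypothetical prediction/reference pair, for illustration only.
predictions = ["merhaba dünya nasılsın"]
references = ["merhaba dünya nasılsınız"]

# evaluate returns WER as a fraction; scale by 100 to match the reported number.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")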

Intended uses & limitations

This model can be used in various applications, including:

  • Transcription of Turkish speech
  • Voice commands
  • Automatic subtitling for Turkish videos

How To Use

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

# Use the GPU (with half precision) when available, otherwise fall back to CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "selimc/whisper-large-v3-turbo-turkish"

# Load the fine-tuned weights and move them to the selected device.
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)

# The processor bundles the tokenizer and the audio feature extractor.
processor = AutoProcessor.from_pretrained(model_id)

# Build an ASR pipeline that chunks long audio into 30-second windows
# and returns timestamps alongside the transcript.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)

# Transcribe a local audio file and print the recognized text.
result = pipe("test.mp3")
print(result["text"])
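
Whisper normally detects the spoken language automatically. To pin decoding to Turkish explicitly, the pipeline accepts generation arguments; a small sketch using the standard transformers generate_kwargs mechanism:

# Force Turkish decoding instead of relying on automatic language detection.
result = pipe("test.mp3", generate_kwargs={"language": "turkish"})
print(result["text"])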

Training

Due to Colab GPU constraints, I was able to train on only 25% of the Turkish data available in the Common Voice 17.0 dataset. 😔

Got a GPU to spare? Let's collaborate and take this model to the next level! 🚀
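
For reference, a 25% slice of the Turkish training split can be taken directly with the datasets library. This is a sketch; the exact split and sampling used for this model are not specified:

from datasets import load_dataset

# Assumption: take the first 25% of the Turkish train split.
# Note: Common Voice on the Hub is gated, so you may need to accept the
# dataset terms and authenticate (e.g. via huggingface-cli login) first.
common_voice = load_dataset(
    "mozilla-foundation/common_voice_17_0", "tr", split="train[:25%]"
)
print(common_voice)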

Training hyperparameters

The following hyperparameters were used during training; a sketch of the corresponding training arguments follows the list:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
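
These settings map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a minimal sketch assuming the standard Seq2SeqTrainer fine-tuning setup; output_dir and the evaluation cadence are assumptions (the 1000-step cadence matches the results table below), and the Adam settings above are the library defaults:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-turbo-turkish",  # assumption
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=4000,
    fp16=True,  # Native AMP mixed-precision training
    eval_strategy="steps",  # assumption: evaluate every 1000 steps
    eval_steps=1000,
    predict_with_generate=True,  # generate transcripts so WER can be computed
)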

Training results

| Training Loss | Epoch | Step | Validation Loss | WER     |
|---------------|-------|------|-----------------|---------|
| 0.1223        | 1.6   | 1000 | 0.3187          | 24.4415 |
| 0.0501        | 3.2   | 2000 | 0.3123          | 20.9720 |
| 0.0226        | 4.8   | 3000 | 0.3010          | 19.6183 |
| 0.0010        | 6.4   | 4000 | 0.3123          | 18.9229 |

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.1