whisper-large-v3-myanmar

This model is a fine-tuned version of openai/whisper-large-v3 on the chuuhtetnaing/myanmar-speech-dataset-openslr-80 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1752
  • WER: 54.8976
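
For reference, WER (word error rate) is the word-level edit distance between the model output and the reference transcript, divided by the number of reference words, and it is reported here as a percentage. The card does not say which implementation produced the figure above, so the following is only a minimal pure-Python sketch of the metric itself. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, splitting both strings on whitespace."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution plus one deletion against four reference words
print(word_error_rate("a b c d", "a x c"))
```

Because the score counts whole words, it depends on how the transcripts are segmented into words, which matters for Burmese text where spacing conventions vary.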

Usage

from datasets import Audio, load_dataset
from transformers import pipeline

# Load a sample audio clip, resampled to the 16 kHz rate Whisper expects
dataset = load_dataset("chuuhtetnaing/myanmar-speech-dataset-openslr-80")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
test_dataset = dataset['test']
input_speech = test_dataset[42]['audio']

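# Build a speech-recognition pipeline from the fine-tuned checkpoint
pipe = pipeline('automatic-speech-recognition', model='chuuhtetnaing/whisper-large-v3-myanmar')

# Force Burmese transcription instead of relying on language detection
output = pipe(input_speech, generate_kwargs={"language": "myanmar", "task": "transcribe"})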
print(output['text']) # ကျမ ပြည်ပ မှာ ပညာသင် တော့ စာမေးပွဲ ကို တပတ်တခါ စစ်တယ်

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 20
  • eval_batch_size: 20
  • seed: 42
  • gradient_accumulation_steps: 3
  • total_train_batch_size: 60
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 30
  • mixed_precision_training: Native AMP
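
The relationship between these values can be checked with a little arithmetic. The sketch below assumes a single training device (consistent with 20 × 3 = 60) and takes the total of 1140 optimizer steps from the last row of the training results. The helper names are hypothetical, and the schedule is a plain re-implementation of linear warmup followed by linear decay, not code taken from the training script.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_decay_lr(step: int, peak_lr: float = 3e-4,
                           warmup_steps: int = 200, total_steps: int = 1140) -> float:
    """Learning rate at a given optimizer step under a linear schedule with warmup."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr over the warmup steps
        return peak_lr * step / warmup_steps
    # Then decay linearly from peak_lr to 0 at total_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(20, 3))  # matches total_train_batch_size
for step in (0, 100, 200, 670, 1140):
    print(step, linear_warmup_decay_lr(step))
```

Under these assumptions the learning rate peaks at 0.0003 at step 200 and falls back to zero at step 1140.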

Training results

Training Loss  Epoch  Step  Validation Loss  WER
0.9771         1.0    42    0.7598           100.0
0.3477         2.0    84    0.2140           89.8931
0.2244         3.0    126   0.1816           79.0294
0.1287         4.0    168   0.1510           71.9947
0.1029         5.0    210   0.1575           77.8718
0.0797         6.0    252   0.1315           70.5254
0.0511         7.0    294   0.1143           70.5699
0.03           8.0    336   0.1154           68.1656
0.0211         9.0    378   0.1289           69.1897
0.0151         10.0   420   0.1318           66.7854
0.0113         11.0   462   0.1478           69.1451
0.0079         12.0   504   0.1484           66.2066
0.0053         13.0   546   0.1389           65.0935
0.0031         14.0   588   0.1479           64.3811
0.0014         15.0   630   0.1611           64.8264
0.001          16.0   672   0.1627           63.3571
0.0012         17.0   714   0.1546           65.0045
0.0006         18.0   756   0.1566           64.5147
0.0006         20.0   760   0.1581           64.6928
0.0002         21.0   798   0.1621           63.9804
0.0003         22.0   836   0.1664           60.8638
0.0002         23.0   874   0.1663           58.5040
0.0            24.0   912   0.1699           55.8326
0.0            25.0   950   0.1715           55.0312
0.0            26.0   988   0.1730           54.9866
0.0            27.0   1026  0.1740           54.8976
0.0            28.0   1064  0.1747           54.8976
0.0            29.0   1102  0.1751           54.8976
0.0            30.0   1140  0.1752           54.8976
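
In these results the validation loss reaches its minimum early, while WER keeps improving until the final epochs. A short sketch for locating both points, with the rows copied from the training results (the variable names are illustrative only):

```python
# (epoch, step, validation_loss, wer) rows copied from the training results
rows = [
    (1.0, 42, 0.7598, 100.0), (2.0, 84, 0.2140, 89.8931),
    (3.0, 126, 0.1816, 79.0294), (4.0, 168, 0.1510, 71.9947),
    (5.0, 210, 0.1575, 77.8718), (6.0, 252, 0.1315, 70.5254),
    (7.0, 294, 0.1143, 70.5699), (8.0, 336, 0.1154, 68.1656),
    (9.0, 378, 0.1289, 69.1897), (10.0, 420, 0.1318, 66.7854),
    (11.0, 462, 0.1478, 69.1451), (12.0, 504, 0.1484, 66.2066),
    (13.0, 546, 0.1389, 65.0935), (14.0, 588, 0.1479, 64.3811),
    (15.0, 630, 0.1611, 64.8264), (16.0, 672, 0.1627, 63.3571),
    (17.0, 714, 0.1546, 65.0045), (18.0, 756, 0.1566, 64.5147),
    (20.0, 760, 0.1581, 64.6928), (21.0, 798, 0.1621, 63.9804),
    (22.0, 836, 0.1664, 60.8638), (23.0, 874, 0.1663, 58.5040),
    (24.0, 912, 0.1699, 55.8326), (25.0, 950, 0.1715, 55.0312),
    (26.0, 988, 0.1730, 54.9866), (27.0, 1026, 0.1740, 54.8976),
    (28.0, 1064, 0.1747, 54.8976), (29.0, 1102, 0.1751, 54.8976),
    (30.0, 1140, 0.1752, 54.8976),
]

# min() returns the first matching row when several rows tie
best_loss = min(rows, key=lambda r: r[2])
best_wer = min(rows, key=lambda r: r[3])

print(f"lowest validation loss: {best_loss[2]} at epoch {best_loss[0]} (WER {best_loss[3]})")
print(f"lowest WER: {best_wer[3]} first reached at epoch {best_wer[0]} (validation loss {best_wer[2]})")
```

The two criteria point to different checkpoints, so the choice between them depends on whether loss or transcription accuracy matters more for the intended use.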

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.1+cu121
  • Datasets 2.14.5
  • Tokenizers 0.15.1