AqeelShafy7/Whisper-Sinhala_Audio_to_Text

Automatic Speech Recognition

Generated from Trainer

Whisper-Sinhala_Audio_to_Text

This model is a fine-tuned version of openai/whisper-small. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

Loss: 0.9038
Wer: 50.0822
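
For readers unfamiliar with the metric, WER (word error rate) is conventionally the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The card does not say how its figure was computed; the sketch below assumes the usual definition and assumes the value is reported as a percentage. The function name `word_error_rate` and the sample strings are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length * 100.

    Assumption: whitespace tokenization with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))  # 25.0
```

Under this definition, a WER near 50 means roughly one word-level error for every two reference words.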

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 50
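
Two of the values above follow from the others, and a short sketch can make that explicit: the total train batch size is the per-device batch size times the gradient accumulation steps, and a linear scheduler with warmup ramps the learning rate up over the first 500 steps and then decays it linearly to zero. The code below is a plain-Python illustration, not the training script. `NUM_DEVICES = 1` and `TOTAL_STEPS = 10_500` are assumptions: the first is consistent with 8 * 2 = 16, and the second is estimated from the results table (about 210 steps per epoch over 50 epochs).

```python
BASE_LR = 1e-05        # learning_rate from the card
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps from the card
PER_DEVICE_BATCH = 8   # train_batch_size from the card
GRAD_ACCUM_STEPS = 2   # gradient_accumulation_steps from the card
NUM_DEVICES = 1        # assumption: single device
TOTAL_STEPS = 10_500   # assumption: ~210 steps/epoch * 50 epochs


def effective_batch_size(per_device: int = PER_DEVICE_BATCH,
                         accum: int = GRAD_ACCUM_STEPS,
                         devices: int = NUM_DEVICES) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum * devices


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size())  # 16
for step in (0, 250, 500, 5500, 10_500):
    print(step, linear_lr(step))
```

With these assumptions the learning rate peaks at 1e-05 at step 500 and reaches zero at the estimated final step.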

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.0665	4.76	1000	0.5398	57.8125
0.0096	9.52	2000	0.6716	56.2089
0.0037	14.29	3000	0.7457	52.7549
0.0005	19.05	4000	0.8000	51.1513
0.002	23.81	5000	0.8057	51.6859
0.0005	28.57	6000	0.8150	50.3289
0.0005	33.33	7000	0.8445	51.0280
0.0	38.1	8000	0.8773	50.1234
0.0	42.86	9000	0.8944	50.1234
0.0	47.62	10000	0.9038	50.0822

Framework versions

Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Downloads last month: 405

Safetensors

Model size: 242M params
Tensor type: F32
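
As a rough guide to what 242M parameters in F32 imply, the weights alone occupy about four bytes per parameter. The sketch below is back-of-the-envelope arithmetic only: the parameter count is the rounded figure from the card, and the result excludes activations, optimizer state, and file-format overhead.

```python
PARAM_COUNT = 242_000_000  # "242M params" from the card (rounded)
BYTES_PER_F32 = 4          # a 32-bit float occupies 4 bytes


def weight_bytes(n_params: int, bytes_per_param: int = BYTES_PER_F32) -> int:
    """Approximate size of the raw weights in bytes."""
    return n_params * bytes_per_param


size = weight_bytes(PARAM_COUNT)
print(f"{size} bytes, about {size / 1e6:.0f} MB or {size / 2**30:.2f} GiB")
```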

Model tree for AqeelShafy7/Whisper-Sinhala_Audio_to_Text

Base model: openai/whisper-small
Finetuned (2399): this model
