
# salmon-whisper-large-smj-lr5e-5

This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on the [NbAiLab/salmon-asr-smj](https://huggingface.co/datasets/NbAiLab/salmon-asr-smj) (Lule Sámi) dataset.
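
A minimal inference sketch with the `transformers` pipeline API is shown below. The repository id is an assumption derived from the model name, and `sample.wav` is a placeholder; substitute the actual values:

```python
# Minimal ASR inference sketch; the repo id below is assumed from the
# model name and may need to be adjusted.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="NbAiLab/salmon-whisper-large-smj-lr5e-5",  # assumed repository id
)

# chunk_length_s enables chunked long-form transcription.
result = asr("sample.wav", chunk_length_s=30)
print(result["text"])
```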

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough `Seq2SeqTrainingArguments` equivalent is sketched after the list):

- learning_rate: 5e-05
- lr_scheduler_type: linear
- per_device_train_batch_size: 6
- total_train_batch_size_per_node: 48
- total_train_batch_size: 48
- total_optimization_steps: 60,000
- starting_optimization_step: 40,000
- finishing_optimization_step: 100,000
- num_train_dataset_workers: 32
- num_hosts: 1
- total_num_training_examples: 4,800,000
- steps_per_epoch: 1169
- num_beams: None
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.98
- adam_epsilon: 1e-06
- dropout: True
- bpe_dropout_probability: 0.2
- activation_dropout_probability: 0.1
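
For orientation, the list above maps roughly onto the Hugging Face `Seq2SeqTrainingArguments` API as sketched below. This is an illustration under stated assumptions, not the actual training script: the batch-size split across devices and the use of gradient accumulation are guesses, and BPE dropout has no `TrainingArguments` equivalent (it is applied at the tokenizer level).

```python
# Rough, assumption-laden reconstruction of the configuration above using
# Seq2SeqTrainingArguments; the original run used its own training script.
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
model.config.activation_dropout = 0.1  # activation_dropout_probability above
# dropout=True and bpe_dropout_probability=0.2 are applied elsewhere
# (model config / tokenizer) and are not TrainingArguments options.

args = Seq2SeqTrainingArguments(
    output_dir="salmon-whisper-large-smj-lr5e-5",
    learning_rate=5e-5,
    lr_scheduler_type="linear",
    per_device_train_batch_size=6,
    gradient_accumulation_steps=8,  # assumption: 6 * 8 = 48 total batch size on one device
    max_steps=60_000,               # total_optimization_steps
    weight_decay=0.01,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
)
```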

### Training results

| Step  | Validation loss | Train loss | Validation WER (%) | Validation CER (%) | Validation exact WER (%) | Validation exact CER (%) |
|------:|----------------:|-----------:|-------------------:|-------------------:|-------------------------:|-------------------------:|
| 0     | 4.2254          | 4.6413     | 112.7660           | 59.8700            | 108.1117                  | 62.0594                   |
| 10000 | 0.8720          | 0.3747     | 18.2181            | 5.2803             | 21.4096                   | 5.6762                    |
| 20000 | 1.1365          | 0.2741     | 15.2926            | 4.6304             | 18.0851                   | 5.0588                    |
| 30000 | 1.2561          | 0.2111     | 14.6277            | 4.0617             | 17.9521                   | 4.5011                    |
| 40000 | 33.1032         | 10.4733    | 100.0              | 100.0              | 100.0                     | 98.0681                   |
| 50000 | 3.0192          | 2.5972     | 100.7979           | 80.9301            | 101.3298                  | 79.8447                   |
| 60000 | 2.7909          | 2.0728     | 99.6011            | 79.8944            | 100.5319                  | 78.8688                   |
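
Note that all metrics degrade sharply from step 40,000 onward, which suggests the run diverged after step 30,000. For reference, WER/CER figures of the kind reported above can be computed with the Hugging Face `evaluate` library; the snippet below is a minimal sketch with hypothetical transcripts, not the exact evaluation script behind this card:

```python
# Minimal sketch of computing WER/CER with the `evaluate` library.
# The transcripts below are hypothetical placeholders, not data from
# NbAiLab/salmon-asr-smj.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["an example hypothesis from the model"]
references = ["an example reference transcript"]

# compute() returns an error rate; multiply by 100 to match the
# percentage-style numbers in the table above.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
cer = 100 * cer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```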

### Framework versions

- Transformers 4.35.0
- Datasets 2.14.6
- Tokenizers 0.14.1