whisper small finetuned speed augmentation TLT non-native child speech

This model is a fine-tuned version of openai/whisper-small on the LTL2021 dataset. At the final training step (4000), it achieves the following results on the evaluation set:

Loss: 0.4043
Wer: 19.4862

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 2048
training_steps: 4000
mixed_precision_training: Native AMP
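
The hyperparameters above imply two derived quantities: the effective batch size per optimizer step and the learning rate at any given step. The sketch below reproduces both in plain Python. It assumes a single training device (the card does not state the device count, but 32 x 2 matches the listed total of 64) and that the linear scheduler ramps up from zero during warmup and then decays linearly to zero at the last step. The constant and function names are illustrative, not taken from the training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 2048
TRAINING_STEPS = 4000

# Assumption: one device, which is what makes 32 * 2 equal the listed 64.
NUM_DEVICES = 1


def total_train_batch_size() -> int:
    """Number of examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 1024, 2048, 3024, 4000):
        print(step, linear_lr(step))
```

Under these assumptions the peak rate of 1e-05 is reached only at step 2048, slightly past the midpoint of the 4000-step run, so more than half of training happens during warmup.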

Training Loss	Epoch	Step	Validation Loss	Wer
5.2971	1.6087	500	2.9974	17.8041
3.2025	3.2158	1000	1.8702	17.0941
1.7907	4.8245	1500	1.0121	18.1471
0.8310	6.4316	2000	0.4578	18.5425
0.6474	8.0386	2500	0.4194	18.9138
0.6284	9.6473	3000	0.4095	19.6063
0.6056	11.2544	3500	0.4054	20.4168
0.5855	12.8631	4000	0.4043	19.4862
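
For readers unfamiliar with the Wer column, the following is a minimal sketch of the standard word error rate: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The card does not say which implementation or text normalization produced the numbers above, and reading them as percentages is an assumption. The function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, the value can exceed 100 when the hypothesis is much longer than the reference.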

Safetensors

Model size: 0.2B params
Tensor type: F32
