
dgx2_w2v2_large_distill_noisy_teacher_mozilla_epochs_50_batch_16

This model is a fine-tuned version of rohitp1/kkkh_w2v2_large_finetune_teacher_babble_noise_mozilla_50_epochs_batch_16. It achieves the following results on the evaluation set (a brief note on computing WER follows the list):

  • Loss: 21652.1836
  • Wer: 0.2592
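
The WER reported above is the standard word error rate. As a minimal, purely illustrative sketch (the actual evaluation script used for this model is not included in the card), such a score can be computed with the Hugging Face `evaluate` library:

```python
# Illustrative WER computation with the `evaluate` library; the exact
# evaluation script used for this model is not documented in the card.
import evaluate

wer_metric = evaluate.load("wer")

# Toy transcripts (placeholders, not from the evaluation set).
predictions = ["the cat sat on the mat", "hello word"]
references = ["the cat sat on a mat", "hello world"]

# WER = (substitutions + insertions + deletions) / number of reference words
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```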

Model description

More information needed

Intended uses & limitations

More information needed
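
No usage guidance is given in the card. As a minimal sketch, assuming the checkpoint exposes the standard Wav2Vec2 CTC interface for automatic speech recognition, inference could look like the following (the repository id and audio path are assumptions, not confirmed by the card):

```python
# Hedged usage sketch: assumes the checkpoint follows the standard
# Wav2Vec2 CTC interface. The repository id and audio file below are
# placeholders / assumptions, not confirmed by the model card.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="rohitp1/dgx2_w2v2_large_distill_noisy_teacher_mozilla_epochs_50_batch_16",
)

print(asr("sample.wav")["text"])
```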

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 256
  • total_train_batch_size: 4096
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 100
  • mixed_precision_training: Native AMP
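
These values correspond to standard Transformers `TrainingArguments` fields. The sketch below is a hypothetical reconstruction of that configuration; the actual training script, distillation objective, and data pipeline are not included in this card.

```python
# Hypothetical reconstruction of the configuration above using
# transformers.TrainingArguments; the actual training script and
# distillation objective are not included in this model card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dgx2_w2v2_large_distill_noisy_teacher_mozilla_epochs_50_batch_16",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=256,  # 16 * 256 = 4096 total train batch size
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=100,
    fp16=True,                        # "Native AMP" mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the optimizer defaults.
)
```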

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 74637.3574    | 7.31  | 250  | 4331.1958       | 0.2791 |
| 75858.376     | 14.63 | 500  | 7166.9727       | 0.2759 |
| 76494.272     | 21.94 | 750  | 9417.4209       | 0.2713 |
| 76375.128     | 29.26 | 1000 | 13408.2549      | 0.2680 |
| 74149.512     | 36.57 | 1250 | 14529.0449      | 0.2657 |
| 73472.352     | 43.89 | 1500 | 14684.6582      | 0.2643 |
| 72301.832     | 51.2  | 1750 | 15828.4707      | 0.2634 |
| 71340.256     | 58.51 | 2000 | 17094.2773      | 0.2614 |
| 71890.376     | 65.83 | 2250 | 17973.5566      | 0.2604 |
| 71789.656     | 73.14 | 2500 | 19330.4316      | 0.2599 |
| 71579.512     | 80.46 | 2750 | 19927.2129      | 0.2599 |
| 71862.48      | 87.77 | 3000 | 21301.7754      | 0.2592 |
| 71131.112     | 95.09 | 3250 | 21652.1836      | 0.2592 |

Framework versions

  • Transformers 4.29.2
  • Pytorch 2.0.0+cu117
  • Datasets 2.8.0
  • Tokenizers 0.13.2