Second fine-tuning attempt of wav2vec2-base. Results are similar to the ones reported in https://huggingface.co/facebook/wav2vec2-base-100h.
The model was trained on librispeech-clean-train.100 with the following hyper-parameters:
- 2 Titan RTX GPUs
- Total update steps: 11,000
- Batch size per GPU: 32, corresponding to a total batch size of ~750 seconds of audio
- Adam optimizer with a linearly decaying learning rate and 3,000 warmup steps
- Dynamic padding within each batch (see the sketch after this list)
- fp16 training
- attention_mask was not used during training
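For reference, below is a minimal sketch of a comparable setup using the transformers Trainer API; it is not the exact script behind this run. The hyper-parameter values mirror the list above, while the checkpoint ids, `output_dir`, and the data collator (the common dynamic-padding pattern for wav2vec2 CTC fine-tuning) are assumptions, and `train_dataset` is assumed to be a preprocessed librispeech-clean-train.100 split.

```python
# A minimal sketch of a comparable setup, *not* the exact script behind this
# run. Hyper-parameter values mirror the list above; all other choices
# (checkpoint ids, output_dir, collator) are assumptions.
from dataclasses import dataclass
from typing import Dict, List, Union

import torch
from transformers import (
    Trainer,
    TrainingArguments,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

# Tokenizer/vocab borrowed from the reference checkpoint for simplicity;
# the acoustic model starts from the self-supervised base checkpoint.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-100h")
model = Wav2Vec2ForCTC.from_pretrained(
    "facebook/wav2vec2-base", ctc_loss_reduction="mean"
)


@dataclass
class DataCollatorCTCWithPadding:
    """Pads every batch to its longest sample ("dynamic padding" above)."""

    processor: Wav2Vec2Processor

    def __call__(
        self, features: List[Dict[str, Union[List[int], torch.Tensor]]]
    ) -> Dict[str, torch.Tensor]:
        input_features = [{"input_values": f["input_values"]} for f in features]
        label_features = [{"input_ids": f["labels"]} for f in features]

        batch = self.processor.pad(
            input_features, padding=True, return_tensors="pt"
        )
        labels_batch = self.processor.pad(
            labels=label_features, padding=True, return_tensors="pt"
        )
        # Replace label padding with -100 so it is ignored by the CTC loss.
        batch["labels"] = labels_batch["input_ids"].masked_fill(
            labels_batch.attention_mask.ne(1), -100
        )
        return batch


training_args = TrainingArguments(
    output_dir="./wav2vec2-base-100h",  # assumed
    per_device_train_batch_size=32,     # 32 per GPU, 2 GPUs
    max_steps=11_000,                   # total update steps
    warmup_steps=3_000,                 # linear warmup
    lr_scheduler_type="linear",         # linearly decaying learning rate
    fp16=True,
)

# train_dataset: a preprocessed librispeech-clean-train.100 split with
# "input_values" and "labels" columns (preprocessing not shown).
# trainer = Trainer(
#     model=model,
#     args=training_args,
#     data_collator=DataCollatorCTCWithPadding(processor=processor),
#     train_dataset=train_dataset,
# )
# trainer.train()
```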
Training logs: https://wandb.ai/patrickvonplaten/huggingface/runs/1yrpescx?workspace=user-patrickvonplaten
Result (WER) on LibriSpeech:

| "clean" (% rel. difference to paper) | "other" (% rel. difference to paper) |
|---|---|
| 6.2 (-1.6%) | 15.2 (-11.2%) |
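A WER evaluation along these lines can be sketched with the datasets and evaluate libraries. The checkpoint id below is the reference model linked at the top, not this run's checkpoint, and the small `.select(...)` sample is only there to keep the sketch cheap; both are assumptions.

```python
# Sketch of a greedy-decoding WER evaluation; the checkpoint id and the small
# sample size are assumptions, not part of the original evaluation setup.
import torch
from datasets import load_dataset
from evaluate import load
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "facebook/wav2vec2-base-100h"  # swap in this repo's id
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
wer = load("wer")

# Config "clean"/"other" with split "test" corresponds to the two columns
# above; a small sample keeps this sketch cheap to run.
ds = load_dataset("librispeech_asr", "clean", split="test").select(range(16))


def transcribe(batch):
    inputs = processor(
        batch["audio"]["array"], sampling_rate=16_000, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    batch["prediction"] = processor.batch_decode(
        torch.argmax(logits, dim=-1)
    )[0]
    return batch


ds = ds.map(transcribe)
print("WER:", wer.compute(predictions=ds["prediction"], references=ds["text"]))
```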