---
language: en
datasets:
- librispeech_asr
tags:
- speech
- audio
- automatic-speech-recognition
license: apache-2.0
---

This model is a distilled version of the wav2vec2 model (https://arxiv.org/pdf/2006.11477.pdf). It is 4 times smaller and 3 times faster than the original wav2vec2 large model.

When used with a lightweight tri-gram language model head, this model achieves the following results:

| Dataset | WER |
| ------------- |:-------------:|
| Librispeech-clean | 12.7% |
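
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate (WER) is usually computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the standard definition. The card does not state which text normalization or scoring tool produced the figure above, and `word_error_rate` is a hypothetical helper written for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words: 1/6, about 16.7%.
    print(f"{word_error_rate('the cat sat on the mat', 'the cat sat on mat'):.1%}")
```

On a full test set, the edit counts are typically summed over all utterances before dividing by the total number of reference words, rather than averaging per-utterance rates.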

A notebook (Google Colab) is available at https://github.com/OthmaneJ/distil-wav2vec2
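
As a rough sketch of local inference with the `transformers` library, the function below does greedy CTC decoding without a language model, so its output will not necessarily match the tri-gram figure reported in this card. Several things here are assumptions rather than facts from the card: the Hub identifier `OthmaneJ/distil-wav2vec2` (inferred from the repository name), that the checkpoint ships processor files and a CTC head, and that the input is 16 kHz mono audio. `transcribe` is a hypothetical helper name.

```python
def transcribe(waveform, sampling_rate=16_000, model_id="OthmaneJ/distil-wav2vec2"):
    """Greedy CTC transcription of a 1-D float waveform.

    model_id is an assumed Hub identifier, not confirmed by this card.
    """
    # Heavy dependencies are imported lazily so defining the function stays cheap.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Convert the raw samples into a batched tensor of model inputs.
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Pick the most likely token per frame; batch_decode collapses CTC repeats and blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Here `waveform` would be a 1-D NumPy array of audio samples, for example loaded with `soundfile` or `librosa` and resampled to 16 kHz if needed.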