Wav2Vec2-Base-Pretrain-Vietnamese
The base model is pre-trained on 16kHz sampled speech audio from 100h Vietnamese unlabelled data in VLSP dataset. When using the model make sure that your speech input is also sampled at 16Khz. Note that this model should be fine-tuned on a downstream task, like Vietnamese Automatic Speech Recognition. Facebook's Wav2Vec2 blog Paper
Usage
See this notebook for more information on how to fine-tune the English pre-trained model.
- Downloads last month
- 9
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.