language: xal
tags:
  - speech
  - audio
  - automatic-speech-recognition
license: apache-2.0

Info

This Wav2Vec2 model was first pretrained on 500 hours of Kalmyk TV recordings and 1000 hours of a Mongolian speech recognition dataset. It was then fine-tuned on a 300-hour synthetic Kalmyk STT dataset created with a voice conversion model.
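
Fine-tuned Wav2Vec2 speech-to-text checkpoints are commonly trained with a CTC head, although this card does not state that explicitly. Assuming that is the case here, the following minimal sketch shows how per-frame model outputs could be turned into text with greedy CTC decoding. The vocabulary, the blank index, and the function name `greedy_ctc_decode` are hypothetical and chosen only for illustration; the real model ships its own vocabulary.

```python
from itertools import groupby

# Hypothetical toy vocabulary; the real model's vocabulary is not given in this card.
VOCAB = ["<pad>", "|", "а", "б", "в"]
BLANK_ID = 0          # assumed: "<pad>" doubles as the CTC blank token
WORD_DELIMITER = "|"  # assumed: "|" marks word boundaries


def greedy_ctc_decode(frame_scores, vocab=VOCAB, blank_id=BLANK_ID):
    """Decode a sequence of per-frame score vectors into a string.

    frame_scores: list of lists, one score per vocabulary entry per frame.
    """
    # 1. Pick the highest-scoring token for every frame.
    best_ids = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # 2. Collapse consecutive repeats of the same token.
    collapsed = [token_id for token_id, _ in groupby(best_ids)]
    # 3. Drop blanks, then map the remaining ids to characters.
    chars = [vocab[i] for i in collapsed if i != blank_id]
    # 4. Turn word delimiters into spaces.
    return "".join(chars).replace(WORD_DELIMITER, " ").strip()
```

In practice the `frame_scores` would come from the model's output logits for a 16 kHz mono recording, and a library tokenizer would normally perform this decoding step; the sketch only makes the logic visible.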