We present a CLSRIL-23 (Cross Lingual Speech Representations on Indic Languages), a self supervised learning based audio pre-trained model which learns cross lingual speech representations from raw audio across 23 Indic languages. It is built on top of wav2vec 2.0 which is solved by training a contrastive task over masked latent speech representations and jointly learns the quantization of latents shared across all languages.
Original Repo contains models in fairseq format.
|Language||Data (In Hrs)|
Experimentation platform built on top of fairseq.
- Downloads last month