---
tags:
- generated_from_keras_callback
model-index:
- name: wav2vec2-xls-r-300m-mixed
  results: []
---

# wav2vec2-xls-r-300m-mixed

Fine-tuned from https://huggingface.co/facebook/wav2vec2-xls-r-300m on https://github.com/huseinzol05/malaya-speech/tree/master/data/mixed-stt

This model was fine-tuned on 3 languages:

1. Malay
2. Singlish
3. Mandarin

**This model was trained on a single RTX 3090 Ti with 24GB VRAM, provided by https://mesolitica.com/**.

## Evaluation set

The evaluation set comes from https://github.com/huseinzol05/malaya-speech/tree/master/pretrained-model/prepare-stt, with the following sizes:

```
len(malay), len(singlish), len(mandarin)
-> (765, 3579, 614)
```

The model achieves the following results on the evaluation set, based on [evaluate-wav2vec2-xls-r-300m-mixed.ipynb](evaluate-wav2vec2-xls-r-300m-mixed.ipynb):

Mixed evaluation,

```
CER: 0.04363189219453221
WER: 0.12446419219809059
CER with LM: 0.03621180629932558
WER with LM: 0.09152993800218129
```

Malay evaluation,

```
CER: 0.053659683623049854
WER: 0.22565751242221832
CER with LM: 0.036930421149001316
WER with LM: 0.14256712242006359
```

Singlish evaluation,

```
CER: 0.04174804195104746
WER: 0.10734402150682842
CER with LM: 0.03538238462620066
WER with LM: 0.08103191123663189
```

Mandarin evaluation,

```
CER: 0.04211892733885779
WER: 0.09817787449869257
CER with LM: 0.040151154521006656
WER with LM: 0.08913415903511501
```

The language model is from https://huggingface.co/huseinzol05/language-model-bahasa-manglish-combined
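
For readers unfamiliar with the metrics reported above, CER and WER are commonly defined as the Levenshtein edit distance between reference and hypothesis, divided by the reference length, counted in characters or whitespace-separated words respectively. The following is a minimal sketch of that definition, not the code used in the evaluation notebook. The function names `edit_distance` and `error_rate`, the corpus-level aggregation, and the toy sentences are assumptions made for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of insertions, deletions and substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate: total edits divided by total reference length.

    unit="word" splits on whitespace (WER); unit="char" uses characters (CER).
    This aggregation is an assumption; the notebook may average differently.
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        total_edits += edit_distance(ref_units, hyp_units)
        total_len += len(ref_units)
    return total_edits / total_len


# Toy example (illustrative strings, not taken from the evaluation set).
refs = ["saya suka makan nasi"]
hyps = ["saya suka makan nasi lemak"]
wer = error_rate(refs, hyps, unit="word")
cer = error_rate(refs, hyps, unit="char")
print(f"WER: {wer}, CER: {cer}")
```

In this toy case the hypothesis has one extra word, so the word-level rate is one edit over four reference words, while the character-level rate counts the six inserted characters against the twenty reference characters.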