mesolitica/wav2vec2-xls-r-300m-mixed · Step by step on how to use language model KenLM with the model

Very simple actually,

Install the necessary libraries. I chose pyctcdecode for the decoding,

pip3 install pyctcdecode==0.1.0 pypi-kenlm==0.1.20210121

The version is very important: if you bump pyctcdecode above 0.1.0, the steps below no longer work.
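If you want to confirm the pins before going further, here is a minimal sketch of a version check. The helper name `find_mismatches` and the `PINNED` dict are my own for illustration; the version strings are the ones from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip command above.
PINNED = {"pyctcdecode": "0.1.0", "pypi-kenlm": "0.1.20210121"}


def find_mismatches(pinned, lookup=version):
    """Return {package: (expected, found)} for each package whose installed
    version differs from the pin. `found` is None if it is not installed."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(PINNED)` in the environment you installed into should return an empty dict when both pins are satisfied.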

Download the language model,

wget https://huggingface.co/huseinzol05/language-model-bahasa-manglish-combined/resolve/main/model.klm
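If wget is not available, the same file can be fetched from Python. This is only a sketch using the standard library; the helper names `hf_resolve_url` and `download_if_missing` are made up here, and the URL pattern is simply the one visible in the wget command above.

```python
import os
import urllib.request


def hf_resolve_url(repo_id, filename, revision="main"):
    """Build a direct-download URL following the pattern of the wget link above."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download_if_missing(url, dest):
    """Download `url` to `dest` unless `dest` already exists; return `dest`."""
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest


LM_URL = hf_resolve_url(
    "huseinzol05/language-model-bahasa-manglish-combined", "model.klm"
)
```

Then `download_if_missing(LM_URL, "model.klm")` leaves the file next to your script, and it skips the download if the file is already there.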

Read https://github.com/huseinzol05/malaya-speech/blob/master/pretrained-model/prepare-lm/build-lm-mixed-combined.ipynb to learn how to create your own language model.

Load the model and language model,

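import kenlm
from pyctcdecode import build_ctcdecoder
from transformers import AutoModelForCTC

# Note: `unique_vocab`, `tokenizer` and `inputs` are not defined in this post
# and must be prepared beforehand:
#   unique_vocab - list of output labels, ordered by token id
#   tokenizer    - the tokenizer that belongs to the model
#   inputs       - the batch of audio input the model expects

kenlm_model = kenlm.Model('model.klm')
decoder = build_ctcdecoder(
    unique_vocab,
    kenlm_model,
    alpha=0.2,  # weight of the language model score
    beta=1.0,   # weight of the length score adjustment
    ctc_token_idx=tokenizer.pad_token_id  # index of the CTC blank token
)

model = AutoModelForCTC.from_pretrained(
    'mesolitica/wav2vec2-xls-r-300m-mixed',
)

# Run the acoustic model and move the logits to a NumPy array.
o_pt = model(inputs)
o_pt = o_pt.logits.detach().cpu().numpy()
# Beam-search decode the first item in the batch with the KenLM model.
out = decoder.decode_beams(o_pt[0], prune_history=True)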
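
To read the result, here is a small sketch of pulling the transcript out of `out`. It assumes each beam is a tuple laid out as (text, lm_state, word_frames, logit_score, lm_score) with the best beam first, which is how I understand the pyctcdecode 0.1.0 output; check this against your installed version. The helper `best_transcript` and the sample beams are made up for illustration and are not real model output.

```python
def best_transcript(beams):
    """Return (text, logit_score, lm_score) of the top-ranked beam."""
    # Assumed beam layout: (text, lm_state, word_frames, logit_score, lm_score)
    text, _lm_state, _word_frames, logit_score, lm_score = beams[0]
    return text, logit_score, lm_score


# Stand-in data shaped like the assumed decode_beams output (not real results).
fake_beams = [
    ("apa khabar", None, [], -4.2, -3.1),
    ("apa kabar", None, [], -4.9, -3.8),
]

text, logit_score, lm_score = best_transcript(fake_beams)
print(text)
```

With the real decoder you would call `best_transcript(out)` instead of passing the stand-in list.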