KenLM language model

#1
by GaetanBaert - opened

Hello,
Which dataset did you use to train the KenLM model?
Also, what parameters did you use?

Hello @GaetanBaert! To build the LM, I used the data from Common Voice 8.0, MediaSpeech, Multilingual TEDx, Multilingual LibriSpeech, and VoxPopuli. I converted all the text to lowercase and removed the punctuation.
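For anyone who wants to reproduce the text cleanup, here is a minimal sketch of the lowercasing and punctuation removal described above. It is not the exact script used for this model: the function name `normalize_line` is hypothetical, and keeping apostrophes inside words is an assumption rather than something stated in this thread.

```python
import unicodedata


def normalize_line(text: str, keep: str = "'") -> str:
    """Lowercase a transcript line and strip punctuation.

    `keep` lists punctuation characters to preserve. Keeping the
    apostrophe is an assumption, not a detail confirmed in the thread.
    """
    text = text.lower()
    cleaned = []
    for ch in text:
        # Unicode general categories starting with "P" are punctuation.
        if unicodedata.category(ch).startswith("P") and ch not in keep:
            cleaned.append(" ")  # replace with a space so words do not merge
        else:
            cleaned.append(ch)
    # Collapse runs of whitespace left behind by removed punctuation.
    return " ".join("".join(cleaned).split())


if __name__ == "__main__":
    print(normalize_line("Hello, World!"))
```

Writing one normalized sentence per line gives a plain-text corpus of the kind n-gram LM tools typically take as input. The n-gram order and other training parameters are not given in this thread, so they are left out here.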
