KenLM language model dataset

#1
by GaetanBaert - opened

Hello,
Which dataset did you use to train the KenLM model?

Also, what parameters did you use?

Hi,

Only the train split of the mozilla-foundation/common_voice_9_0 dataset was used, with no external text data.

No specific parameters, just a 5-gram model trained with: bin/lmplz -o 5 <text >text.arpa
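
For anyone wanting to reproduce this, here is a minimal Python sketch of the same two steps: writing the transcripts to a one-sentence-per-line text file, then piping it through lmplz with order 5. It is not the script actually used for this model. The helper names (write_corpus, lmplz_command, train_arpa) are hypothetical, the whitespace cleanup is an assumption (no text normalization is described in this thread), and the sentences are assumed to come from the "sentence" column of the Common Voice train split.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def write_corpus(sentences: Iterable[str], path: Path) -> int:
    """Write one non-empty sentence per line and return the number of lines written.

    Assumption: only whitespace is cleaned up here; any further normalization
    (casing, punctuation) is left to the caller.
    """
    count = 0
    with path.open("w", encoding="utf-8") as f:
        for sentence in sentences:
            line = " ".join(sentence.split())  # collapse runs of whitespace and newlines
            if line:
                f.write(line + "\n")
                count += 1
    return count


def lmplz_command(lmplz: str = "bin/lmplz", order: int = 5) -> List[str]:
    """Build the argv equivalent of `bin/lmplz -o 5` (hypothetical helper)."""
    return [lmplz, "-o", str(order)]


def train_arpa(corpus: Path, arpa: Path, lmplz: str = "bin/lmplz", order: int = 5) -> None:
    """Run lmplz with the corpus on stdin and the ARPA file on stdout,
    mirroring `bin/lmplz -o 5 <text >text.arpa`."""
    with corpus.open("rb") as fin, arpa.open("wb") as fout:
        subprocess.run(lmplz_command(lmplz, order), stdin=fin, stdout=fout, check=True)
```

A typical call would be write_corpus(sentences, Path("text")) followed by train_arpa(Path("text"), Path("text.arpa")), with the lmplz path adjusted to wherever the KenLM binaries were built.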

bofenghuang changed discussion status to closed