Model trained on English language corpus. #1

by spranjal25 - opened

Hi, is there a model in this directory that has been trained on an English language corpus? Also, the names don't mention how many grams the models were built with. Any help is appreciated, thanks!

Hi @spranjal25! That's a great point - the models are all 5-gram. There are a few models trained on English: they are the ones whose names start with "en".
You can check the KenlmModel class in model.py for reference. For example, for the model trained on English Wikipedia, you can use:

from model import KenlmModel  # model.py from this repo must be importable

model = KenlmModel.from_pretrained("wikipedia", "en")
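
In case it helps to see what "5-gram" means in practice: a 5-gram model scores each word given up to the four words before it. Below is a minimal, library-free sketch of that windowing. The `ngrams` helper and the sample sentence are made up for illustration and are not part of model.py or KenLM.

```python
def ngrams(tokens, n=5):
    """Return every run of n consecutive tokens (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Illustrative sentence only; any whitespace-tokenized text works.
tokens = "the cat sat on the mat today".split()
five_grams = ngrams(tokens)

# Each 5-gram splits into a 4-word context and the word being predicted.
for gram in five_grams:
    context, target = gram[:-1], gram[-1]
    print(" ".join(context), "->", target)
```

Inputs shorter than five tokens simply yield no 5-grams in this sketch; a real n-gram model falls back to shorter contexts in that case.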