Commit History

Try the trie binary version of a KenLM model first; if not found, fall back to the hash table binary version
18e3201

edugp committed on

Sync with data tooling repo, using edugp/kenlm models, updating viz to use quantiles for coloring and ad-hoc viz for the registry dataset
3c30fa3

edugp committed on

Run tokenizer before computing perplexity and format
7b62017

edugp committed on

Replicate the default cc_net preprocessing at inference time in KenlmModel.get_perplexity
0def03f

edugp committed on

Check if file exists before attempting to remove
ab7449f

edugp committed on

Also remove the '{language}.sp.model' file on failure
38b6530

edugp committed on

Remove corrupt KenLM model files
9ec7b19

edugp committed on

Support visualizing both sentences and whole documents. Smooth down color assignment in visualization.
a86046b

edugp committed on