text-data-filtering / en_examples_with_stats.json

Commit History

7 languages supported
ea01f38

HugoLaurencon HF staff committed on

new kenlm models
b37a555

HugoLaurencon HF staff committed on

add register information
061d2e4

HugoLaurencon HF staff committed on

new filter on word repetition ratio
4809033

HugoLaurencon HF staff committed on

distributions for the filters on words and discarded words by filter
da13b29

HugoLaurencon HF staff committed on

rename badwords to flagged words + new flagged words list of 68 words
f217a73

HugoLaurencon HF staff committed on

filter on repetition removal
693f997

HugoLaurencon HF staff committed on

Delete en_examples_with_stats.json
0376199

HugoLaurencon HF staff committed on

faster visu (less documents)
07c617e

HugoLaurencon HF staff committed on

update TVN #2
1fed88b

teven committed on

Upload en_examples_with_stats.json with git-lfs
21fafec

HugoLaurencon HF staff committed on