Commit History

Update requirements.txt
1fcbb8d

HugoLaurencon HF staff committed on

Delete explanation_filtering_pipeline.pdf
3ae24a3

HugoLaurencon HF staff committed on

Update app.py
b881ada

HugoLaurencon HF staff committed on

Upload explanation_filtering_pipeline.pdf
2be3583

HugoLaurencon HF staff committed on

Delete explanation_filtering_pipeline.pdf
3327a22

HugoLaurencon HF staff committed on

remove arabic and viet models
836c1b3

HugoLaurencon HF staff committed on

Add Portuguese
ba331e0

HugoLaurencon HF staff committed on

delete unused models
34eca0f

HugoLaurencon HF staff committed on

Merge branch 'main' of https://huggingface.co/spaces/huggingface/text-data-filtering
f6058aa

HugoLaurencon HF staff committed on

back to before portuguese
091dbe4

HugoLaurencon HF staff committed on

update visu for Portuguese
2b811ac

HugoLaurencon HF staff committed on

7 languages supported
ea01f38

HugoLaurencon HF staff committed on

new kenlm models
b37a555

HugoLaurencon HF staff committed on

add register information
061d2e4

HugoLaurencon HF staff committed on

new filter on word repetition ratio
4809033

HugoLaurencon HF staff committed on

visualization: small step for the slider on flagged words ratio
fa81556

HugoLaurencon HF staff committed on

visualization: choose between several languages
0610f9d

HugoLaurencon HF staff committed on

distributions for the filters on words and discarded words by filter
da13b29

HugoLaurencon HF staff committed on

visualization: upload our own stop words and flagged words list
5d56c36

HugoLaurencon HF staff committed on

everything in expanders
2c2527f

HugoLaurencon HF staff committed on

display distributions in sidebar and filtering parameters in expanders
5d485e5

HugoLaurencon HF staff committed on

rename badwords to flagged words + new flagged words list of 68 words
f217a73

HugoLaurencon HF staff committed on

button to download parameters
bfbcd60

HugoLaurencon HF staff committed on

add warning message
649ea6a

HugoLaurencon HF staff committed on

better visualization
8f0da78

HugoLaurencon HF staff committed on

fix division by 0 in compute_special_characters_ratio
b607b76

HugoLaurencon HF staff committed on

new tool to analyse our own doc
6f25c5c

HugoLaurencon HF staff committed on

fix requirements
d463071

HugoLaurencon HF staff committed on

correction of bug
22701ae

HugoLaurencon HF staff committed on

filter on repetition removal
693f997

HugoLaurencon HF staff committed on

Update app.py
189d6aa

HugoLaurencon HF staff committed on

Delete en_examples_with_stats_no_small_docs.json
58d483d

HugoLaurencon HF staff committed on

Delete en_examples_with_stats_ldnoob.json
b190ef8

HugoLaurencon HF staff committed on

Delete en_examples_with_stats.json
0376199

HugoLaurencon HF staff committed on

remove zipf's law and update of the doc
3fd19c1

HugoLaurencon HF staff committed on

visu with discarded documents by filter
14574d7

HugoLaurencon HF staff committed on

faster visu (less documents)
07c617e

HugoLaurencon HF staff committed on