Spaces: huggingface/text-data-filtering
6 contributors
History: 53 commits
Latest commit: HugoLaurencon (HF staff), "visualization: choose between several languages", 0610f9d, over 2 years ago
File                                Size       LFS  Updated           Last commit message
.gitattributes                      1.39 kB    -    over 2 years ago  chinese visu
.gitignore                          25 Bytes   -    over 2 years ago  new tool to analyse our own doc
LICENSE                             11.4 kB    -    over 2 years ago  Create LICENSE
README.md                           909 Bytes  -    over 2 years ago  initial commit
app.py                              35.3 kB    -    over 2 years ago  visualization: choose between several languages
en.arpa.bin                         4.4 GB     LFS  over 2 years ago  test
en.sp.model                         1.39 MB    LFS  over 2 years ago  test
en_examples_with_stats.json         241 MB     LFS  over 2 years ago  distributions for the filters on words and discarded words by filter
explanation_filtering_pipeline.pdf  216 kB     -    over 2 years ago  rename badwords to flagged words + new flagged words list of 68 words
filtering.py                        30.6 kB    -    over 2 years ago  rename badwords to flagged words + new flagged words list of 68 words
flagged_words.py                    51.8 kB    -    over 2 years ago  rename badwords to flagged words + new flagged words list of 68 words
languages_id.py                     5.61 kB    -    over 2 years ago  rename badwords to flagged words + new flagged words list of 68 words
lid.176.bin                         131 MB     LFS  over 2 years ago  test
normalization.py                    941 Bytes  -    over 2 years ago  test
parameters_filtering.py             31.7 kB    -    over 2 years ago  rename badwords to flagged words + new flagged words list of 68 words
requirements.txt                    76 Bytes   -    over 2 years ago  fix requirements
stopwords.py                        99.2 kB    -    over 2 years ago  test
zh.arpa.bin                         3.39 GB    LFS  over 2 years ago  visualization: choose between several languages
zh.sp.model                         1.37 MB    LFS  over 2 years ago  visualization: choose between several languages
zh_examples_with_stats.json         62.9 MB    LFS  over 2 years ago  visualization: choose between several languages