polyglot-tagger/multilabel-language-identification

Text Classification

Generated from Trainer

language-identification

text-embeddings-inference


DerivedFunction1 committed 23 days ago

Commit 163392a · verified · 1 Parent(s): 78e91e0

Update README.md

Files changed (1)

README.md +116 -2


@@ -4,17 +4,131 @@ license: mit
 base_model: xlm-roberta-base
 tags:
 - generated_from_trainer
 metrics:
 - precision
 - recall
 - f1
 - accuracy
 model-index:
-- name: xlmr-language-identification
   results: []
 ---
 # Polyglot Tagger: Multi-label Language Identification
 Refer to `polyglot-tagger/language-identification`. It is trained on the same dataset as a text-classifier rather than as a token classifier.

 base_model: xlm-roberta-base
 tags:
 - generated_from_trainer
+- language-identification
 metrics:
 - precision
 - recall
 - f1
 - accuracy
+language:
+- multilingual
+- af
+- am
+- ar
+- as
+- ba
+- be
+- bg
+- bn
+- bo
+- br
+- bs
+- ca
+- ce
+- ckb
+- cs
+- cy
+- da
+- de
+- dv
+- el
+- en
+- eo
+- es
+- et
+- eu
+- fa
+- fi
+- fr
+- ga
+- gd
+- gl
+- gu
+- he
+- hi
+- hr
+- hu
+- hy
+- id
+- is
+- it
+- ja
+- jv
+- ka
+- kk
+- km
+- kn
+- ko
+- ku
+- ky
+- la
+- lb
+- lo
+- lt
+- lv
+- mg
+- mk
+- ml
+- mn
+- mr
+- ms
+- mt
+- my
+- ne
+- nl
+- 'no'
+- ny
+- oc
+- om
+- or
+- pa
+- pl
+- ps
+- pt
+- rm
+- ro
+- ru
+- sd
+- si
+- sk
+- sl
+- so
+- sq
+- sr
+- su
+- sv
+- sw
+- ta
+- te
+- tg
+- th
+- ti
+- tl
+- tr
+- tt
+- ug
+- uk
+- ur
+- uz
+- vi
+- yo
+- yi
+- zh
+- zu
 model-index:
+- name: polyglot-tagger
   results: []
+datasets:
+- wikimedia/wikipedia
+- HuggingFaceFW/finetranslations
+- google/smol
+- polyglot-tagger/nlp-noise-snippets
+- polyglot-tagger/wikipedia-language-snippets-filtered
+- polyglot-tagger/finetranslations-filtered
+- polyglot-tagger/tatoeba-filtered
+pipeline_tag: text-classification
 ---
 # Polyglot Tagger: Multi-label Language Identification
 Refer to `polyglot-tagger/language-identification`. It is trained on the same dataset as a text-classifier rather than as a token classifier.
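
The README describes the model as a multi-label text classifier but does not show how its outputs are decoded. The sketch below illustrates one common convention for multi-label heads: score each label independently with a sigmoid and keep every label above a threshold, so a mixed-language input can yield more than one language. The function name `decode_multilabel`, the 0.5 threshold, the three-entry label map, and the logit values are all illustrative assumptions, not taken from the model's configuration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(logits, id2label, threshold=0.5):
    """Return (label, probability) pairs whose sigmoid score meets the threshold.

    Each label is scored independently, so zero, one, or several labels
    may be returned for the same input. Results are sorted by probability,
    highest first.
    """
    results = []
    for idx, logit in enumerate(logits):
        prob = sigmoid(logit)
        if prob >= threshold:
            results.append((id2label[idx], prob))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Hypothetical label map and logits, for illustration only.
id2label = {0: "en", 1: "fr", 2: "de"}
logits = [2.0, 0.5, -3.0]

detected = decode_multilabel(logits, id2label)
print([label for label, _ in detected])
```

In practice the logits would come from the classifier's forward pass and the label map from the model's own configuration. Whether this checkpoint expects a sigmoid and which threshold suits it should be checked against the model's config rather than assumed from this sketch.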