Hugging Face
Tasks
Fill-Mask
Question Answering
Summarization
Table Question Answering
Text Classification
Text Generation
Text2Text Generation
Token Classification
Translation
Zero-Shot Classification
Text-to-Speech
Automatic Speech Recognition
Audio Source Separation
Voice Activity Detection
Libraries
PyTorch
TensorFlow
Rust
Flair
Asteroid
TF SavedModel
ESPnet
TF Lite
Pyannote
ONNX
Timm
Datasets
wikipedia
squad
c4
bookcorpus
dcep europarl jrc-acquis
CLUECorpusSmall
oscar
squad_v2
cnn_dailymail
conll2003
jrc-acquis
PropBank.Br
xsum
librispeech_asr
OSIAN
1.5B Arabic Corpus
gigaword
parsinlu
ontonotes
OSCAR Arabic Unshuffled
natural_questions
multi_nli
Indo4B
wikisql
CoNLL-2012
lince
code_search_net
wmt19
OPUS
sep_clean
blended_skill_talk
wtq
OpenLegalData
imdb
tab_fact
msr_sqa
ai-soco
mc4
fever
common_crawl
mnli
xtreme
enh_single
sep_noisy
openwebtext
snli
flaubert
piaf
DAGW
biomedical literature from Scielo and Pubmed
quoref
docred
gap
winograd_wsc
winogender
glue
arabic_billion_words
open_subtitles
twitter
SAIL 2017
squad2
ag_news
trivia_qa
librispeech
Libri1Mix
Libri2Mix
Libri3Mix
web_questions
brWaC
BFD
wiki_dpr
emotion
FQuAD
SQuAD-FR
PubMed
id_newspapers_2018
array of dataset identifiers
reddit singapore, malaysia
hardwarezone
id_liputan6
wmt16
opus100
xsum_nl
ComVE
wham
Universal Dependencies
anli
go_emotions
xnli
STSbenchmark
MIMIC-III
Wikipedia
scientific_papers
dindebat.dk
hestenettet.dk
danish OpenSubtitles
wikipedia-turkish
mlqa
fquad
Indonesian Wikipedia
common_gen
MS MARCO document ranking
indosum
germeval_14
race
wmt14
MNLI
NQ
Trivia
SQuAD
MLQA
DRCD
muchocine
Twitter
IndianPolitics
scancode-rules
imagenet
trec
conll2000
dihard
jsut
ljspeech
break_data
sst-2
multi_nli_mismatch
coqa
Shuffled Dutch section of the OSCAR corpus (https://oscar-corpus.com/)
Marefa-NER
yahoo-answers
squad_v1
Uniref100
msmarco
eli5
AI4Bharat IndicNLP Corpora
220M words (IndoWiki, IndoWC, News)
nadi
SciDocs
100GB Chinese corpus
CommonCrawl
emo
pubmed
Arabic poetry from several eras
triviaqa
Wikihow
tydiqa
tweets_hate_speech_detection
quora
RuSentiment
discofuse
ai2_arc
openbookqa
mlsum
Arabic Wikipedia
mulit_nli
custom-book-corpus
Spotify Podcasts Dataset
quartz
legal entity recognition
Wikipedia (Hindi, Sanskrit, Gujarati)
The Pile
susumu2357/squad_v2_sv
https://github.com/staeiou/arxiv_archive/tree/v1.0.1
marefa-mt
arcd
custom danish dataset
RuTweetCorp
RuReviews
Squad
XQuad
Tydiqa
codexglue
CC-aligned
https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation
sms_spam
yelp_polarity
Icelandic portion of the OSCAR corpus from INRIA
sail
Oscar Corpus, News, Stories
pytorrent
socian
bangla-sentiment-benchmark
augmented_codesearchnet
arxiv_dataset
JW300
commonvoice
CC100
cc100
coco
Tesserae
Phi5
Thomas Aquinas
algebra_linear_1d
algebra_linear_1d_composed
measurement_time
numbers_gcd
kowiki
news
DaNE
legal
DAMP-VSEP
voxceleb
tatoeba
setimes
csmsc
sqa
libri1mix
bible_para
event2Mind
qasc
quarel
quotes-500K
bioASQ
TQUAD
TACDataset
urdu-text-news
Languages
en
es
fr
sv
fi
de
multilingual
zh
ru
ar
it
uk
id
eo
fa
nl
pt
tr
pl
da
bg
ja
he
hi
no
af
el
cs
ca
Chinese
ro
is
ms
hu
et
ko
lt
vi
ht
sl
tl
hr
bn
gl
mt
gu
mk
ig
ur
sg
lv
mr
ny
rw
sn
xh
ee
ts
ln
lg
yo
si
rn
eu
be
as
or
sm
ty
to
nso
fy
ha
lb
sq
yi
nb
fj
nn
niu
crs
bcl
guw
tn
gaa
co
wa
ceb
ga
st
te
mh
fo
ilo
pag
pon
efi
iso
pis
bzs
pap
gil
lua
cy
rm
oc
an
am
hy
sk
zu
ti
lus
kg
swc
tll
tvl
loz
th
gv
bi
hil
bem
lu
tw
lue
ase
war
gd
ml
so
os
ps
se
kqn
toi
srn
jv
ka
km
kn
mg
mn
ta
wo
br
kw
run
tiv
ho
tpi
wls
zne
az
ber
kwy
chk
mfe
rnd
tum
sc
ve
yap
c++
ne
mi
tk
tt
mos
sh
my
roa
protein
code
lo
sw
sa
cv
ba
na
english
yue
cel
sla
bo
bs
pa
sr
su
kl
io
ce
ab
ISO 639-1 code for your language, or `multilingual`
dv
English
fiu
afa
cpp
sem
iir
gem
umb
inc
zle
dutch
gmq
zlw
ine
zls
urj
gmw
itc
kk
jap
la
eng
trk
ch
gn
nv
mul
grk
cnr
hbs
py
dra
kj
bat
aav
luo
kwn
om
bnt
ng
ss
cus
nyk
cpf
lun
euq
alv
mkh
sal
phi
csg
pqe
fse
csn
nic
aed
mfs
prl
tzo
zai
Cszech
Deustch
French
Swedish
vn
haw
hmn
ku
ky
tg
ug
uz
italian
ks
sd
scientific english
pt-br
[en]
hi-en
scandinavia
Deustch English
zh-tw
kab
cau
poz
tdt
ssp
zul
art
ccs
map
pqw
sit
tut
yua
sami
taw
vsl
wal
ach
Cszech Deustch
Cszech English
Cszech Spanish
Cszech French
Cszech Italian
Cszech Swedish
Deustch Cszech
Deustch Spanish
Deustch French
Deustch Italian
Deustch Swedish
English Cszech
English Deustch
French Cszech
French Deustch
French English
French Spanish
French Italian
French Swedish
Italian Cszech
Italian Deustch
Italian English
Italian Spanish
Italian French
Italian Swedish
Swedish Cszech
Swedish Deustch
Swedish English
Swedish Spanish
Swedish French
Swedish Italian
Go
Java
javascript
php
python
en de nl es
nr
sot
ven
xho
esperanto
Licenses
apache-2.0
mit
gpl-3.0
cc-by-4.0
cc-by-sa-3.0
apache 2.0
cc by-nc-sa 4.0
cc-by-nc-4.0
public domain notice
cc-by-sa-4.0
any valid license identifier
cc-by 4.0
apache license 2.0
apache
gnu gplv3
Models: 4
Sort: Most Downloads
flaubert/flaubert_base_cased • Fill-Mask • Updated Dec 16, 2020 • 6,094
flaubert/flaubert_small_cased • Fill-Mask • Updated Dec 16, 2020 • 1,592
flaubert/flaubert_base_uncased • Fill-Mask • Updated Dec 16, 2020 • 1,022
flaubert/flaubert_large_cased • Fill-Mask • Updated Dec 16, 2020 • 594