Hugging Face
Tasks
Fill-Mask
Question Answering
Summarization
Table Question Answering
Text Classification
Text Generation
Text2Text Generation
Token Classification
Translation
Zero-Shot Classification
Conversational
Text-to-Speech
Automatic Speech Recognition
Audio Source Separation
Voice Activity Detection
Libraries
PyTorch
TensorFlow
Rust
Flair
Asteroid
TF SavedModel
TF Lite
ESPnet
Pyannote
Timm
ONNX
Datasets
common_voice
wikipedia
dcep europarl jrc-acquis
squad
bookcorpus
c4
CLUECorpusSmall
parsinlu
oscar
squad_v2
cnn_dailymail
imagenet
conll2003
librispeech_asr
xsum
PropBank.Br
jrc-acquis
OSIAN
1.5B Arabic Corpus
gigaword
natural_questions
imagenet-21k
ontonotes
multi_nli
OSCAR Arabic Unshuffled
brWaC
wikisql
twitter
mustc
openslr
CoNLL-2012
lince
Indo4B
arabic_billion_words
open_subtitles
covost2
snli
OPUS
code_search_net
wmt19
msr_sqa
librispeech
sep_clean
mnli
xnli
blended_skill_talk
OpenLegalData
wtq
tab_fact
enh_single
ai-soco
mc4
common_crawl
xtreme
fever
trivia_qa
samsum
imdb
race
Libri1Mix
sep_noisy
openwebtext
flaubert
cc100
emotion
piaf
DAGW
PubMed
quoref
docred
gap
winograd_wsc
winogender
glue
ms_marco
OpenSLR
ag_news
SAIL 2017
biomedical literature from Scielo and Pubmed
squad2
openbookqa
w11wo/imdb-javanese
Libri2Mix
Libri3Mix
web_questions
Farasa
wiki_dpr
cc_news
FQuAD
SQuAD-FR
id_liputan6
reddit singapore, malaysia
hardwarezone
interspeech_2021_asr
array of dataset identifiers
opus100
jsut
wmt16
ComVE
voxceleb
dihard
wham
Universal Dependencies
commonsenseqa
arc
qqp
the Pile
id_newspapers_2018
BFD
dindebat.dk
hestenettet.dk
danish OpenSubtitles
STSbenchmark
AI4Bharat IndicNLP Corpora
anli
mlqa
MIMIC-III
go_emotions
Wikipedia
tydiqa
pubmed
fquad
common_gen
arabic_speech_corpus
scientific_papers
NST Swedish ASR Database
NQ
Trivia
SQuAD
MLQA
DRCD
arcd
MS MARCO document ranking
Indonesian Wikipedia
wikipedia-turkish
germeval_14
wer
mlsum - es
MNLI
TQUAD
mulit_nli
wmt14
muchocine
NSC2018
timit_asr
JW300 + [Menyo-20k](https://huggingface.co/datasets/menyo20k_mt)
abhishek/autonlp-data-japanese-sentiment
EMBO/sd-panels
CSS10
https://arabicspeech.org/
sts
scancode-rules
trec
Twitter
IndianPolitics
conll2000
ljspeech
break_data
indosum
sst-2
Shuffled Dutch section of the OSCAR corpus (https://oscar-corpus.com/)
msmarco
SciDocs
yahoo-answers
Uniref100
Marefa-NER
squad_v1
nadi
220M words (IndoWiki, IndoWC, News)
parlament_parla
multi_nli_mismatch
coqa
CommonCrawl
eli5
triviaqa
mlsum
Jean-Baptiste/wikiner_fr
Wikihow
Arabic poetry from several eras
Yves/fhnw_swiss_parliament
discofuse
xquad
Interspeech 2021
webqa
dureader
xsum_nl
tweets_hate_speech_detection
UniRef50
susumu2357/squad_v2_sv
bioASQ
HARD-Arabic-Dataset
yelp_polarity
100GB Chinese corpus
quora
sail
emo
movies
cord19
vivos
wiki-mk
time-mk-news-2010-2015
sms_spam
legal entity recognition
arxiv_dataset
cifar10
quartz
Squad
XQuad
Tydiqa
iapp-wiki-qa-dataset
XQuAD
Arabic Poetry Dataset (6th - 21st century)
Spotify Podcasts Dataset
MLSUM
created a new dataset based on https://www.openslr.org/92/
RuSentiment
openslr_hindi
common_voice, infore_25h
ai2_arc
Indic TTS Malayalam Speech Corpus
Openslr Malayalam Speech Corpus
SMC Malayalam Speech Corpus
IIIT-H Indic Speech Databases
Wikipedia (Hindi, Sanskrit, Gujarati)
google_wellformed_query
Arabic Wikipedia
shemo
squad_v1_pt
custom-book-corpus
quotes-500K
masakhaner
mgb5
indic tts
iiith
LJSpeech
LibriTTS
common_voice mn
socian
bangla-sentiment-benchmark
bible_para
XSUM
Gigaword
ALFFA,Gamayun & IWSLT
google
codexglue
EMBO/sd-nlp
L3CubeMahaSent
marefa-mt
BembaSpeech
RuTweetCorp
Oscar Corpus, News, Stories
https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation
Finnish parliament session 2
EMBO/biolang
augmented_codesearchnet
pytorrent
JW300
CC100
kazakh_speech_corpus
custom danish dataset
coco
RuReviews
fon_dataset
https://github.com/staeiou/arxiv_archive/tree/v1.0.1
Tesserae
Phi5
Thomas Aquinas
algebra_linear_1d
algebra_linear_1d_composed
measurement_time
numbers_gcd
kowiki
news
OpenSLR 77
DaNE
legal
DAMP-VSEP
tatoeba
setimes
csmsc
SUC 3.0
sqa
malromur
libri1mix
event2Mind
qasc
quarel
CC-aligned
Icelandic portion of the OSCAR corpus from INRIA
TACDataset
ami
voxconverse
urdu-text-news
Voicebank
DEMAND
WHAM!
WHAMR!
WSJ0-2Mix
WSJ0-3Mix
Timers and Such
wikimovies
imagenet_21k
trtd56/autonlp-data-wrime_joy_only
NST Estonian ASR Database
Languages
en
es
fr
sv
de
fi
multilingual
zh
ar
ru
it
fa
id
tr
pt
nl
uk
ja
eo
pl
da
Chinese
bg
ro
he
el
hi
ca
et
lt
no
cs
af
vi
hu
is
sl
ms
ko
ht
hr
mr
tl
bn
mt
lv
gl
gu
mk
rw
sg
eu
ig
ur
lg
ny
or
sn
xh
yo
ee
ts
ln
as
si
mn
rn
be
ga
jv
sm
ta
th
ty
to
nso
fy
ha
lb
sq
te
yi
nn
nb
fj
bcl
crs
niu
guw
gaa
tn
co
wa
ceb
cy
ka
st
fo
br
mh
ilo
bzs
efi
iso
gil
pis
lua
pon
pap
pag
rm
oc
an
am
hy
sk
zu
ti
tw
kg
bem
lus
loz
swc
tvl
tll
ml
english
gv
bi
lu
ase
hil
war
lue
gd
km
kn
so
os
ps
se
kqn
srn
toi
mg
tt
wo
kw
tiv
ho
zne
wls
run
tpi
ne
az
cv
ber
sc
kwy
mfe
tum
chk
rnd
yap
ve
c++
mi
sw
tk
dv
Bengali
yue
mos
sh
roa
my
code
protein
ky
lo
su
na
ba
sa
pa-IN
fy-NL
kk
bo
bs
pa
sr
sv-SE
gem
sla
kl
io
ce
ab
ISO 639-1 code for your language, or `multilingual`
zls
luo
ine
fiu
gmq
zle
itc
gmw
afa
iir
umb
cel
cpp
inc
sem
urj
zlw
Cszech Deustch
Cszech English
Cszech Spanish
Cszech French
Cszech Italian
Cszech Swedish
Deustch English
Deustch Spanish
Deustch French
Deustch Italian
Deustch Swedish
English Cszech
English Deustch
English Italian
French Cszech
French English
French Spanish
French Italian
French Swedish
Italian Cszech
Italian Deustch
Italian English
Italian Spanish
Italian French
Italian Swedish
Swedish Cszech
Swedish Deustch
Swedish English
Swedish Spanish
Swedish French
Swedish Italian
sah
hsb
la
eng
ch
gn
nv
mul
jap
zh-tw
lzh
thai
trk
py
ga-IE
kj
bat
grk
phi
om
dra
ss
csg
euq
fse
ng
alv
kwn
csn
nyk
bnt
lun
aed
pqe
aav
cpf
cus
mkh
nic
sal
mfs
prl
tzo
zai
Cszech
Deustch
Swedish
English Spanish
English French
English Swedish
Spanish Cszech
Spanish Deustch
Spanish English
Spanish French
Spanish Italian
Spanish Swedish
Deustch Cszech
French Deustch
ia
rm-sursilv
cnr
hbs
haw
hmn
ku
tg
ug
uz
italian
vn
dutch
???
scientific english
ks
sd
hi-en
zh-HK
[en]
nah specifically ncj
amh
hau
ibo
kin
lug
pcm
swa
wol
yor
ary
scn
nap
Guj
scandinavia
python
ssp
arz
zh-CN
art
cau
ccs
map
poz
pqw
sit
tdt
tut
yua
sami
kab
taw
vsl
wal
ach
rm-vallader
fon
Go
Java
javascript
php
en de nl es
cnh
nr
sot
ven
xho
zul
esperanto
xal
Licenses
apache-2.0
mit
gpl-3.0
cc-by-4.0
cc-by-sa-3.0
cc by-nc-sa 4.0
apache 2.0
cc-by-nc-4.0
cc-by-sa-4.0
public domain notice
cc-by 4.0
any valid license identifier
apache license 2.0
apache-2
apache
attribution-sharealike 4.0 international
gnu gplv3
cc0
Models: 1 (sorted by most downloads)
adilism/wav2vec2-large-xlsr-kazakh
Automatic Speech Recognition • Updated 22 days ago