Hugging Face
Models: 12 (sorted by Most Downloads)
setu4993/LaBSE • Updated Jan 12 • 1,433
Helsinki-NLP/opus-mt-en-mul • Translation • Updated Jan 18 • 312
Helsinki-NLP/opus-mt-mul-en • Translation • Updated Aug 21, 2020 • 227
Helsinki-NLP/opus-mt-sq-en • Translation • Updated Aug 21, 2020 • 61
Helsinki-NLP/opus-mt-ine-en • Translation • Updated Aug 21, 2020 • 53
Helsinki-NLP/opus-mt-en-ine • Translation • Updated Jan 18 • 51
Helsinki-NLP/opus-mt-en-sq • Translation • Updated Jan 18 • 43
Helsinki-NLP/opus-mt-fi-sq • Translation • Updated Jan 18
Helsinki-NLP/opus-mt-ine-ine • Translation • Updated Aug 21, 2020
Helsinki-NLP/opus-mt-sq-es • Translation • Updated Aug 21, 2020
Helsinki-NLP/opus-mt-sq-sv • Translation • Updated Aug 21, 2020
Helsinki-NLP/opus-mt-sv-sq • Translation • Updated Aug 21, 2020