Hugging Face
Models: 125
Active filter: dataset tomekkorbak/detoxify-pile-chunk3-1200000-1250000
Sort: Trending
kejian/cpsc-log5-bin4-5repeat • Updated Mar 24, 2023 • 9
kejian/cpsc-log5-bin4-3repeat-v2 • Updated Mar 24, 2023 • 9
kejian/cpsc-wmle-0.85 • Updated Mar 25, 2023 • 9
kejian/cpsc-log5-bin4-5repeat-v2 • Updated Mar 29, 2023 • 9
kejian/cpsc-log5-bin4-3repeat-v3 • Updated Mar 29, 2023 • 9