Awesome feedback datasets

HuggingFaceH4 's Collections

Scaling Test Time Compute with Open Models

Zephyr ORPO

Zephyr 7B

Zephyr 7B Gemma

StarChat2 15B

Journal Club

Papers We've Read

Awesome SFT datasets

Awesome feedback datasets

Awesome reward models

updated Apr 12

A curated list of datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO.

Upvote

Anthropic/hh-rlhf

Viewer • Updated May 26, 2023 • 169k • 11k • 1.23k

Note The OG of open preference data. Not super great quality
berkeley-nest/Nectar

Viewer • Updated Mar 20 • 183k • 393 • 280

Note NC
openbmb/UltraFeedback

Viewer • Updated Dec 29, 2023 • 64k • 1.34k • 341
Intel/orca_dpo_pairs

Viewer • Updated Nov 29, 2023 • 12.9k • 1.68k • 291

Note A simple idea to just generate GPT-4 responses and treat them as preferred response.
Hello-SimpleAI/HC3

Viewer • Updated Jan 21, 2023 • 48.6k • 1.17k • 184

Note Some parts are NC
lvwerra/stack-exchange-paired

Viewer • Updated Mar 13, 2023 • 31.3M • 2.28k • 142
argilla/ultrafeedback-binarized-preferences-cleaned

Viewer • Updated Dec 11, 2023 • 60.9k • 9.45k • 127

Note Computes the mean preference score instead of relying on the overall score from GPT-4. Also removes contamination from TruthfulQA prompts
nvidia/HelpSteer

Viewer • Updated 3 days ago • 37.1k • 1.86k • 226

Note The dataset behind NVIDIA's SteerLM alignment method
jondurbin/truthy-dpo-v0.1

Viewer • Updated Jan 11 • 1.02k • 623 • 131
lmsys/chatbot_arena_conversations

Viewer • Updated Sep 30, 2023 • 33k • 1.26k • 355
hbXNov/sparse_feedback

Updated Aug 31, 2023 • 50 • 5
Unified-Language-Model-Alignment/Anthropic_HH_Golden

Viewer • Updated Oct 4, 2023 • 44.8k • 439 • 31
neovalle/H4rmony

Viewer • Updated Apr 30 • 2.02k • 85 • 15
PKU-Alignment/PKU-SafeRLHF

Viewer • Updated Oct 18 • 164k • 3.96k • 120
peiyi9979/Math-Shepherd

Viewer • Updated Jan 3 • 445k • 664 • 78
m-a-p/Code-Feedback

Viewer • Updated Feb 26 • 66.4k • 195 • 198
introspector/unimath

Updated Feb 12 • 5.37k • 7
abacusai/MetaMath_DPO_FewShot

Viewer • Updated Feb 26 • 395k • 107 • 26
interstellarninja/tool-calls-dpo

Viewer • Updated Jan 23 • 235 • 69 • 7

Upvote