Datasets based on UltraFeedback
This collection contains datasets built on top of UltraFeedback, using Argilla for dataset exploration and curation, sorted by release date.
argilla/ultrafeedback-binarized-preferences
Note: Curated dataset on top of `openbmb/UltraFeedback`, applying binarization to generate a dataset suitable for DPO fine-tuning. Inspired by HuggingFace H4's previous efforts, but applying a completely different approach to data binarization: based on the mean of the preference ratings instead of on the overall score of the critique. Additionally, extensive data exploration and curation with Argilla was applied to identify potential issues within the original dataset.
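The rating-based binarization described above can be sketched as follows. This is a minimal illustration on a toy record; the field names (`annotations`, `Rating`, `response`) mirror the spirit of the `openbmb/UltraFeedback` schema but are assumptions here, not its exact layout:

```python
# Sketch of rating-based binarization (field names are assumptions, not the
# exact openbmb/UltraFeedback schema): rank each prompt's completions by the
# mean of their per-aspect preference ratings, then take the best-rated one
# as "chosen" and the worst-rated one as "rejected".

def binarize(completions):
    """Return (chosen, rejected) completions by mean preference rating."""
    def mean_rating(completion):
        ratings = [aspect["Rating"] for aspect in completion["annotations"].values()]
        return sum(ratings) / len(ratings)

    ranked = sorted(completions, key=mean_rating, reverse=True)
    return ranked[0], ranked[-1]

completions = [
    {"response": "A", "annotations": {"helpfulness": {"Rating": 5},
                                      "truthfulness": {"Rating": 4}}},
    {"response": "B", "annotations": {"helpfulness": {"Rating": 2},
                                      "truthfulness": {"Rating": 3}}},
    {"response": "C", "annotations": {"helpfulness": {"Rating": 2},
                                      "truthfulness": {"Rating": 1}}},
]
chosen, rejected = binarize(completions)
# "A" has the highest mean rating (4.5), "C" the lowest (1.5)
```

The key difference from the earlier H4 binarization is only the ranking key: the mean of the per-aspect preference ratings rather than the critique's single overall score.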
argilla/ultrafeedback-binarized-preferences-cleaned
Note: Iteration on top of `argilla/ultrafeedback-binarized-preferences`, removing the TruthfulQA prompts that were introducing data contamination, as spotted by AllenAI, while keeping Argilla's approach to data binarization. Formatting: the dataset follows the same formatting as the one defined in the Alignment Handbook from HuggingFace H4.
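For reference, the Alignment Handbook DPO format mentioned here stores each example as a prompt plus full `chosen` and `rejected` chat transcripts. A toy record in that shape (key names as commonly used by H4-style datasets; treat them as an approximation rather than a guarantee):

```python
# Toy example in the Alignment Handbook style DPO format: "chosen" and
# "rejected" are chat transcripts ending with the preferred / dispreferred
# assistant turn. Contents are invented for illustration.
example = {
    "prompt": "What is the capital of France?",
    "chosen": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
    ],
    "rejected": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Lyon."},
    ],
}
```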
argilla/ultrafeedback-multi-binarized-preferences-cleaned
Note: Built on top of `openbmb/UltraFeedback` following the same approach as `argilla/ultrafeedback-binarized-preferences-cleaned`, but keeping all the rejected samples, so that we end up with ~3 times more examples to use during DPO fine-tuning. Formatting: the dataset follows the same formatting as the one defined in the Alignment Handbook from HuggingFace H4.
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned
Note: A simpler iteration on top of `argilla/ultrafeedback-multi-binarized-preferences-cleaned`, removing the low-quality samples, i.e. those whose chosen completion has a mean preference rating lower than 3.0. Formatting: the dataset follows the same formatting as the one defined in the Alignment Handbook from HuggingFace H4.
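The quality filter described above can be sketched with a simple predicate over toy rows. The `chosen-rating` column name is an assumption for illustration, not a confirmed schema:

```python
# Hypothetical quality filter: drop pairs whose chosen completion has a
# mean preference rating below 3.0 ("chosen-rating" is an assumed column
# name, not necessarily the dataset's actual one).

def is_high_quality(row, threshold=3.0):
    return row["chosen-rating"] >= threshold

rows = [
    {"prompt": "p1", "chosen-rating": 4.5},
    {"prompt": "p2", "chosen-rating": 2.0},
    {"prompt": "p3", "chosen-rating": 3.0},
]
kept = [row for row in rows if is_high_quality(row)]
# p2 is dropped; p3 survives because the threshold is inclusive
```

The same predicate would translate directly to a `datasets.Dataset.filter` call when working with the actual dataset.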
argilla/ultrafeedback-curated
Note: Another iteration on top of `openbmb/UltraFeedback`, aiming to solve the issue of critiques with an overall score of 10.0, caused by a bug in the UltraFeedback code where 1.0 ratings were computed as 10.0. Using `distilabel` with GPT-4, we regenerated the critique and ratings for those with a score of 10.0 and corrected the ones with a score of 1.0. Formatting: the dataset follows the same formatting as `openbmb/UltraFeedback`, with an additional column to identify the updated rows.
argilla/ultrafeedback-binarized-preferences-cleaned-kto