davanstrien posted an update Mar 15
KTO offers an easier way to preference train LLMs (only 👍👎 ratings are required). As part of #DataIsBetterTogether, I've written a tutorial on creating a preference dataset using Argilla and Spaces.
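For context, TRL's KTO format uses unpaired examples: each row is just a prompt, a completion, and a boolean label (👍 = True, 👎 = False), with no chosen/rejected pairs needed. A minimal sketch of what the records can look like (the haiku prompt is only an illustration):

```python
from datasets import Dataset

# KTO needs only unpaired, binary-labelled examples:
# label=True for a 👍 rating, label=False for a 👎 rating.
records = [
    {"prompt": "Write a haiku about autumn.",
     "completion": "Crisp leaves drift down\na cold wind carries the geese\npast empty fields",
     "label": True},   # annotator liked this response
    {"prompt": "Write a haiku about autumn.",
     "completion": "Autumn is a season when leaves fall from the trees.",
     "label": False},  # annotator disliked this response
]
kto_dataset = Dataset.from_list(records)
```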

Using this approach, you can create a dataset that anyone with a Hugging Face account can contribute to 🤯

See an example of the kind of Space you can create following this tutorial here: davanstrien/haiku-preferences

🆕 New tutorial covers:
💬 Generating responses with open models
👥 Collecting human feedback (do you like this model response? Yes/No)
🤖 Preparing a TRL-compatible dataset for training aligned models (sketch below)
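Here's a rough sketch of the feedback-collection and conversion steps, assuming Argilla's 1.x FeedbackDataset API (the dataset and workspace names are placeholders; the tutorial's actual code may differ):

```python
import argilla as rg

# Define the annotation task: show the prompt and the model's
# response, and ask a single Yes/No question about it.
feedback = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="completion"),
    ],
    questions=[
        rg.LabelQuestion(
            name="liked",
            title="Do you like this model response?",
            labels=["Yes", "No"],
        )
    ],
)
feedback.push_to_argilla(name="haiku-preferences", workspace="admin")

# Later, once annotators have voted, pull the responses back down
# and map Yes/No answers onto KTO's boolean labels.
remote = rg.FeedbackDataset.from_argilla("haiku-preferences", workspace="admin")
kto_rows = [
    {
        "prompt": record.fields["prompt"],
        "completion": record.fields["completion"],
        "label": record.responses[0].values["liked"].value == "Yes",
    }
    for record in remote.records
    if record.responses  # keep only annotated records
]
```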

Check it out here: https://github.com/huggingface/data-is-better-together/tree/main/kto-preference
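Once the labelled dataset exists, the training step itself can be sketched with TRL's KTOTrainer (model choice and hyperparameters here are illustrative, not the tutorial's; check the TRL docs for the current signature):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

model_name = "HuggingFaceH4/zephyr-7b-beta"  # placeholder model choice
model = AutoModelForCausalLM.from_pretrained(model_name)
ref_model = AutoModelForCausalLM.from_pretrained(model_name)  # frozen reference
tokenizer = AutoTokenizer.from_pretrained(model_name)

args = KTOConfig(
    output_dir="kto-haiku",
    beta=0.1,              # strength of the KL penalty
    desirable_weight=1.0,  # weight on 👍 examples
    undesirable_weight=1.0,  # weight on 👎 examples
    per_device_train_batch_size=4,
)

trainer = KTOTrainer(
    model=model,
    ref_model=ref_model,
    args=args,
    train_dataset=kto_dataset,  # the prompt/completion/label dataset above
    tokenizer=tokenizer,
)
trainer.train()
```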

I see:

"The current notebooks and code only show how to generate the synthetic data and create a preference dataset annotation Space. The next steps would be to collect human feedback on the synthetic data and then use this to train a model. We will cover this in a future notebook."

Is there a future notebook with this content already?


Hopefully I'll have something to share for this soon! I still need to do some more annotating!