@macadeliccc on Hugging Face: "Create synthetic instruction datasets using open source LLMs and Bonito🐟!…"

macadeliccc

posted an update Mar 8, 2024

Create synthetic instruction datasets using open source LLMs and Bonito🐟!

With Bonito, you can generate synthetic datasets for a wide variety of supported tasks.

The Bonito model introduces a novel approach for conditional task generation, transforming unannotated text into task-specific training datasets to facilitate zero-shot adaptation of large language models on specialized data.

This methodology not only improves the adaptability of LLMs to new domains but also showcases the effectiveness of synthetic instruction tuning datasets in achieving substantial performance gains.
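To make the idea concrete, here is a rough sketch of conditional task generation in plain Python. It is not the Bonito library's API: the prompt layout, the `|||` separator, and every function name (`build_prompt`, `parse_task`, `generate_dataset`, `toy_generator`) are assumptions made up for illustration, and `toy_generator` is a stand-in for a real model call.

```python
from typing import Callable, Dict, List, Optional


def build_prompt(task_type: str, context: str) -> str:
    """Condition the generator on a task type and a passage of unannotated text.

    The layout below is a hypothetical format, not Bonito's actual prompt.
    """
    return f"Task type: {task_type}\nContext: {context}\nTask:"


def parse_task(generation: str, separator: str = "|||") -> Optional[Dict[str, str]]:
    """Split a raw generation into an instruction/response pair.

    Returns None when the separator is missing or either side is empty,
    so malformed generations can be dropped instead of polluting the dataset.
    """
    parts = generation.split(separator, 1)
    if len(parts) != 2:
        return None
    instruction, response = parts[0].strip(), parts[1].strip()
    if not instruction or not response:
        return None
    return {"instruction": instruction, "response": response}


def generate_dataset(
    passages: List[str],
    task_type: str,
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Turn unannotated passages into task-specific training rows."""
    rows = []
    for passage in passages:
        raw = generate(build_prompt(task_type, passage))
        parsed = parse_task(raw)
        if parsed is not None:
            rows.append({"context": passage, "task_type": task_type, **parsed})
    return rows


def toy_generator(prompt: str) -> str:
    """Stand-in for a real model call (assumption): echoes the context back."""
    context = prompt.split("Context: ", 1)[1].rsplit("\nTask:", 1)[0]
    return f"Question about: {context} ||| placeholder answer"


if __name__ == "__main__":
    passages = ["Bonito is a fish.", "Synthetic data can help adapt models."]
    for row in generate_dataset(passages, "question answering", toy_generator):
        print(row)
```

In practice, `toy_generator` would be swapped for a call to the actual task-generation model, and the resulting rows would be used to fine-tune the target LLM on the specialized domain.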

AutoBonito🐟: https://colab.research.google.com/drive/1l9zh_VX0X4ylbzpGckCjH5yEflFsLW04?usp=sharing
Original Repo: https://github.com/BatsResearch/bonito?tab=readme-ov-file
Paper: Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation (arXiv:2402.18334)

NePe

Mar 8, 2024

The colab link seems to be for hqq quantization.

Mar 8, 2024
