Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
macadeliccc 
posted an update Mar 8
Post
Create synthetic instruction datasets using open source LLM's and bonito🐟!

With Bonito, you can generate synthetic datasets for a wide variety of supported tasks.

The Bonito model introduces a novel approach for conditional task generation, transforming unannotated text into task-specific training datasets to facilitate zero-shot adaptation of large language models on specialized data.

This methodology not only improves the adaptability of LLMs to new domains but also showcases the effectiveness of synthetic instruction tuning datasets in achieving substantial performance gains.

AutoBonito🐟: https://colab.research.google.com/drive/1l9zh_VX0X4ylbzpGckCjH5yEflFsLW04?usp=sharing
Original Repo: https://github.com/BatsResearch/bonito?tab=readme-ov-file
Paper: Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation (2402.18334)

The colab link seems to be for hqq quantization.

·

thanks for letting me know. I have updated the post with the correct link https://colab.research.google.com/drive/1l9zh_VX0X4ylbzpGckCjH5yEflFsLW04?usp=sharing

In this post