macadeliccc 
posted an update Feb 13
Fine-tune 7B models on free-tier Colab hardware using Unsloth 🦥

Unsloth is a framework for fine-tuning language models that claims a 0% loss in accuracy because it uses no approximation methods. It offers trainers for both supervised fine-tuning (SFT) and direct preference optimization (DPO) that can speed up fine-tuning by up to 5x.
This works by adding LoRA adapters, so only 1 to 10% of the total parameters need to be trained. You can export the LoRA adapter on its own or merge it into the base model in 16-bit to get a full fine-tuned model. The resulting model is ready for use in vLLM for faster inference.
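
To make the parameter savings concrete, here is a rough sketch of the LoRA arithmetic for a single linear layer. The 4096 x 4096 layer size and the rank of 64 are illustrative assumptions, not values from the notebook; the actual trainable fraction depends on the rank you pick and on which modules get adapters.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter.

    LoRA learns two small matrices: A with shape (rank, d_in)
    and B with shape (d_out, rank). The original weight stays frozen.
    """
    return rank * d_in + d_out * rank


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Adapter parameters relative to the frozen (d_out x d_in) weight matrix."""
    return lora_param_count(d_in, d_out, rank) / (d_in * d_out)


if __name__ == "__main__":
    # Hypothetical 4096 x 4096 projection with an assumed rank of 64.
    added = lora_param_count(4096, 4096, 64)
    frac = trainable_fraction(4096, 4096, 64)
    print(f"{added:,} adapter parameters ({frac:.1%} of the frozen layer)")
```

With these assumed values the adapter adds 524,288 parameters, roughly 3% of the layer it wraps. A lower rank shrinks that share and a higher rank grows it.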

Additionally, Hugging Face has integrated Unsloth into its documentation for DPO training and reported 18.6% performance gains on a T4.

This sets a new standard for fine-tuning large language models. If you would like to explore this methodology yourself, I have provided a notebook, "AutoSloth", where you can fine-tune using either SFT or DPO. It uploads the result to HF with a prefilled Unsloth README 🦥 and a Q8_0 quantization.
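
For readers who want to see what the export options look like in code, here is a minimal sketch. It is not the AutoSloth notebook's code. The method names `save_pretrained_merged` and `save_pretrained_gguf` and their `save_method` / `quantization_method` arguments follow what Unsloth documented around the time of this post and may have changed since; the `export_model` helper and the output directory layout are hypothetical.

```python
def export_model(model, tokenizer, out_dir: str = "outputs") -> None:
    """Save a fine-tuned Unsloth model in three forms (illustrative helper).

    `model` and `tokenizer` are assumed to come from Unsloth's
    FastLanguageModel after training has finished.
    """
    # 1. LoRA adapter only: small, needs the base model at load time.
    model.save_pretrained_merged(f"{out_dir}/lora", tokenizer, save_method="lora")

    # 2. Adapter merged into the base weights in 16-bit: a standalone
    #    model that inference servers such as vLLM can load.
    model.save_pretrained_merged(
        f"{out_dir}/merged-16bit", tokenizer, save_method="merged_16bit"
    )

    # 3. GGUF file with Q8_0 quantization.
    model.save_pretrained_gguf(
        f"{out_dir}/gguf", tokenizer, quantization_method="q8_0"
    )
```

Unsloth also documents `push_to_hub_*` counterparts of these methods for uploading; they need a Hugging Face token and are left out of this sketch.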

The SFT example is set up for free-tier usage, but the DPO example is set up for an A100. The DPO example can be altered to work on a T4, but I wanted to include more than one example.

Colab Stats during training:
+ Model: unsloth/mistral-7b-bnb-4bit
+ Dataset: yahma/alpaca-cleaned
+ Batch size: 2
+ Gradient accumulation steps: 4
+ System RAM: 8.5 / 51.0 GB
+ VRAM (T4): 13.6 / 15.0 GB
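
The settings above map roughly onto the setup below. This is a minimal sketch that assumes the "4" refers to gradient accumulation steps and uses Unsloth's `FastLanguageModel` API as documented around the time of this post. The sequence length, LoRA rank, alpha, and target module list are assumptions borrowed from common Unsloth examples, not values read from the notebook, and the signatures may have changed since.

```python
MODEL_NAME = "unsloth/mistral-7b-bnb-4bit"
PER_DEVICE_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4  # assumed meaning of the "4" in the stats
MAX_SEQ_LENGTH = 2048            # assumption, not from the post
LORA_RANK = 16                   # assumption, not from the post
TARGET_MODULES = [               # assumption: the usual attention + MLP projections
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer step."""
    return per_device * accumulation_steps * num_devices


def load_model_with_lora():
    """Load the 4-bit base model and attach LoRA adapters (needs a GPU)."""
    # Imported here so the constants above can be used without Unsloth installed.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=MODEL_NAME,
        max_seq_length=MAX_SEQ_LENGTH,
        dtype=None,          # let Unsloth pick a dtype for the GPU
        load_in_4bit=True,   # matches the pre-quantized bnb-4bit checkpoint
    )
    model = FastLanguageModel.get_peft_model(
        model,
        r=LORA_RANK,
        target_modules=TARGET_MODULES,
        lora_alpha=16,
        lora_dropout=0,
        bias="none",
        use_gradient_checkpointing=True,  # trades compute for lower VRAM use
    )
    return model, tokenizer


if __name__ == "__main__":
    print(effective_batch_size(PER_DEVICE_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
```

A batch size of 2 with 4 accumulation steps gives an effective batch of 8 examples per optimizer step on a single T4, while only 2 examples are held in VRAM at a time.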

Resources:
🦥Unsloth: https://github.com/unslothai/unsloth
🦥AutoSloth: https://colab.research.google.com/drive/1Zo0sVEb2lqdsUm9dy2PTzGySxdF9CNkc?usp=sharing
🤗HF-Unsloth-docs: https://huggingface.co/docs/trl/main/en/dpo_trainer#accelerate-dpo-fine-tuning-using-unsloth
🤗HF-Unsloth Blog Post: https://huggingface.co/blog/unsloth-trl

Is all-linear (from the most recent PEFT update) supported in the target_modules arg? Also, what about NEFTune?

To my knowledge, that is not currently supported. I have provided the training arguments that Unsloth documents, but I will do some more research and testing to provide a more comprehensive answer.

Here are the Unsloth docs that show the supported training arguments: https://github.com/unslothai/unsloth/tree/main?tab=readme-ov-file#-documentation

Great work!

There's no reason not to use Unsloth if it fits.
Amazing ❤️

Thank you! 🤗