# LLM Fine-Tuning with QLoRA

This repository helps you instruct-tune Open LLaMA, RedPajama, or StableLM models on consumer hardware using QLoRA (original implementation [here](https://github.com/artidoro/qlora)). It is mostly based on the original alpaca-lora repo, which can be found [here](https://github.com/tloen/alpaca-lora).

Please note that this has only been tested on Open LLaMA 3B and RedPajama 3B models, but it should work with other models. Contributions are welcome!

### Local Setup

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. If bitsandbytes doesn't work, [install it from source](https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md). Windows users can follow [these instructions](https://github.com/tloen/alpaca-lora/issues/17).

## Training (finetune.py)

This file contains a straightforward application of QLoRA PEFT to the Open LLaMA / RedPajama / StableLM model, as well as some code related to prompt construction and tokenization. PRs adapting this code to support larger models are always welcome. For reference, a rough sketch of this setup appears after the example commands below.

**Example usage:**

For Open LLaMA:

```bash
python finetune.py \
    --base_model 'openlm-research/open_llama_3b_600bt_preview' \
    --data_path '../datasets/dolly.json' \
    --num_epochs=3 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-3b' \
    --lora_r=16 \
    --lora_target_modules='[q_proj,v_proj]'
```

For RedPajama:

```bash
python finetune.py \
    --base_model='togethercomputer/RedPajama-INCITE-Base-3B-v1' \
    --data_path='../datasets/dolly.json' \
    --num_epochs=3 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-rp-3b-t1' \
    --lora_r=16 \
    --lora_target_modules='["query_key_value"]'
```

For StableLM:

```bash
python finetune.py \
    --base_model='stabilityai/stablelm-base-alpha-3b' \
    --data_path='../datasets/dolly.json' \
    --num_epochs=3 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-st-3b-t1' \
    --lora_r=16 \
    --lora_target_modules='["query_key_value"]'
```

For Pythia:

```bash
python finetune.py \
    --base_model='EleutherAI/pythia-6.9b-deduped' \
    --data_path='../datasets/dolly.json' \
    --num_epochs=1 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-pyt-6b-t1' \
    --lora_r=8 \
    --lora_target_modules='["query_key_value"]'
```

We can also tweak our hyperparameters (similar to alpaca-lora):

```bash
python finetune.py \
    --base_model 'openlm-research/open_llama_3b_600bt_preview' \
    --data_path 'yahma/alpaca-cleaned' \
    --output_dir './lora-alpaca' \
    --batch_size 128 \
    --micro_batch_size 4 \
    --num_epochs 3 \
    --learning_rate 1e-4 \
    --cutoff_len 512 \
    --val_set_size 2000 \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[q_proj,v_proj]' \
    --train_on_inputs \
    --group_by_length
```
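For readers who want to see what this looks like in code, here is a minimal sketch of the QLoRA setup, assuming recent `transformers`, `peft`, and `bitsandbytes` releases. It is an illustration, not the exact contents of `finetune.py`; the model name and `r`/`target_modules` values mirror the Open LLaMA example above, and `lora_alpha`/`lora_dropout` mirror the hyperparameter-tweak example.

```python
# Minimal QLoRA setup sketch (illustrative; not the exact code in finetune.py).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "openlm-research/open_llama_3b_600bt_preview"

# Load the frozen base model in 4-bit NF4, as described in the QLoRA paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Cast norms/embeddings to fp32 and prepare the quantized model for training.
model = prepare_model_for_kbit_training(model)

# Attach trainable LoRA adapters to the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # ["query_key_value"] for GPT-NeoX-style models
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights require grad
```

From here, training proceeds as usual on the tokenized prompts, and saving the model writes only the small adapter weights, which is what `generate.py` later loads via `--lora_weights`.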
## Inference (generate.py)

This file reads the foundation model from the Hugging Face model hub and the LoRA weights from the trained PEFT model, then runs a Gradio interface for inference on a specified input. Users should treat this as example code for the use of the model and modify it as needed. A rough sketch of doing the same thing without the UI appears at the end of this README.

**Example usage:**

For Open LLaMA:

```bash
python generate.py \
    --base_model 'openlm-research/open_llama_3b_600bt_preview' \
    --lora_weights './lora-alpaca'
```

For RedPajama:

```bash
python generate.py \
    --base_model 'togethercomputer/RedPajama-INCITE-Base-3B-v1' \
    --lora_weights './dolly-lora-rp-3b-t1/'
```

For StableLM:

```bash
python generate.py \
    --base_model 'stabilityai/stablelm-base-alpha-3b' \
    --lora_weights './dolly-lora-st-3b-t1'
```

For Pythia:

```bash
python generate.py \
    --base_model 'EleutherAI/pythia-6.9b-deduped' \
    --lora_weights './dolly-lora-pyt-6b-t1'
```

# Acknowledgements

We would like to express our heartfelt gratitude to **Meta** for releasing LLaMA. Without this pioneering technology, the foundations of projects like **Open LLaMA** and **Alpaca** wouldn't exist. We sincerely appreciate the immense contributions you've made to the field.

Our acknowledgements also extend to the teams behind **Open LLaMA**, **Together Computer**, **Alpaca**, and **Alpaca LoRA**. You can find more about their excellent work in their respective GitHub repositories:

- [Open LLaMA](https://github.com/openlm-research/open_llama)
- [Together Computer](https://github.com/togethercomputer)
- [Alpaca](https://github.com/tatsu-lab/stanford_alpaca)
- [Alpaca LoRA](https://github.com/tloen/alpaca-lora)

Lastly, we would like to express our thanks to the developers of **QLoRA** and **bitsandbytes**. Your efforts have been instrumental in advancing the field, and we're grateful for your contributions. More information about these projects can be found at:

- [QLoRA](https://github.com/artidoro/qlora)
- [bitsandbytes](https://github.com/TimDettmers/bitsandbytes)

Thank you all for your commitment to innovation and for making these projects possible.
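As referenced in the Inference section above, here is a minimal sketch of loading a trained adapter programmatically, without the Gradio UI. It assumes recent `transformers` and `peft` releases and reuses the Open LLaMA paths from the examples; it is an illustration, not the exact contents of `generate.py`, and the prompt template shown is a hypothetical Alpaca-style example that should be replaced with whatever template was used during training.

```python
# Minimal programmatic inference sketch (illustrative; not the exact generate.py code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "openlm-research/open_llama_3b_600bt_preview"
lora_weights = "./lora-alpaca"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the trained LoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(model, lora_weights)
model.eval()

# Hypothetical Alpaca-style prompt; match the template used at training time.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nName three primary colors.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs, max_new_tokens=128, do_sample=True, temperature=0.7
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```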