# LLM Fine-Tuning with QLoRA

This repository helps you instruct-tune Open LLaMA, RedPajama, or StableLM models on consumer hardware using QLoRA (original implementation [here](https://github.com/artidoro/qlora)). It is mostly based on the original alpaca-lora repo, which can be found [here](https://github.com/tloen/alpaca-lora).

Please note that this has only been tested on Open LLaMA 3B and RedPajama 3B models, but it should work with other models. Contributions are welcome!

### Local Setup

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. If bitsandbytes doesn't work, [install it from source](https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md). Windows users can follow [these instructions](https://github.com/tloen/alpaca-lora/issues/17).

## Training (finetune.py)

This file contains a straightforward application of QLoRA PEFT to the Open LLaMA / RedPajama / StableLM model, as well as some code related to prompt construction and tokenization. PRs adapting this code to support larger models are always welcome. For reference, a rough sketch of this setup appears after the example commands below.

**Example usage:**

For Open LLaMA:

```bash
python finetune.py \
    --base_model 'openlm-research/open_llama_3b_600bt_preview' \
    --data_path '../datasets/dolly.json' \
    --num_epochs=3 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-3b' \
    --lora_r=16 \
    --lora_target_modules='[q_proj,v_proj]'
```

For RedPajama:

```bash
python finetune.py \
    --base_model='togethercomputer/RedPajama-INCITE-Base-3B-v1' \
    --data_path='../datasets/dolly.json' \
    --num_epochs=3 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-rp-3b-t1' \
    --lora_r=16 \
    --lora_target_modules='["query_key_value"]'
```

For StableLM:

```bash
python finetune.py \
    --base_model='stabilityai/stablelm-base-alpha-3b' \
    --data_path='../datasets/dolly.json' \
    --num_epochs=3 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-st-3b-t1' \
    --lora_r=16 \
    --lora_target_modules='["query_key_value"]'
```

For Pythia:

```bash
python finetune.py \
    --base_model='EleutherAI/pythia-6.9b-deduped' \
    --data_path='../datasets/dolly.json' \
    --num_epochs=1 \
    --cutoff_len=512 \
    --group_by_length \
    --output_dir='./dolly-lora-pyt-6b-t1' \
    --lora_r=8 \
    --lora_target_modules='["query_key_value"]'
```

We can also tweak our hyperparameters (similar to alpaca-lora):

```bash
python finetune.py \
    --base_model 'openlm-research/open_llama_3b_600bt_preview' \
    --data_path 'yahma/alpaca-cleaned' \
    --output_dir './lora-alpaca' \
    --batch_size 128 \
    --micro_batch_size 4 \
    --num_epochs 3 \
    --learning_rate 1e-4 \
    --cutoff_len 512 \
    --val_set_size 2000 \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_dropout 0.05 \
    --lora_target_modules '[q_proj,v_proj]' \
    --train_on_inputs \
    --group_by_length
```
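For readers who want to see what this looks like in code, here is a minimal sketch of the QLoRA setup, assuming recent `transformers`, `peft`, and `bitsandbytes` releases. It is an illustration, not the exact contents of `finetune.py`; the model name and `r`/`target_modules` values mirror the Open LLaMA example above, and `lora_alpha`/`lora_dropout` mirror the hyperparameter-tweak example.

```python
# Minimal QLoRA setup sketch (illustrative; not the exact code in finetune.py).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "openlm-research/open_llama_3b_600bt_preview"

# Load the frozen base model in 4-bit NF4, as described in the QLoRA paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Cast norms/embeddings to fp32 and prepare the quantized model for training.
model = prepare_model_for_kbit_training(model)

# Attach trainable LoRA adapters to the attention projections.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # ["query_key_value"] for GPT-NeoX-style models
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights require grad
```

From here, training proceeds as usual on the tokenized prompts, and saving the model writes only the small adapter weights, which is what `generate.py` later loads via `--lora_weights`.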
## Inference (generate.py)

This file reads the foundation model from the Hugging Face model hub and the LoRA weights from the trained PEFT model, then runs a Gradio interface for inference on a specified input. Users should treat this as example code for the use of the model and modify it as needed. A rough sketch of doing the same thing without the UI appears at the end of this README.

**Example usage:**

For Open LLaMA:

```bash
python generate.py \
    --base_model 'openlm-research/open_llama_3b_600bt_preview' \
    --lora_weights './lora-alpaca'
```

For RedPajama:

```bash
python generate.py \
    --base_model 'togethercomputer/RedPajama-INCITE-Base-3B-v1' \
    --lora_weights './dolly-lora-rp-3b-t1/'
```

For StableLM:

```bash
python generate.py \
    --base_model 'stabilityai/stablelm-base-alpha-3b' \
    --lora_weights './dolly-lora-st-3b-t1'
```

For Pythia:

```bash
python generate.py \
    --base_model 'EleutherAI/pythia-6.9b-deduped' \
    --lora_weights './dolly-lora-pyt-6b-t1'
```

# Acknowledgements

We would like to express our heartfelt gratitude to **Meta** for releasing LLaMA. Without this pioneering technology, the foundations of projects like **Open LLaMA** and **Alpaca** wouldn't exist. We sincerely appreciate the immense contributions you've made to the field.

Our acknowledgements also extend to the teams behind **Open LLaMA**, **Together Computer**, **Alpaca**, and **Alpaca LoRA**. You can find more about their excellent work in their respective GitHub repositories:

- [Open LLaMA](https://github.com/openlm-research/open_llama)
- [Together Computer](https://github.com/togethercomputer)
- [Alpaca](https://github.com/tatsu-lab/stanford_alpaca)
- [Alpaca LoRA](https://github.com/tloen/alpaca-lora)

Lastly, we would like to express our thanks to the developers of **QLoRA** and **bitsandbytes**. Your efforts have been instrumental in advancing the field, and we're grateful for your contributions. More information about these projects can be found at:

- [QLoRA](https://github.com/artidoro/qlora)
- [bitsandbytes](https://github.com/TimDettmers/bitsandbytes)

Thank you all for your commitment to innovation and for making these projects possible.
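As referenced in the Inference section above, here is a minimal sketch of loading a trained adapter programmatically, without the Gradio UI. It assumes recent `transformers` and `peft` releases and reuses the Open LLaMA paths from the examples; it is an illustration, not the exact contents of `generate.py`, and the prompt template shown is a hypothetical Alpaca-style example that should be replaced with whatever template was used during training.

```python
# Minimal programmatic inference sketch (illustrative; not the exact generate.py code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "openlm-research/open_llama_3b_600bt_preview"
lora_weights = "./lora-alpaca"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the trained LoRA adapter on top of the frozen base model.
model = PeftModel.from_pretrained(model, lora_weights)
model.eval()

# Hypothetical Alpaca-style prompt; match the template used at training time.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nName three primary colors.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs, max_new_tokens=128, do_sample=True, temperature=0.7
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```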