license: other
inference: false

TheBloke's LLM work is generously supported by a grant from Andreessen Horowitz (a16z)


Airoboros 13B GPTQ 4bit

These files are GPTQ 4-bit model files for Jon Durbin's Airoboros 13B.

They are the result of quantising to 4-bit using GPTQ-for-LLaMa.

Other repositories available

How to easily download and use this model in text-generation-webui

Open the text-generation-webui UI as normal.

  1. Click the Model tab.
  2. Under Download custom model or LoRA, enter TheBloke/Airoboros-13B-GPTQ.
  3. Click Download.
  4. Wait until it says it's finished downloading.
  5. Click the Refresh icon next to Model in the top left.
  6. In the Model drop-down: choose the model you just downloaded, Airoboros-13B-GPTQ.
  7. If you see an error in the bottom right, ignore it - it's temporary.
  8. Fill out the GPTQ parameters on the right: Bits = 4, Groupsize = 128, model_type = Llama
  9. Click Save settings for this model in the top right.
  10. Click Reload the Model in the top right.
  11. Once it says it's loaded, click the Text Generation tab and enter a prompt!
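
The download in steps 2 to 4 can also be scripted outside the UI. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that text-generation-webui keeps each model in a `models/` subfolder named after the repo with `/` replaced by `_`. Both points are assumptions rather than something stated above, and the helper names `webui_model_dir` and `download_model` are hypothetical.

```python
from pathlib import Path

REPO_ID = "TheBloke/Airoboros-13B-GPTQ"  # repo name from step 2 above


def webui_model_dir(repo_id: str, models_root: str = "models") -> Path:
    """Return the folder the model is assumed to live in (hypothetical layout)."""
    return Path(models_root) / repo_id.replace("/", "_")


def download_model(repo_id: str = REPO_ID, models_root: str = "models") -> Path:
    """Download the repo's default branch into the assumed model folder."""
    # Imported here so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = webui_model_dir(repo_id, models_root)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target
```

Calling `download_model()` from the text-generation-webui directory would fetch the files; selecting the model and setting the GPTQ parameters (steps 5 onwards) still happens in the UI.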

Provided files

Compatible file - Airoboros-13B-GPTQ-4bit-128g.no-act-order.safetensors

In the main branch - the default one - you will find Airoboros-13B-GPTQ-4bit-128g.no-act-order.safetensors

This will work with all versions of GPTQ-for-LLaMa. It has maximum compatibility.

It was created without the --act-order parameter to ensure full compatibility.

  • Airoboros-13B-GPTQ-4bit-128g.no-act-order.safetensors
    • Works with all versions of GPTQ-for-LLaMa code, both Triton and CUDA branches
    • Works with AutoGPTQ.
    • Works with text-generation-webui one-click-installers
    • Parameters: Groupsize = 128. No act-order.
    • Command used to create the GPTQ:
      python llama.py /workspace/models/jondurbin_airoboros-13b wikitext2 --wbits 4 --true-sequential --groupsize 128 --save_safetensors /workspace/jon-13b/gptq/Airoboros-13B-GPTQ-4bit-128g.no-act-order.safetensors
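
Loaders such as AutoGPTQ read the quantisation parameters from a small `quantize_config.json` file. The sketch below shows how the flags in the command above would map onto such a file, assuming AutoGPTQ's field names (`bits`, `group_size`, `desc_act`, `true_sequential`). It is illustrative only and is not necessarily the file shipped in this repo.

```python
import json

# Values come from the flags in the command above; the key names are assumed
# to follow AutoGPTQ's quantize_config.json convention.
quantize_config = {
    "bits": 4,                # --wbits 4
    "group_size": 128,        # --groupsize 128
    "desc_act": False,        # created without --act-order
    "true_sequential": True,  # --true-sequential
}

# Serialise the settings the way they would appear in the JSON file.
config_text = json.dumps(quantize_config, indent=2)
print(config_text)
```

The same three values (Bits = 4, Groupsize = 128, no act-order) are what text-generation-webui asks for in its GPTQ parameter fields.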
      

Discord

For further support, and discussions on these models and AI in general, join us at:

TheBloke AI's Discord server

Thanks, and how to contribute.

Thanks to the chirper.ai team!

I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training.

If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects.

Donors will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.

Special thanks to: Aemon Algiz.

Patreon special mentions: Sam, theTransient, Jonathan Leane, Steven Wood, webtim, Johann-Peter Hartmann, Geoffrey Montalvo, Gabriel Tamborski, Willem Michiel, John Villwock, Derek Yates, Mesiah Bishop, Eugene Pentland, Pieter, Chadd, Stephen Murray, Daniel P. Andersen, terasurfer, Brandon Frisco, Thomas Belote, Sid, Nathan LeClaire, Magnesian, Alps Aficionado, Stanislav Ovsiannikov, Alex, Joseph William Delisle, Nikolai Manek, Michael Davis, Junyu Yang, K, J, Spencer Kim, Stefan Sabev, Olusegun Samson, transmissions 11, Michael Levine, Cory Kujawski, Rainer Wilmers, zynix, Kalila, Luke @flexchar, Ajan Kanaga, Mandus, vamX, Ai Maven, Mano Prime, Matthew Berman, subjectnull, Vitor Caleffi, Clay Pascal, biorpg, alfie_i, 阿明, Jeffrey Morgan, ya boyyy, Raymond Fosdick, knownsqashed, Olakabola, Leonard Tan, ReadyPlayerEmma, Enrico Ros, Dave, Talal Aujan, Illia Dulskyi, Sean Connelly, senxiiz, Artur Olbinski, Elle, Raven Klaugh, Fen Risland, Deep Realms, Imad Khwaja, Fred von Graf, Will Dee, usrbinkat, SuperWojo, Alexandros Triantafyllidis, Swaroop Kallakuri, Dan Guido, John Detwiler, Pedro Madruga, Iucharbius, Viktor Bowallius, Asp the Wyvern, Edmond Seymore, Trenton Dambrowitz, Space Cruiser, Spiking Neurons AB, Pyrater, LangChain4j, Tony Hughes, Kacper Wikieł, Rishabh Srivastava, David Ziegler, Luke Pendergrass, Andrey, Gabriel Puliatti, Lone Striker, Sebastain Graf, Pierre Kircher, Randy H, NimbleBox.ai, Vadim, danny, Deo Leter

Thank you to all my generous patrons and donors!

And thank you again to a16z for their generous grant.

Airoboros-13B original model card

Overview

This is a fine-tuned 13B-parameter LLaMA model, using completely synthetic training data created by https://github.com/jondurbin/airoboros

Eval (gpt4 judging)

| model | raw score | gpt-3.5 adjusted score |
| --- | --- | --- |
| airoboros-13b | 17947 | 98.087 |
| gpt35 | 18297 | 100.0 |
| gpt4-x-alpasta-30b | 15612 | 85.33 |
| manticore-13b | 15856 | 86.66 |
| vicuna-13b-1.1 | 16306 | 89.12 |
| wizard-vicuna-13b-uncensored | 16287 | 89.01 |

Individual question scores, with shareGPT links (200 prompts generated by gpt-4)

wv-13b-u is Wizard-Vicuna-13b-Uncensored

airoboros-13b gpt35 gpt4-x-alpasta-30b manticore-13b vicuna-13b-1.1 wv-13b-u link
80 95 70 90 85 60 eval
20 95 40 30 90 80 eval
100 100 100 95 95 100 eval
90 100 85 60 95 100 eval
95 90 80 85 95 75 eval
100 95 90 95 98 92 eval
50 100 80 95 60 55 eval
70 90 80 60 85 40 eval
100 95 50 85 40 60 eval
85 60 55 65 50 70 eval
95 100 85 90 60 75 eval
100 95 70 80 50 85 eval
100 95 80 70 60 90 eval
95 100 70 85 90 90 eval
80 95 90 60 30 85 eval
60 95 0 75 50 40 eval
100 95 90 98 95 95 eval
60 85 40 50 20 0 eval
100 90 85 95 95 80 eval
100 95 100 95 90 95 eval
95 90 96 80 92 88 eval
95 92 90 93 89 91 eval
95 93 90 94 96 92 eval
95 90 93 88 92 85 eval
95 90 85 96 88 92 eval
95 95 90 93 92 91 eval
95 98 80 97 99 96 eval
95 93 90 87 92 89 eval
90 85 95 80 92 75 eval
90 85 95 93 80 92 eval
95 92 90 91 93 89 eval
100 95 90 85 80 95 eval
95 97 93 92 96 94 eval
95 93 94 90 88 92 eval
90 95 98 85 96 92 eval
90 88 85 80 82 84 eval
90 95 85 87 92 88 eval
95 97 96 90 93 92 eval
95 93 92 90 89 91 eval
90 95 93 92 94 91 eval
90 85 95 80 88 75 eval
85 90 95 88 92 80 eval
90 95 92 85 80 87 eval
85 90 95 80 88 75 eval
85 80 75 90 70 82 eval
90 85 95 92 93 80 eval
90 95 75 85 80 70 eval
85 90 80 88 82 83 eval
85 90 95 92 88 80 eval
85 90 80 75 95 88 eval
85 90 80 88 84 92 eval
80 90 75 85 70 95 eval
90 88 85 80 92 83 eval
85 75 90 80 78 88 eval
85 90 80 82 75 88 eval
90 85 40 95 80 88 eval
85 95 90 75 88 80 eval
85 95 90 92 89 88 eval
80 85 75 60 90 70 eval
85 90 87 80 88 75 eval
85 80 75 50 90 80 eval
95 80 90 85 75 82 eval
85 90 80 70 95 88 eval
90 95 70 85 80 75 eval
90 85 70 75 80 60 eval
95 90 70 50 85 80 eval
80 85 40 60 90 95 eval
75 60 80 55 70 85 eval
90 85 60 50 80 95 eval
45 85 60 20 65 75 eval
85 90 30 60 80 70 eval
90 95 80 40 85 70 eval
85 90 70 75 80 95 eval
90 70 50 20 60 40 eval
90 95 75 60 85 80 eval
85 80 60 70 65 75 eval
90 85 80 75 82 70 eval
90 95 80 70 85 75 eval
85 75 30 80 90 70 eval
85 90 50 70 80 60 eval
100 95 98 99 97 96 eval
95 90 92 93 91 89 eval
95 92 90 85 88 91 eval
100 95 98 97 96 99 eval
100 100 100 90 100 95 eval
100 95 98 97 94 99 eval
95 90 92 93 94 91 eval
100 95 98 90 96 95 eval
95 96 92 90 89 93 eval
100 95 93 90 92 88 eval
100 100 98 97 99 100 eval
95 90 92 85 93 94 eval
95 93 90 92 96 91 eval
95 96 92 90 93 91 eval
95 90 92 93 91 89 eval
100 98 95 97 96 99 eval
90 95 85 88 92 87 eval
95 93 90 92 89 88 eval
100 95 97 90 96 94 eval
95 93 90 92 94 91 eval
95 92 90 93 94 88 eval
95 92 60 97 90 96 eval
95 90 92 93 91 89 eval
95 90 97 92 91 93 eval
90 95 93 85 92 91 eval
95 90 40 92 93 85 eval
100 100 95 90 95 90 eval
90 95 96 98 93 92 eval
90 95 92 89 93 94 eval
100 95 100 98 96 99 eval
100 100 95 90 100 90 eval
90 85 88 92 87 91 eval
95 97 90 92 93 94 eval
90 95 85 88 92 89 eval
95 93 90 92 94 91 eval
90 95 85 80 88 82 eval
95 90 60 85 93 70 eval
95 92 94 93 96 90 eval
95 90 85 93 87 92 eval
95 96 93 90 97 92 eval
100 0 0 100 0 0 eval
60 100 0 80 0 0 eval
0 100 60 0 0 90 eval
100 100 0 100 100 100 eval
100 100 100 100 95 100 eval
100 100 100 50 90 100 eval
100 100 100 100 95 90 eval
100 100 100 95 0 100 eval
50 95 20 10 30 85 eval
100 100 60 20 30 40 eval
100 0 0 0 0 100 eval
0 100 60 0 0 80 eval
50 100 20 90 0 10 eval
100 100 100 100 100 100 eval
100 100 100 100 100 100 eval
40 100 95 0 100 40 eval
100 100 100 100 80 100 eval
100 100 100 0 90 40 eval
0 100 100 50 70 20 eval
100 100 50 90 0 95 eval
100 95 90 85 98 80 eval
95 98 90 92 96 89 eval
90 95 75 85 80 82 eval
95 98 50 92 96 94 eval
95 90 0 93 92 94 eval
95 90 85 92 80 88 eval
95 93 75 85 90 92 eval
90 95 88 85 92 89 eval
100 100 100 95 97 98 eval
85 40 30 95 90 88 eval
90 95 92 85 88 93 eval
95 96 92 90 89 93 eval
90 95 85 80 92 88 eval
95 98 65 90 85 93 eval
95 92 96 97 90 89 eval
95 90 92 91 89 93 eval
95 90 80 75 95 90 eval
92 40 30 95 90 93 eval
90 92 85 88 89 87 eval
95 80 90 92 91 88 eval
95 93 92 90 91 94 eval
100 98 95 90 92 96 eval
95 92 80 85 90 93 eval
95 98 90 88 97 96 eval
90 95 85 88 86 92 eval
100 100 100 100 100 100 eval
90 95 85 96 92 88 eval
100 98 95 99 97 96 eval
95 92 70 90 93 89 eval
95 90 88 92 94 93 eval
95 90 93 92 85 94 eval
95 93 90 87 92 91 eval
95 93 90 96 92 91 eval
95 97 85 96 98 90 eval
95 92 90 85 93 94 eval
95 96 92 90 97 93 eval
95 93 96 94 90 92 eval
95 94 93 92 90 89 eval
90 85 95 80 87 75 eval
95 94 92 93 90 96 eval
95 100 90 95 95 95 eval
100 95 85 100 0 90 eval
100 95 90 95 100 95 eval
95 90 60 95 85 80 eval
100 95 90 98 97 99 eval
95 90 85 95 80 92 eval
100 95 100 98 100 90 eval
100 95 80 85 90 85 eval
100 90 95 85 95 100 eval
95 90 85 80 88 92 eval
100 100 0 0 100 0 eval
100 100 100 50 100 75 eval
100 100 0 0 100 0 eval
0 100 0 0 0 0 eval
100 100 50 0 0 0 eval
100 100 100 100 100 95 eval
100 100 50 0 0 0 eval
100 100 0 0 100 0 eval
90 85 80 95 70 75 eval
100 100 0 0 0 0 eval
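
The "gpt-3.5 adjusted score" column in the summary table is consistent with each raw score expressed as a percentage of the gpt35 raw score. That relationship is inferred from the numbers rather than stated by the author, so the sketch below is a check under that assumption. The function name `adjusted_scores` is hypothetical.

```python
# Raw scores copied from the summary table.
raw_scores = {
    "airoboros-13b": 17947,
    "gpt35": 18297,
    "gpt4-x-alpasta-30b": 15612,
    "manticore-13b": 15856,
    "vicuna-13b-1.1": 16306,
    "wizard-vicuna-13b-uncensored": 16287,
}


def adjusted_scores(scores, baseline="gpt35"):
    """Express each raw score as a percentage of the baseline model's raw score."""
    base = scores[baseline]
    return {model: 100.0 * score / base for model, score in scores.items()}


adjusted = adjusted_scores(raw_scores)
for model, value in adjusted.items():
    print(f"{model}: {value:.3f}")
```

Under this reading, gpt35 is 100 by construction and every other model is scored relative to it.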

Training data

I used a jailbreak prompt to generate the synthetic instructions, which resulted in some training data that would likely be censored by other models, such as how-to prompts about synthesizing drugs, making homemade flamethrowers, etc. Mind you, this is all generated by ChatGPT, not me. My goal was to simply test some of the capabilities of ChatGPT when unfiltered (as much as possible), and not to intentionally produce any harmful/dangerous/etc. content.

The jailbreak prompt I used is the default prompt in the python code when using the --uncensored flag: https://github.com/jondurbin/airoboros/blob/main/airoboros/self_instruct.py#L39

I also did a few passes of manual cleanup to remove some bad prompts, but mostly I left the data as-is. Initially, the model was fairly bad at math/extrapolation, closed question-answering (heavy hallucination), and coding, so I did one more fine-tuning pass with additional synthetic instructions aimed at those types of problems.

Both the initial instructions and final-pass fine-tuning instructions will be published soon.

Fine-tuning method

I used the excellent FastChat module, running with:

source /workspace/venv/bin/activate

export NCCL_P2P_DISABLE=1
export NCCL_P2P_LEVEL=LOC

torchrun --nproc_per_node=8 --master_port=20001 /workspace/FastChat/fastchat/train/train_mem.py \
  --model_name_or_path /workspace/llama-13b \
  --data_path /workspace/as_conversations.json \
  --bf16 True \
  --output_dir /workspace/airoboros-uncensored-13b \
  --num_train_epochs 3 \
  --per_device_train_batch_size 20 \
  --per_device_eval_batch_size 20 \
  --gradient_accumulation_steps 2 \
  --evaluation_strategy "steps" \
  --eval_steps 500 \
  --save_strategy "steps" \
  --save_steps 500 \
  --save_total_limit 10 \
  --learning_rate 2e-5 \
  --weight_decay 0. \
  --warmup_ratio 0.04 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --fsdp "full_shard auto_wrap offload" \
  --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
  --tf32 True \
  --model_max_length 2048 \
  --gradient_checkpointing True \
  --lazy_preprocess True

This ran on 8x NVIDIA A100 80GB GPUs for about 40 hours.
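
As a quick sanity check on the settings above, the effective batch size and total GPU time follow from simple arithmetic. This is a sketch using only the values in the command and the run time quoted above; it assumes one process per GPU and that all 8 GPUs were busy for the full run.

```python
# Values copied from the torchrun command and the run time stated above.
nproc_per_node = 8                 # one process per GPU (assumed)
per_device_train_batch_size = 20
gradient_accumulation_steps = 2
wall_clock_hours = 40              # "about 40 hours"

# Sequences contributing to each optimizer step across all GPUs.
effective_batch_size = (
    nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps
)

# Total GPU time, assuming every GPU ran for the whole wall-clock duration.
gpu_hours = nproc_per_node * wall_clock_hours

print(effective_batch_size)  # 320
print(gpu_hours)             # 320
```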

train/loss

eval/loss

Prompt format

The prompt should be 1:1 compatible with the FastChat/vicuna format, e.g.:

With a preamble:

A chat between a curious user and an artificial intelligence assistant.  The assistant gives helpful, detailed, and polite answers to the user's questions.

USER: [prompt]
<\s>

ASSISTANT:

Or just:

USER: [prompt]
<\s>

ASSISTANT:
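
The two layouts above can be assembled with a small helper. This is a minimal sketch: the function name `build_prompt` and its parameters are hypothetical, and the separator is reproduced literally as written above (`<\s>`). If your loader expects a different end-of-sequence marker, pass it through the `separator` argument.

```python
PREAMBLE = (
    "A chat between a curious user and an artificial intelligence assistant.  "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(user_message: str, with_preamble: bool = True,
                 separator: str = r"<\s>") -> str:
    """Assemble a prompt in the layout shown above."""
    parts = []
    if with_preamble:
        # The trailing newline yields the blank line after the preamble.
        parts.append(PREAMBLE + "\n")
    parts.append(f"USER: {user_message}")
    # The trailing newline yields the blank line before "ASSISTANT:".
    parts.append(separator + "\n")
    parts.append("ASSISTANT:")
    return "\n".join(parts)


print(build_prompt("What is the capital of France?"))
```

Passing `with_preamble=False` produces the shorter form, starting directly at `USER:`.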

License

The model is licensed under the LLaMA model license, and the dataset is licensed under OpenAI's terms because it was generated with ChatGPT. Everything else is free.