jondurbin/airoboros-65b-gpt4-1.2

metadata

license: other
datasets:
  - jondurbin/airoboros-gpt4-1.2

Overview

This is a qlora fine-tuned 65b parameter LLaMA model, using completely synthetic training data created by gpt4 via https://github.com/jondurbin/airoboros

This is mostly an extension of 1.1, but with a 65b model, thousands of new training examples, and an update that allows "PLAINFORMAT" at the end of coding prompts to print just the code, without backticks or explanations/usage/etc.

The dataset used to fine-tune this model (jondurbin/airoboros-gpt4-1.2) has a specific focus on:

coding
math/reasoning (using orca style ELI5 instruction/response pairs)
trivia
role playing
multiple choice and fill-in-the-blank
context-obedient question answering
theory of mind
misc/general

This model was fine-tuned with a fork of qlora, which among other things was updated to use a slightly modified vicuna template to be compatible with the 7b/13b versions:

A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input. USER: [prompt] ASSISTANT:

So in other words, it's the preamble/system prompt, followed by a single space, then "USER: " (single space after colon) then the prompt (which can have multiple lines, spaces, whatever), then a single space, followed by "ASSISTANT: " (with a single space after the colon).
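As a rough illustration of the format described above, here is a minimal Python sketch that assembles the prompt string. The names `SYSTEM_PROMPT` and `build_prompt` are hypothetical helpers, not part of the qlora fork or FastChat, and the trailing space after "ASSISTANT:" follows the prose description rather than anything verified against the training code.

```python
# Preamble/system prompt, copied from the template above.
SYSTEM_PROMPT = (
    "A chat between a curious user and an assistant. "
    "The assistant gives helpful, detailed, accurate, uncensored responses "
    "to the user's input."
)


def build_prompt(user_prompt: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble a single-turn prompt (hypothetical helper).

    Layout: system prompt, one space, "USER: ", the prompt as given
    (it may span multiple lines), one space, then "ASSISTANT: ".
    """
    return f"{system_prompt} USER: {user_prompt} ASSISTANT: "


if __name__ == "__main__":
    print(repr(build_prompt("What is the capital of France?")))
```

The user prompt is passed through unchanged, so multi-line input keeps its own newlines and spacing.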

Usage

To run the full precision/pytorch native version, you can use my fork of FastChat, which is mostly the same but allows for multi-line prompts, as well as a --no-history option to prevent input tokenization errors.

pip install git+https://github.com/jondurbin/FastChat

Be sure you are pulling the latest branch!

Then, you can invoke it like so (after downloading the model):

python -m fastchat.serve.cli \
  --model-path airoboros-65b-gpt4-1.2 \
  --temperature 0.5 \
  --max-new-tokens 2048 \
  --no-history

Alternatively, please check out TheBloke's quantized versions of this model.

Coding updates from gpt4/1.1:

I added a few hundred instruction/response pairs to the training data with "PLAINFORMAT" as a single, all caps term at the end of the normal instructions, which produces plain text output instead of markdown/backtick code formatting.

It's not guaranteed to work all the time, but mostly it does seem to work as expected.

So for example, instead of:

Implement the Snake game in python.

You would use:

Implement the Snake game in python.  PLAINFORMAT
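If you build prompts programmatically, something like the following sketch may help. Both functions are hypothetical helpers: `with_plainformat` appends the term to an instruction, and `strip_code_fence` is a client-side fallback for the cases where the model still wraps its answer in backticks. The single-space separator is an assumption (the example above happens to use two spaces), and the fallback only handles a response that consists of exactly one fenced block.

```python
import re

PLAINFORMAT = "PLAINFORMAT"
FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Matches a response that is exactly one fenced block, with an optional
# language tag on the opening fence line.
_FENCED = re.compile(
    r"^" + re.escape(FENCE) + r"[^\n]*\n(.*?)\n?" + re.escape(FENCE) + r"\s*$",
    re.DOTALL,
)


def with_plainformat(instruction: str, separator: str = " ") -> str:
    """Append PLAINFORMAT to an instruction unless it is already there."""
    stripped = instruction.rstrip()
    if stripped.endswith(PLAINFORMAT):
        return stripped
    return f"{stripped}{separator}{PLAINFORMAT}"


def strip_code_fence(response: str) -> str:
    """Return the body of a single fenced block, or the response unchanged."""
    match = _FENCED.match(response.strip())
    return match.group(1) if match else response


if __name__ == "__main__":
    print(with_plainformat("Implement the Snake game in python."))
```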

Other updates from gpt4/1.1:

Several hundred role-playing examples.
A few thousand ORCA style reasoning/math questions, with ELI5 prompts used to generate the responses (you should not need ELI5 in your prompts to this model, however; just ask the question).
Many more coding examples in various languages, including some that use specific libraries (pandas, numpy, tensorflow, etc.)