¡Say hi to Axolotl! A small-is-powerful instruct-tuned chat model. This is my second build ever in the fine-tuning world: it was hacked together in about 48 hours and executed entirely on one Colab kernel in ~8-9 hours overnight (07/29/23). Enjoy!

This is a test run of Llama-2-13b-chat-hf fine-tuned with the recently popularized quantized PEFT approach: bitsandbytes with `--bf16`, QLoRA, and Flash Attention (with einops and ninja Ampere optimizations), on a single Nvidia A100 GPU for ~9 hours. The model was fine-tuned for 3 epochs on a 40k-example slice of the Open-Orca dataset, which I postprocessed, augmented with some self-collected contextual Q&A chat data, and templated so that every example follows a standard chat-instruct prompt format.

Benchmarks are at least as good as (if not slightly better than) other fine-tuned Llama/Alpaca/Guanaco/Vicuna models of this scale. The real evaluation and benchmarking are still to come, specifically against stabilityai/StableBeluga13B, which seems to be the most popular Llama-2 + Open-Orca combination to date. This is simply a proof of concept (hence the dev tag) -- come back later once we've released a model for production.
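The quantized PEFT recipe above (4-bit base model via bitsandbytes, bf16 compute, low-rank QLoRA adapters) can be sketched roughly as follows. This is a minimal sketch, assuming the Hugging Face `transformers` and `peft` libraries; the actual hyperparameters (LoRA rank, alpha, target modules) were not published with this card, so the values below are illustrative only:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with bf16 compute -- the core of the QLoRA setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections; r/alpha values are assumptions
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training then proceeds with a standard `Trainer`/`SFTTrainer` loop over the templated dataset; only the small adapter matrices receive gradients, which is what lets a 13B model fit on a single A100.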
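The templating step -- mapping every example onto one standard chat-instruct prompt format -- can be illustrated like this. A minimal sketch: the actual template used for this model is not specified in the card, so the `### System/User/Assistant` layout and the field names (which match the public Open-Orca columns) are assumptions:

```python
# Hypothetical chat-instruct template; the model's real template is not published here.
PROMPT_TEMPLATE = (
    "### System:\n{system}\n\n"
    "### User:\n{question}\n\n"
    "### Assistant:\n{response}"
)

def format_example(example: dict) -> str:
    """Render one Open-Orca-style record into the standard prompt format."""
    return PROMPT_TEMPLATE.format(
        system=example.get("system_prompt", "You are a helpful assistant."),
        question=example["question"],
        response=example["response"],
    )

sample = {
    "system_prompt": "You are a helpful assistant.",
    "question": "What is QLoRA?",
    "response": "QLoRA trains low-rank adapters on top of a 4-bit quantized base model.",
}
print(format_example(sample))
```

Applying one such function across both the Open-Orca slice and the self-collected Q&A data is what yields a uniform prompt format for all training examples.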
