
Join the Coffee & AI Discord for AI stuff and things!

Probably bad model

Test results show that although this model does produce long outputs, the quality has generally degraded. I'm leaving this up for the time being, but I would recommend one of my other LoRAs instead. As an aside, this model is really, really funny; try it if you want a laugh.

Get the base model here:

Base model quantizations by TheBloke:

  • https://huggingface.co/TheBloke/Llama-2-13B-GGML
  • https://huggingface.co/TheBloke/Llama-2-13B-GPTQ

Prompting for this model:

A brief warning: no alignment has been applied, and no attempt has been made to sanitize or otherwise filter the dataset or the outputs. This is a completely raw model and may behave unpredictably or create scenarios that are unpleasant.

The base Llama2 is a text-completion model. That means it will continue writing the story in whatever direction you point it. This is not an instruct-tuned model, so don't try to give it instructions.

Correct prompting:

He grabbed his sword, his gleaming armor, he readied himself. The battle was coming, he walked into the dawn light and

Incorrect prompting:

Write a story about...

This model has been trained to generate as much text as possible, so you should use some mechanism to force it to stop after N tokens. For example, on one prompt I average about 7,000 output tokens. Make sure you have a maximum sequence length set or it will just keep going forever.
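As a minimal sketch of that stopping mechanism: real inference stacks expose this as a setting (e.g. `max_new_tokens` in the `transformers` `generate` call), but the idea reduces to capping the generation loop. The `next_token_fn` and `endless` names below are illustrative stand-ins, not part of any real API.

```python
# Sketch: cap a generation loop at max_new_tokens so a model that never
# emits an end-of-sequence token cannot run forever.
def generate_capped(next_token_fn, prompt_tokens, max_new_tokens=512):
    """Append tokens until the model stops (returns None) or the cap is hit."""
    out = list(prompt_tokens)
    for _ in range(max_new_tokens):
        tok = next_token_fn(out)
        if tok is None:  # model signalled end-of-sequence
            break
        out.append(tok)
    return out

# Toy next-token function that never stops, mimicking this model's behaviour.
endless = lambda ctx: len(ctx)

tokens = generate_capped(endless, [0, 1, 2], max_new_tokens=100)
# The result is the 3 prompt tokens plus at most 100 new ones.
```

With a real backend the same effect comes from passing `max_new_tokens` (or an equivalent max-length setting) to the generation call instead of relying on the model to stop itself.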

Training procedure

22,000 steps @ 7 epochs. Final training loss of 1.8. Total training time was 30 hours on a single 3090 TI.

PEFT:

The following bitsandbytes quantization config was used during training:

  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32
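For reference, the quantization settings above can be reproduced at load time with a `BitsAndBytesConfig` from the `transformers` library. This is a hedged sketch, not the author's exact training script; the base-model repo id is an assumption.

```python
# Rough reconstruction of the training-time quantization config.
# Field names mirror the bulleted list above; defaults cover the rest.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,               # load_in_4bit: True
    bnb_4bit_quant_type="fp4",       # bnb_4bit_quant_type: fp4
    bnb_4bit_use_double_quant=False, # bnb_4bit_use_double_quant: False
    bnb_4bit_compute_dtype=torch.float32,  # bnb_4bit_compute_dtype: float32
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf",     # assumed base model id
    quantization_config=bnb_config,
    device_map="auto",
)
```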

Framework versions

  • PEFT 0.5.0.dev0