WizardLM 13B GGML

This is https://huggingface.co/winddude/wizardLM-LlaMA-LoRA-13 converted into the GGML format used by llama.cpp.

When used in instruct mode, this model is very picky about newlines. For some reason it insists on the following format:

### Instruction:

(instruction text here)

### Response:

(leave this for the model to fill)

Note that Alpaca-style models generally use only one newline between '### Instruction:' / '### Response:' and the text, so I might have somehow messed up the conversion :(
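
If you are scripting against the model directly, here is a minimal sketch of a prompt builder that reproduces this format (the make_prompt helper is just for illustration, not part of any library):

```python
def make_prompt(instruction: str) -> str:
    # Double newlines around the headers: this conversion seems to insist
    # on them, unlike the usual single-newline Alpaca format.
    return (
        "### Instruction:\n\n"
        f"{instruction}\n\n"
        "### Response:\n\n"
    )

print(make_prompt("Summarize what GGML quantization does in one sentence."))
```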

Instructions for running this model in https://github.com/oobabooga/text-generation-webui:

  1. Make sure that PyTorch and llama.cpp are recent enough:
pip install -U --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
pip install -U llamacpp
  2. Move the model to the right directory:
mkdir models/wizardlm-13b/
mv ~/Downloads/ggml-model-q5_1.bin models/wizardlm-13b/
  3. Run the textgen, e.g. with:
python server.py --cpu --threads 8 --chat --model=wizardlm-13b --model_type=llama  --verbose
  4. If using it in instruction mode, you should also fix the prompt. Select the Alpaca template as a base, then go to the 'Character' tab and modify the Turn template with the following string: <|user|>\n\n<|user-message|>\n\n<|bot|>\n\n<|bot-message|>\n\n.

  5. Sometimes the model just won't shut up, so go to the Parameters tab and add the following 'Custom stopping string': "### Instruction:" (the sketch after this list shows the same trick outside the webui).

  6. Steps 4 and 5 could probably be made permanent by modifying settings.json
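
If you want to sanity-check the model outside the webui, here is a minimal sketch using the llama-cpp-python bindings. Note the assumptions: llama-cpp-python is a different package from the `llamacpp` one installed above, and you may need an older release that still reads GGML (pre-GGUF) files. The stop list plays the same role as the 'Custom stopping string' from step 5:

```python
from llama_cpp import Llama  # pip install llama-cpp-python (a GGML-era release)

llm = Llama(
    model_path="models/wizardlm-13b/ggml-model-q5_1.bin",  # path from step 2
    n_threads=8,
)

# The same double-newline format the model insists on (see above).
prompt = "### Instruction:\n\nWhat is the capital of France?\n\n### Response:\n\n"

out = llm(
    prompt,
    max_tokens=256,
    stop=["### Instruction:"],  # same idea as the webui 'Custom stopping string'
)
print(out["choices"][0]["text"])
```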

Please share other tips and tricks at https://huggingface.co/execveat/wizardLM-13b-ggml-q5_1/discussions
