Overview

A Llama-2 13B model fine-tuned with QLoRA on the uncensored/unfiltered Wizard-Vicuna conversation dataset digitalpipelines/wizard_vicuna_70k_uncensored.

Available versions of this model

GPTQ models for GPU inference. Multiple quantisation options available.
GGML models in various quantisation sizes for CPU/GPU/Apple M1 usage.
Original unquantised model

Prompt template: Llama-2-Chat

SYSTEM: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
USER: {prompt}
ASSISTANT:
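
As a minimal sketch of filling in the template above, the helper below substitutes a user message for {prompt} and returns the full prompt string. The names `DEFAULT_SYSTEM` and `build_prompt` are hypothetical and not part of this model's tooling. Joining the three parts with single newlines is an assumption based on how the template is laid out here.

```python
# Hypothetical helper: builds a prompt in the SYSTEM/USER/ASSISTANT layout shown above.
DEFAULT_SYSTEM = (
    "You are a helpful, respectful and honest assistant. "
    "Always answer as helpfully as possible, while being safe. "
    "Your answers should not include any harmful, unethical, racist, sexist, "
    "toxic, dangerous, or illegal content. "
    "Please ensure that your responses are socially unbiased and positive in nature. "
    "If a question does not make any sense, or is not factually coherent, "
    "explain why instead of answering something not correct. "
    "If you don't know the answer to a question, please don't share false information."
)


def build_prompt(user_message: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return the template with {prompt} replaced by the user's message."""
    # Assumption: parts are separated by single newlines, as in the template.
    return f"SYSTEM: {system}\nUSER: {user_message.strip()}\nASSISTANT:"


prompt = build_prompt("Explain QLoRA in one sentence.")
print(prompt)
```

The returned string ends at "ASSISTANT:" so that the model's completion supplies the reply. Pass a different `system` argument to replace the default system message.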