digitalpipelines committed
Commit
4035a0c
1 Parent(s): f78c64d

Update README.md

Files changed (1): README.md +2 -1
README.md CHANGED
@@ -8,7 +8,8 @@ datasets:
 Fine-tuned [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b) with an uncensored/unfiltered Wizard-Vicuna conversation dataset, [digitalpipelines/wizard_vicuna_70k_uncensored](https://huggingface.co/datasets/digitalpipelines/wizard_vicuna_70k_uncensored).
 Used QLoRA for fine-tuning, following the process outlined in https://georgesung.github.io/ai/qlora-ift/
 
-A quantized GPTQ model can be found at [digitalpipelines/llama2_7b_chat_uncensored-GPTQ](https://huggingface.co/digitalpipelines/llama2_7b_chat_uncensored-GPTQ)
+- A GPTQ quantized model can be found at [digitalpipelines/llama2_7b_chat_uncensored-GPTQ](https://huggingface.co/digitalpipelines/llama2_7b_chat_uncensored-GPTQ)
+- GGML 2-, 3-, 4-, 5-, 6-, and 8-bit quantized models for CPU+GPU inference can be found at [digitalpipelines/llama2_7b_chat_uncensored-GGML](https://huggingface.co/digitalpipelines/llama2_7b_chat_uncensored-GGML)
 
 # Prompt style
 The model was trained with the following prompt style:
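
For context on the QLoRA fine-tuning mentioned in the README above: QLoRA amounts to loading the base model with 4-bit quantized weights and training LoRA adapters on top. Below is a minimal sketch assuming the Hugging Face `transformers`, `peft`, and `bitsandbytes` libraries; the authoritative recipe is the blog post linked in the diff, and every hyperparameter here is illustrative rather than the value actually used.

```python
# Hypothetical QLoRA setup sketch -- not the repo's documented recipe.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit base weights
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16
)

base = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b",        # base model named in the README
    quantization_config=bnb_config,
    device_map="auto",
)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,  # illustrative values only
    target_modules=["q_proj", "v_proj"],     # common choice for LLaMA blocks
    task_type="CAUSAL_LM",
)

# The adapter-wrapped model is what gets trained on the conversation dataset.
model = get_peft_model(base, lora)
model.print_trainable_parameters()
```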
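The two quantized variants linked in the diff can be loaded for inference; the sketch below is hypothetical, assuming AutoGPTQ for the GPTQ weights and ctransformers for the GGML weights. Neither library is prescribed by this repo, and exact arguments (e.g. which quantization file to pick) depend on how the weights are packaged.

```python
# Hypothetical inference sketch -- not part of this repo's documented workflow.

# GPTQ variant via AutoGPTQ (GPU inference):
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

gptq_repo = "digitalpipelines/llama2_7b_chat_uncensored-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(gptq_repo)
model = AutoGPTQForCausalLM.from_quantized(gptq_repo, device="cuda:0")

# Format the input per the "Prompt style" section of the README.
inputs = tokenizer("Hello!", return_tensors="pt").to("cuda:0")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))

# GGML variant via ctransformers (CPU, with optional GPU offload):
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "digitalpipelines/llama2_7b_chat_uncensored-GGML",
    model_type="llama",  # pass model_file=... to select a specific bit width
)
print(llm("Hello!", max_new_tokens=64))
```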