TheBloke committed
Commit
590a500
1 Parent(s): 13b7979

Update README.md

Files changed (1):
  1. README.md +5 -3
README.md CHANGED
@@ -85,10 +85,11 @@ model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
  quantize_config=None)

  prompt = "Tell me about AI"
- prompt_template=f'''A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
-
- USER: {prompt}
- ASSISTANT:'''
+ prompt_template=f'''Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+ ### Instruction: {prompt}
+
+ ### Response:'''

  print("\n\n*** Generate:")

@@ -128,6 +129,7 @@ It was created without group_size to lower VRAM requirements, and with --act-ord
  * `chronoboros-33b-GPTQ-4bit--1g.act.order.safetensors`
  * Works with [ExLlama](https://github.com/turboderp/exllama), providing the best performance and lowest VRAM usage. Recommended.
  * Works with AutoGPTQ in CUDA or Triton modes.
+ * Works with [Occ4m's GPTQ-for-LLaMa fork](https://github.com/0cc4m/GPTQ-for-LLaMa).
  * Works with GPTQ-for-LLaMa in CUDA mode. May have issues with GPTQ-for-LLaMa Triton mode.
  * Works with text-generation-webui, including one-click-installers.
  * Parameters: Groupsize = -1. Act Order / desc_act = True.
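
For context, the new prompt-template hunk slots into the README's AutoGPTQ example roughly as follows. This is a minimal sketch, not part of the commit: the repo id `TheBloke/Chronoboros-33B-GPTQ` and the generation settings (`temperature`, `max_new_tokens`) are assumptions based on the surrounding README excerpt.

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Assumed repo id; the commit itself only shows the README diff.
model_name_or_path = "TheBloke/Chronoboros-33B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
                                           use_safetensors=True,
                                           device="cuda:0",
                                           quantize_config=None)

prompt = "Tell me about AI"
# Alpaca-style template introduced by this commit (it replaces the
# earlier Vicuna-style USER/ASSISTANT template):
prompt_template = f'''Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction: {prompt}

### Response:'''

print("\n\n*** Generate:")

# Sampling settings below are illustrative, not prescribed by the commit.
input_ids = tokenizer(prompt_template, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=input_ids, do_sample=True,
                        temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))
```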