Prompt format

#1
by underlines - opened

Thanks for this great merge and quant!

As usual with these merges, the model card mentions multiple prompt formats :) Which one works best for you?

My environment:

  • thebloke/cuda11.8.0-ubuntu22.04-oneclick:latest on runpod
  • 1 x RTX A6000 / 16 vCPU 62 GB RAM
  • ExLlama
    • max_seq_len 4096
    • compress_pos_emb 2
    • LLaMA-Precise (I tried others)
    • Instruction Template: I tried Alpaca + Vicuna v1.1
    • Mode: I tried chat, chat-instruct and instruct
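For context on the `compress_pos_emb 2` setting: SuperHOT-style models extend context by linearly scaling RoPE positions, so a 4096-token sequence is squeezed into the ~2048-position range the base LLaMA was trained on. A minimal sketch of that idea (illustrative only, not ExLlama's actual API):

```python
# Hedged sketch of linear RoPE position scaling ("compress_pos_emb").
# The function name and structure are illustrative assumptions, not
# ExLlama internals.

def scaled_positions(seq_len: int, compress_pos_emb: float) -> list:
    """Map token indices 0..seq_len-1 into the base model's trained
    position range by dividing each index by the compression factor."""
    return [i / compress_pos_emb for i in range(seq_len)]

# With max_seq_len 4096 and compress_pos_emb 2, the largest scaled
# position stays below the 2048 positions base LLaMA was trained on.
positions = scaled_positions(4096, 2.0)
print(max(positions))  # 2047.5
```

This is why the two settings are paired: `compress_pos_emb` should generally equal `max_seq_len / 2048` for these 8K SuperHOT merges loaded at 4096.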

Always gives me gibberish:

[screenshot: gibberish output]

[screenshots: gibberish output]

Yep, the same thing happened to me with these 30B / 33B SuperHOT models (gibberish output).
Tried:
Guanaco-33B-SuperHOT-8K-GPTQ
WizardLM-33B-V1.0-Uncensored-SuperHOT-8K-GPTQ
Wizard-Vicuna-30B-Superhot-8K-GPTQ

13B SuperHOT models seem to work fine.


OS: Ubuntu 22.04
CPU: 32C RAM: 188G
GPU: NVIDIA A10 24G
Driver: 525.105.17
Oobabooga (text-generation-webui) is updated to the latest version too.

https://huggingface.co/TheBloke/Vicuna-33B-1-1-preview-SuperHOT-8K-GPTQ/discussions/1#649c0950272ee9fd6b635ea3

For people using TheBloke's Runpod template: it wasn't pulling the latest ExLlama, but that's now fixed. Restart your pods or update ExLlama manually.

underlines changed discussion status to closed
