|
--- |
|
tags: |
|
- manticore |
|
- guanaco |
|
- uncensored |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
--- |
|
|
# GGML of: |
|
Manticore-13b-Chat-Pyg by [openaccess-ai-collective](https://huggingface.co/openaccess-ai-collective/manticore-13b-chat-pyg), with the Guanaco 13b QLoRA by [TimDettmers](https://huggingface.co/timdettmers/guanaco-13b) applied by [Monero](https://huggingface.co/Monero/Manticore-13b-Chat-Pyg-Guanaco); quantized by [mindrage](https://huggingface.co/mindrage). Uncensored.
|
|
|
12.06.2023: Added versions quantized with the new k-quant method, which loses less precision relative to the compression ratio but is currently slower:

q2_K, q3_KM, q4_KS, q4_KM, q5_KS
|
|
|
Old quant method:

q4_0, q5_0, and q8_0 versions are available
|
|
|
[link to GPTQ Version](https://huggingface.co/mindrage/Manticore-13B-Chat-Pyg-Guanaco-GPTQ-4bit-128g.no-act-order.safetensors) |
|
|
|
--- |
|
|
|
|
|
The files were quantized using a recent llama.cpp build and will therefore only work with llama.cpp versions compiled after May 19th, 2023.
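For reference, a minimal loading sketch using llama-cpp-python (the Python bindings for llama.cpp). This assumes an older llama-cpp-python release that still reads the GGML format (GGUF replaced it in later builds), and the model path is a placeholder for whichever quant file you download:

```python
# Minimal sketch: loading one of the GGML quants with llama-cpp-python.
# Assumes an older llama-cpp-python release that still supports GGML
# (newer builds only read GGUF). The model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./manticore-13b-chat-pyg-guanaco.ggmlv3.q4_0.bin",  # placeholder file name
    n_ctx=2048,  # context window size
)
```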
|
|
|
|
|
The model seems to have benefited noticeably from the additional Guanaco QLoRA.

Its capabilities seem broad, even compared with other Wizard or Manticore models, with the expected weakness at coding. It is very good at in-context learning and, for its class, at reasoning.

It both follows instructions well and works as a chatbot.

Refreshingly, it does not insist on doubling down on narratives to justify earlier hallucinated output as much as similar models do. Its output seems... eerily smart at times.

I believe the model is fully unrestricted/uncensored and will generally not berate the user.
|
|
|
--- |
|
|
|
## Prompting style + settings
|
Presumably due to its very diverse training data, the model accepts a variety of prompting styles with relatively few issues, including the "###" variant, but seems to work best with the template below.

"Naming" the model works great: simply modify the context. Substantial changes in its behaviour can be triggered by appending to "ASSISTANT:", e.g. "ASSISTANT: After careful consideration, thinking step-by-step, my response is:"
|
|
|
user: "USER:" - |
|
bot: "ASSISTANT:" - |
|
context: "This is a conversation between an advanced AI and a human user." |
|
|
|
Turn Template: <|user|> <|user-message|>\n<|bot|><|bot-message|>\n |
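As an illustration, here is a small helper that assembles a prompt according to this template. `build_prompt` and its arguments are hypothetical names chosen for this sketch, not part of the model or any library:

```python
def build_prompt(history, user_message,
                 context="This is a conversation between an advanced AI and a human user."):
    # Follows the turn template "<|user|> <|user-message|>\n<|bot|><|bot-message|>\n"
    # with user = "USER:" and bot = "ASSISTANT:".
    prompt = context + "\n"
    for user_turn, bot_turn in history:
        prompt += f"USER: {user_turn}\nASSISTANT:{bot_turn}\n"
    # Leave the final "ASSISTANT:" open; appending to it (as described
    # above) is where a behaviour-steering suffix would go.
    prompt += f"USER: {user_message}\nASSISTANT:"
    return prompt
```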
|
|
|
Settings that (subjectively) work well without being too deterministic:
|
|
|
- temp: 0.15
- top_p: 0.1
- top_k: 40
- rep penalty: 1.1
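Continuing the llama-cpp-python sketch from above, these settings map directly onto its sampling parameters (the prompt helper is the hypothetical one defined earlier):

```python
# Generation with the settings above, continuing the earlier sketch.
output = llm(
    build_prompt([], "Explain in-context learning in one paragraph."),
    max_tokens=256,
    temperature=0.15,
    top_p=0.1,
    top_k=40,
    repeat_penalty=1.1,   # rep penalty
    stop=["USER:"],       # stop before the model writes the next user turn
)
print(output["choices"][0]["text"])
```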
|
--- |