RachidAR
/

WizardLM-Uncensored-SCOT-ST-30B-Q3_K_S-GGML

RachidAR committed on Jun 5, 2023

Commit

40b6651

1 Parent(s): 6f906f0

Create README.md

Files changed (1)

README.md +22 -0

README.md ADDED

	@@ -0,0 +1,22 @@

+---
+inference: false
+license: other
+---
+# Monero's WizardLM-Uncensored-SuperCOT-Storytelling-30B GGML
+These files are GGML format model files for [Monero's WizardLM-Uncensored-SuperCOT-Storytelling-30B](https://huggingface.co/Monero/WizardLM-Uncensored-SuperCOT-StoryTelling-30b).
+# Works only with PR 1684: https://github.com/ggerganov/llama.cpp/pull/1684
+## Prompt template
+```
+Optional instruction ("You are a helpful assistant" etc)
+USER: prompt
+ASSISTANT:
+```
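As a minimal sketch of how the template above could be filled in programmatically: the helper name `build_prompt` and its parameters are hypothetical and not part of the model card or of llama.cpp; only the `USER:` / `ASSISTANT:` layout comes from the README.

```python
def build_prompt(user_prompt: str, instruction: str = "") -> str:
    """Assemble a prompt in the USER:/ASSISTANT: layout shown in the README.

    `instruction` is the optional leading line (for example
    "You are a helpful assistant"); it is omitted when empty.
    """
    lines = []
    if instruction:
        lines.append(instruction)
    lines.append(f"USER: {user_prompt}")
    # The trailing "ASSISTANT:" is left open so the model continues from it.
    lines.append("ASSISTANT:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Tell me a short story.", "You are a helpful assistant"))
```

The resulting string can then be passed to whatever front end is used to run the model.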
+*The 3-bit model produces higher-quality output than the 2-bit model, but inference is slower. The 3-bit model (type q3_K_S) barely fits into 16 GB of RAM, but it works.*
+```
+llama_model_load_internal: mem required  = 15716.00 MB (+ 3124.00 MB per state)
+```
+*On my old Xeon E3-1225 v3 (4/8) CPU, it runs at ~715 ms per token.*
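
To put the per-token latency in more familiar terms, the following sketch converts it to throughput and to an estimated wall-clock time for a reply. The 715 ms figure is the one reported above; the function names and the 200-token reply length are assumptions chosen only for illustration.

```python
MS_PER_TOKEN = 715.0  # latency reported in the README for this CPU


def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    return 1000.0 / ms_per_token


def generation_seconds(n_tokens: int, ms_per_token: float) -> float:
    """Estimate the time in seconds to generate n_tokens at a fixed latency."""
    return n_tokens * ms_per_token / 1000.0


if __name__ == "__main__":
    print(f"{tokens_per_second(MS_PER_TOKEN):.2f} tokens/s")
    # 200 tokens is an arbitrary example length, not a measured value.
    print(f"{generation_seconds(200, MS_PER_TOKEN):.0f} s for a 200-token reply")
```

At this rate the model generates roughly 1.4 tokens per second, so a reply of a couple of hundred tokens takes a few minutes.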