RachidAR committed on
Commit
40b6651
1 Parent(s): 6f906f0

Create README.md

Files changed (1)
  1. README.md +22 -0
README.md ADDED
@@ -0,0 +1,22 @@
+ ---
+ inference: false
+ license: other
+ ---
+ # Monero's WizardLM-Uncensored-SuperCOT-Storytelling-30B GGML
+ These files are GGML format model files for [Monero's WizardLM-Uncensored-SuperCOT-Storytelling-30B](https://huggingface.co/Monero/WizardLM-Uncensored-SuperCOT-StoryTelling-30b).
+
+ # Works only with PR 1684: https://github.com/ggerganov/llama.cpp/pull/1684
+
+ ## Prompt template
+
+ ```
+ Optional instruction ("You are a helpful assistant" etc)
+ USER: prompt
+ ASSISTANT:
+ ```
+
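As a minimal sketch of how the template above might be filled in programmatically, the helper below assembles a single-turn prompt. The name `build_prompt` and its parameters are hypothetical and are not part of the model card or of llama.cpp; whether the model expects a trailing space after `ASSISTANT:` is not stated in the README, so none is added here.

```python
def build_prompt(user_prompt: str, system: str = "") -> str:
    """Format a single-turn prompt in the USER/ASSISTANT layout.

    `system` is the optional instruction line; it is omitted when empty.
    This helper is an illustration only, not an official API.
    """
    lines = []
    if system:
        lines.append(system)
    lines.append(f"USER: {user_prompt}")
    # The model is expected to continue the text after this marker.
    lines.append("ASSISTANT:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Tell me a short story.", "You are a helpful assistant"))
```

The resulting string can be passed to whichever frontend you use to run the GGML file.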
+ *The 3-bit model produces higher-quality output than the 2-bit model, but inference is slower. The 3-bit model (type q3_K_S) barely fits into 16 GB of RAM, but it works.*
+ ```
+ llama_model_load_internal: mem required = 15716.00 MB (+ 3124.00 MB per state)
+ ```
+ *On my old Xeon E3-1225 v3 CPU (4/8), it runs at ~715 ms per token.*
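To put the two figures above in more familiar units, here is a rough sketch of the arithmetic. It assumes the "MB" in the llama.cpp log line means mebibytes (bytes / 1024 / 1024); the function names and the 200-token example are illustrative and do not come from the README.

```python
MEM_REQUIRED_MB = 15716.00   # from the llama_model_load_internal log line
MS_PER_TOKEN = 715.0         # author's reported speed on a Xeon E3-1225 v3


def mb_to_gib(mb: float) -> float:
    """Convert a size in MB (assumed to be MiB) to GiB."""
    return mb / 1024.0


def tokens_per_second(ms_per_token: float) -> float:
    """Convert per-token latency in milliseconds to throughput."""
    return 1000.0 / ms_per_token


def generation_seconds(n_tokens: int, ms_per_token: float) -> float:
    """Estimate wall-clock time to generate n_tokens at a fixed latency."""
    return n_tokens * ms_per_token / 1000.0


if __name__ == "__main__":
    print(f"model memory: {mb_to_gib(MEM_REQUIRED_MB):.2f} GiB")
    print(f"throughput:   {tokens_per_second(MS_PER_TOKEN):.2f} tokens/s")
    print(f"200 tokens:   {generation_seconds(200, MS_PER_TOKEN):.0f} s")
```

Under that assumption the model weights need roughly 15.3 GiB, and 715 ms per token works out to about 1.4 tokens per second.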