Initial GGML model commit
README.md CHANGED
@@ -51,12 +51,12 @@ Q&A Example
 ```
 Question: {prompt}
 Answer:
-
+```
 
 
 An example of how it handles different roles, which I still like to use explicit instructions for:
 
-
+```
 ### Instruction
 Complete the story in a manner that accurately reflects the scenario summary.
 
@@ -96,7 +96,7 @@ Refer to the Provided Files table below to see what files use which methods, and
 | Name | Quant method | Bits | Size | Max RAM required | Use case |
 | ---- | ---- | ---- | ---- | ---- | ----- |
 | [llama2-22b-daydreamer-v2.ggmlv3.q2_K.bin](https://huggingface.co/TheBloke/llama2-22B-daydreamer-v2-GGML/blob/main/llama2-22b-daydreamer-v2.ggmlv3.q2_K.bin) | q2_K | 2 | 9.22 GB | 11.72 GB | New k-quant method. Uses GGML_TYPE_Q4_K for the attention.vw and feed_forward.w2 tensors, GGML_TYPE_Q2_K for the other tensors. |
-| [llama2-22b-daydreamer-v2.ggmlv3.q3_K_L.bin](https://huggingface.co/TheBloke/llama2-22B-daydreamer-v2-GGML/blob/main/llama2-22b-daydreamer-v2.ggmlv3.q3_K_L.bin) | q3_K_L | 3 |
+| [llama2-22b-daydreamer-v2.ggmlv3.q3_K_L.bin](https://huggingface.co/TheBloke/llama2-22B-daydreamer-v2-GGML/blob/main/llama2-22b-daydreamer-v2.ggmlv3.q3_K_L.bin) | q3_K_L | 3 | 11.61 GB | 14.11 GB | New k-quant method. Uses GGML_TYPE_Q5_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
 | [llama2-22b-daydreamer-v2.ggmlv3.q3_K_M.bin](https://huggingface.co/TheBloke/llama2-22B-daydreamer-v2-GGML/blob/main/llama2-22b-daydreamer-v2.ggmlv3.q3_K_M.bin) | q3_K_M | 3 | 10.57 GB | 13.07 GB | New k-quant method. Uses GGML_TYPE_Q4_K for the attention.wv, attention.wo, and feed_forward.w2 tensors, else GGML_TYPE_Q3_K |
 | [llama2-22b-daydreamer-v2.ggmlv3.q3_K_S.bin](https://huggingface.co/TheBloke/llama2-22B-daydreamer-v2-GGML/blob/main/llama2-22b-daydreamer-v2.ggmlv3.q3_K_S.bin) | q3_K_S | 3 | 9.46 GB | 11.96 GB | New k-quant method. Uses GGML_TYPE_Q3_K for all tensors |
 | [llama2-22b-daydreamer-v2.ggmlv3.q4_0.bin](https://huggingface.co/TheBloke/llama2-22B-daydreamer-v2-GGML/blob/main/llama2-22b-daydreamer-v2.ggmlv3.q4_0.bin) | q4_0 | 4 | 12.34 GB | 14.84 GB | Original quant method, 4-bit. |
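For context on how the prompt template and the files in the table are meant to be used together, here is a minimal sketch (not part of the commit) that loads one of the listed GGML files with llama-cpp-python and fills in the Q&A template. It assumes a llama-cpp-python release old enough to still accept GGML v3 `.bin` files, that the q4_0 file has already been downloaded into the working directory, and the example question is purely illustrative.

```python
# Minimal sketch, not from the commit: run the README's Q&A template against
# one of the GGML files listed in the table, using llama-cpp-python.
# Assumes an older llama-cpp-python version that can load GGML v3 .bin files
# and that the q4_0 file has already been downloaded locally.
from llama_cpp import Llama

llm = Llama(
    model_path="llama2-22b-daydreamer-v2.ggmlv3.q4_0.bin",  # any file from the table
    n_ctx=2048,  # context window size
)

# Fill the {prompt} placeholder from the Q&A template.
question = "What causes the seasons on Earth?"  # illustrative question
prompt = f"Question: {question}\nAnswer:"

result = llm(prompt, max_tokens=256, stop=["Question:"])
print(result["choices"][0]["text"].strip())
```

As a quick sanity check on the table itself, each Max RAM figure is the corresponding file size plus roughly 2.5 GB of overhead, e.g. 12.34 GB + 2.50 GB = 14.84 GB for the q4_0 file.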