Commit ca2c162 by Davidqian123 (parent 5c45428): Update README.md
This repo includes **GGUF** quantized models for our Octo-planner model at [NexaAIDev/octopus-planning](https://huggingface.co/NexaAIDev/octopus-planning).

# GGUF Quantization

To run the models, please download them to your local machine using either `git clone` or the [Hugging Face Hub](https://huggingface.co/docs/huggingface_hub/en/guides/download).
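The download step above can be sketched as follows; `<repo-id>` is a placeholder for this repo's id on Hugging Face, which is not spelled out here:

```bash
# Option 1: clone the repo (requires git-lfs for the large GGUF files)
git clone https://huggingface.co/<repo-id>

# Option 2: use the Hugging Face Hub CLI (pip install huggingface_hub)
huggingface-cli download <repo-id> --local-dir ./octopus-planning-gguf
```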
7. Run the model
```bash
ollama run octopus-planning-Q4_K_M "<|user|>Find my presentation for tomorrow's meeting, connect to the conference room projector via Bluetooth, increase the screen brightness, take a screenshot of the final summary slide, and email it to all participants<|end|><|assistant|>"
```
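The command above wraps the user query in `<|user|>`/`<|end|>`/`<|assistant|>` special tokens. A small sketch of building that prompt for an arbitrary query (variable names are illustrative, not part of the repo):

```bash
# Assemble the chat-formatted prompt from a plain-text query
QUERY="Find my presentation for tomorrow's meeting"
PROMPT="<|user|>${QUERY}<|end|><|assistant|>"
echo "$PROMPT"
# Then pass it to the model, e.g.:
# ollama run octopus-planning-Q4_K_M "$PROMPT"
```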
# Quantized GGUF Models Benchmark

| Name | Quant method | Bits | Size | Use Cases |
| ---- | ------------ | ---- | ---- | --------- |
| octopus-planning-Q2_K.gguf | Q2_K | 2 | 1.42 GB | fast but high loss, not recommended |
| octopus-planning-Q3_K.gguf | Q3_K | 3 | 1.96 GB | strongly not recommended |
| octopus-planning-Q3_K_S.gguf | Q3_K_S | 3 | 1.68 GB | strongly not recommended |
| octopus-planning-Q3_K_M.gguf | Q3_K_M | 3 | 1.96 GB | moderate loss, not generally recommended |
| octopus-planning-Q3_K_L.gguf | Q3_K_L | 3 | 2.09 GB | not generally recommended |
| octopus-planning-Q4_0.gguf | Q4_0 | 4 | 2.18 GB | moderate speed, recommended |
| octopus-planning-Q4_1.gguf | Q4_1 | 4 | 2.41 GB | moderate speed, recommended |
| octopus-planning-Q4_K.gguf | Q4_K | 4 | 2.39 GB | moderate speed, recommended |
| octopus-planning-Q4_K_S.gguf | Q4_K_S | 4 | 2.19 GB | fast and accurate, highly recommended |
| octopus-planning-Q4_K_M.gguf | Q4_K_M | 4 | 2.39 GB | fast, recommended |
| octopus-planning-Q5_0.gguf | Q5_0 | 5 | 2.64 GB | fast, recommended |
| octopus-planning-Q5_1.gguf | Q5_1 | 5 | 2.87 GB | very big, prefer Q4 |
| octopus-planning-Q5_K.gguf | Q5_K | 5 | 2.82 GB | big, recommended |
| octopus-planning-Q5_K_S.gguf | Q5_K_S | 5 | 2.64 GB | big, recommended |
| octopus-planning-Q5_K_M.gguf | Q5_K_M | 5 | 2.82 GB | big, recommended |
| octopus-planning-Q6_K.gguf | Q6_K | 6 | 3.14 GB | very big, not generally recommended |
| octopus-planning-Q8_0.gguf | Q8_0 | 8 | 4.06 GB | very big, not generally recommended |
| octopus-planning-F16.gguf | F16 | 16 | 7.64 GB | extremely big, no quantization loss |

_Quantized with llama.cpp_
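As a sketch of how quants like those listed above are typically produced with llama.cpp (the `llama-quantize` binary name and file paths are assumptions based on recent llama.cpp builds, not details stated in this repo):

```bash
# Re-quantize the F16 baseline GGUF down to Q4_K_M using llama.cpp's quantize tool
./llama-quantize octopus-planning-F16.gguf octopus-planning-Q4_K_M.gguf Q4_K_M
```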