Update README.md
README.md CHANGED

@@ -8,33 +8,43 @@ pipeline_tag: text-generation
 developers: Kanana LLM
 training_regime: bf16 mixed precision
 base_model:
-
 tags:
 - abliterated
 - uncensored
 ---
 
-
 
 
-This is an uncensored version of [kakaocorp/kanana-nano-2.1b-instruct](https://huggingface.co/kakaocorp/kanana-nano-2.1b-instruct) created with abliteration (see [remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers) to know more about it).
-This is a crude, proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens.
 
 
-
-
-
-
-ollama run huihui_ai/kanana-nano-abliterated
-```
-
-### Donation
-
-If you like it, please click 'like' and follow us for more updates.
-You can follow [x.com/support_huihui](https://x.com/support_huihui) to get the latest model information from huihui.ai.
-
-##### Your donation helps us continue our further development and improvement, a cup of coffee can do it.
-- bitcoin:
-```
-bc1qqnkhuchxw0zqjh2ku3lu4hq45hc6gy84uk70ge
 ```
 developers: Kanana LLM
 training_regime: bf16 mixed precision
 base_model:
+- huihui-ai/kanana-nano-2.1b-instruct-abliterated
 tags:
 - abliterated
 - uncensored
 ---
+# Melvin56/kanana-nano-2.1b-instruct-abliterated-GGUF
 
+Original Model : [huihui-ai/kanana-nano-2.1b-instruct-abliterated](https://huggingface.co/huihui-ai/kanana-nano-2.1b-instruct-abliterated)
 
+All quants are made using the imatrix dataset.
 
 
+| Model  | Size (GB) |
+|:-------|:---------:|
+| Q2_K_S | 0.914 |
+| Q2_K   | 0.931 |
+| Q3_K_M | 1.138 |
+| Q4_K_M | 1.385 |
+| Q5_K_M | 1.568 |
+| Q6_K   | 1.826 |
+| Q8_0   | 2.223 |
+| F16    | 4.177 |
+| F32    | 8.342 |
 
+|          | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL | CLBlast | Vulkan | Kompute |
+| :------- | :--------: | :------------: | :---: | :----: | :-----: | :--: | :-----: | :----: | :-----: |
+| K-quants | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ 🐢⁵ | ✅ 🐢⁵ | ❌ |
+| I-quants | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ | ✅ | Partial¹ | ❌ | ❌ | ❌ |
 ```
+✅: feature works
+🚫: feature does not work
+❓: unknown, please contribute if you can test it yourself
+🐢: feature is slow
+¹: IQ3_S and IQ1_S, see #5886
+²: Only with -ngl 0
+³: Inference is 50% slower
+⁴: Slower than K-quants of comparable size
+⁵: Slower than cuBLAS/rocBLAS on similar cards
+⁶: Only q8_0 and iq4_nl
+```
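A quick way to read the size table the commit adds is to convert file sizes into approximate bits per weight. The sketch below is an illustration, not part of the card: the ~2.1B parameter count is inferred from the model name, sizes are treated as decimal gigabytes, and GGUF metadata overhead is ignored.

```python
# Rough bits-per-weight estimate for the GGUF quants listed in the table.
# Assumption: ~2.1e9 parameters (from the model name), decimal gigabytes,
# and no adjustment for GGUF metadata overhead.
sizes_gb = {
    "Q2_K": 0.931,
    "Q4_K_M": 1.385,
    "Q8_0": 2.223,
    "F16": 4.177,
}
params = 2.1e9

# bits/weight = (size in bytes * 8 bits per byte) / parameter count
bpw = {quant: gb * 1e9 * 8 / params for quant, gb in sizes_gb.items()}

for quant, bits in bpw.items():
    print(f"{quant}: ~{bits:.1f} bits/weight")
```

F16 comes out near 16 bits per weight, which is a useful sanity check that the table's sizes are consistent with a ~2.1B-parameter model.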
|