---
base_model: cognitivecomputations/Samantha-1.11-70b
datasets:
- ehartford/samantha-data
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About

Weighted/imatrix quants of https://huggingface.co/cognitivecomputations/Samantha-1.11-70b

<!-- provided-files -->
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including on how to concatenate multi-part files.
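
For the multi-part files, concatenation means joining the parts byte-for-byte in order. Here is a minimal Python sketch of that step; the helper name `concatenate_parts` is made up for illustration and is not part of any GGUF tooling, and the example file names simply follow the `.partNofM` naming used in the table below.

```python
import shutil
from pathlib import Path


def concatenate_parts(parts, output):
    """Join split files byte-for-byte, in the order given (hypothetical helper)."""
    output = Path(output)
    with output.open("wb") as dst:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so large parts never have to fit in RAM.
                shutil.copyfileobj(src, dst)
    return output


# Example call (assumes both parts were already downloaded to the current directory):
# concatenate_parts(
#     ["Samantha-1.11-70b.i1-Q6_K.gguf.part1of2",
#      "Samantha-1.11-70b.i1-Q6_K.gguf.part2of2"],
#     "Samantha-1.11-70b.i1-Q6_K.gguf",
# )
```

The order of the parts matters: part 1 must come first, otherwise the resulting file will not be a valid GGUF file.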

## Provided Quants

(sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ1_S.gguf) | i1-IQ1_S | 15.0 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ1_M.gguf) | i1-IQ1_M | 16.4 | mostly desperate |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 18.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_XS.gguf) | i1-IQ2_XS | 20.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_S.gguf) | i1-IQ2_S | 21.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ2_M.gguf) | i1-IQ2_M | 23.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q2_K.gguf) | i1-Q2_K | 25.9 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 27.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_XS.gguf) | i1-IQ3_XS | 28.6 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 28.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_S.gguf) | i1-IQ3_S | 30.3 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_S.gguf) | i1-Q3_K_S | 30.3 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ3_M.gguf) | i1-IQ3_M | 31.4 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_M.gguf) | i1-Q3_K_M | 33.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q3_K_L.gguf) | i1-Q3_K_L | 36.6 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ4_XS.gguf) | i1-IQ4_XS | 37.2 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-IQ4_NL.gguf) | i1-IQ4_NL | 39.4 | prefer IQ4_XS |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_0.gguf) | i1-Q4_0 | 39.4 | fast, low quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_K_S.gguf) | i1-Q4_K_S | 39.7 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q4_K_M.gguf) | i1-Q4_K_M | 41.8 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q5_K_S.gguf) | i1-Q5_K_S | 47.9 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q5_K_M.gguf) | i1-Q5_K_M | 49.2 |  |
| [PART 1](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Samantha-1.11-70b-i1-GGUF/resolve/main/Samantha-1.11-70b.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 57.0 | practically like static Q6_K |

Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
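
Since the table is sorted by size, one simple way to narrow the choice is to take the largest quant whose file fits a given size budget. The sketch below does only that, using the sizes from the table above; the names `QUANTS`, `largest_quant_within` and `gguf_url` are made up for illustration. It assumes file size is the only constraint, which is a simplification: running a model needs extra memory beyond the file itself, and the notes column and the graph above say more about quality than size does.

```python
REPO = "mradermacher/Samantha-1.11-70b-i1-GGUF"
MODEL = "Samantha-1.11-70b"

# (type, size in GB) for the single-file quants, copied from the table above.
# The two-part Q6_K (57.0 GB) is left out because its links follow a different pattern.
QUANTS = [
    ("i1-IQ1_S", 15.0), ("i1-IQ1_M", 16.4), ("i1-IQ2_XXS", 18.7),
    ("i1-IQ2_XS", 20.8), ("i1-IQ2_S", 21.8), ("i1-IQ2_M", 23.7),
    ("i1-Q2_K", 25.9), ("i1-IQ3_XXS", 27.4), ("i1-IQ3_XS", 28.6),
    ("i1-Q3_K_XS", 28.7), ("i1-IQ3_S", 30.3), ("i1-Q3_K_S", 30.3),
    ("i1-IQ3_M", 31.4), ("i1-Q3_K_M", 33.7), ("i1-Q3_K_L", 36.6),
    ("i1-IQ4_XS", 37.2), ("i1-IQ4_NL", 39.4), ("i1-Q4_0", 39.4),
    ("i1-Q4_K_S", 39.7), ("i1-Q4_K_M", 41.8), ("i1-Q5_K_S", 47.9),
    ("i1-Q5_K_M", 49.2),
]


def largest_quant_within(budget_gb):
    """Return the (type, size) of the largest listed file not exceeding budget_gb, or None."""
    fitting = [q for q in QUANTS if q[1] <= budget_gb]
    return max(fitting, key=lambda q: q[1]) if fitting else None


def gguf_url(quant_type):
    """Build the download link for a single-file quant, following the table's URL pattern."""
    return f"https://huggingface.co/{REPO}/resolve/main/{MODEL}.{quant_type}.gguf"
```

For example, `largest_quant_within(40.0)` picks `i1-Q4_K_S`, and `gguf_url("i1-Q4_K_S")` reproduces the corresponding link from the table.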

## FAQ / Model Request

See https://huggingface.co/mradermacher/model_requests for answers to
questions you might have, or if you want some other model quantized.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.

<!-- end -->