---
base_model: GOAT-AI/GOAT-70B-Storytelling
language:
- en
library_name: transformers
license: llama2
model_type: llama
quantized_by: mradermacher
tags:
- facebook
- meta
- pytorch
- llama
- llama-2
- Storywriter
---
## About

weighted/imatrix quants of https://huggingface.co/GOAT-AI/GOAT-70B-Storytelling

<!-- provided-files -->
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
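
For multi-part files (such as the two-part Q6_K quant listed below), the parts can be joined by plain byte concatenation. A minimal Python sketch, assuming both parts have already been downloaded into the current directory:

```python
# Sketch: join a multi-part GGUF download by byte concatenation.
# Filenames match the Q6_K entry in the table below; adjust for other quants.
import shutil

parts = [
    "GOAT-70B-Storytelling.i1-Q6_K.gguf.part1of2",
    "GOAT-70B-Storytelling.i1-Q6_K.gguf.part2of2",
]

with open("GOAT-70B-Storytelling.i1-Q6_K.gguf", "wb") as out:
    for part in parts:
        with open(part, "rb") as src:
            shutil.copyfileobj(src, out)  # streams the copy; avoids loading ~57 GB into RAM
```

(The shell equivalent is simply concatenating the parts in order, e.g. with `cat`.)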

## Provided Quants

(sorted by size, not necessarily quality; IQ quants are often preferable to similarly sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ1_S.gguf) | i1-IQ1_S | 15.0 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ1_M.gguf) | i1-IQ1_M | 16.0 | mostly desperate |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 18.7 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_XS.gguf) | i1-IQ2_XS | 20.8 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_S.gguf) | i1-IQ2_S | 21.8 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ2_M.gguf) | i1-IQ2_M | 23.7 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q2_K.gguf) | i1-Q2_K | 25.9 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 27.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_XS.gguf) | i1-IQ3_XS | 28.6 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 28.7 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_S.gguf) | i1-IQ3_S | 30.3 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_S.gguf) | i1-Q3_K_S | 30.3 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ3_M.gguf) | i1-IQ3_M | 31.4 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_M.gguf) | i1-Q3_K_M | 33.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q3_K_L.gguf) | i1-Q3_K_L | 36.6 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ4_XS.gguf) | i1-IQ4_XS | 37.2 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-IQ4_NL.gguf) | i1-IQ4_NL | 39.4 | prefer IQ4_XS |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q4_0.gguf) | i1-Q4_0 | 39.4 | fast, low quality |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q4_K_S.gguf) | i1-Q4_K_S | 39.7 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q4_K_M.gguf) | i1-Q4_K_M | 41.8 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q5_K_S.gguf) | i1-Q5_K_S | 47.9 |  |
| [GGUF](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q5_K_M.gguf) | i1-Q5_K_M | 49.2 |  |
| [PART 1](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/GOAT-70B-Storytelling-i1-GGUF/resolve/main/GOAT-70B-Storytelling.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 57.0 | practically like static Q6_K |
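
If you prefer scripted downloads to the links above, here is a minimal sketch using the `huggingface_hub` client (the filename is the Q4_K_M entry from the table; any other filename from the table works the same way):

```python
# Sketch: download one quant file from this repo via huggingface_hub.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="mradermacher/GOAT-70B-Storytelling-i1-GGUF",
    filename="GOAT-70B-Storytelling.i1-Q4_K_M.gguf",  # ~41.8 GB, "fast, recommended"
)
print(path)  # local path of the cached download
```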

Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## FAQ / Model Request

See https://huggingface.co/mradermacher/model_requests for answers to
questions you might have, or to request that another model be quantized.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and for providing upgrades to my workstation, enabling
this work in my free time.

<!-- end -->