---
base_model: cognitivecomputations/MegaDolphin-120b
datasets:
- ehartford/dolphin
- jondurbin/airoboros-2.2.1
- ehartford/samantha-data
- ehartford/WizardLM_evol_instruct_V2_196k_unfiltered_merged_split
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About

Weighted/imatrix quants of https://huggingface.co/cognitivecomputations/MegaDolphin-120b

<!-- provided-files -->
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including on how to concatenate multi-part files.
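As a rough illustration of the concatenation step, here is a minimal Python sketch. It assumes the parts are plain byte-wise splits that only need to be appended in order (which is what "concatenate" suggests); the helper name `concat_parts` is made up for this example and is not part of any tool.

```python
import shutil
from pathlib import Path


def concat_parts(parts, output):
    """Append each part, in the order given, to a single output file.

    Hypothetical helper: assumes the parts are raw byte splits of one file.
    """
    output = Path(output)
    with output.open("wb") as dst:
        for part in parts:
            with Path(part).open("rb") as src:
                # Copy in 16 MiB chunks so multi-gigabyte parts never sit in memory.
                shutil.copyfileobj(src, dst, 16 * 1024 * 1024)
    return output
```

For example, `concat_parts(["MegaDolphin-120b.i1-Q4_K_M.gguf.split-aa", "MegaDolphin-120b.i1-Q4_K_M.gguf.split-ab"], "MegaDolphin-120b.i1-Q4_K_M.gguf")` would produce a single file from the two downloaded parts. The order of the list matters, and the joined file needs as much free disk space as both parts combined.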

## Provided Quants

(sorted by size, not necessarily quality. IQ-quants are often preferable over similarly sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ1_S.gguf) | i1-IQ1_S | 25.7 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ1_M.gguf) | i1-IQ1_M | 27.8 | mostly desperate |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 32.2 |  |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ2_XS.gguf) | i1-IQ2_XS | 35.8 |  |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ2_S.gguf) | i1-IQ2_S | 37.2 |  |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ2_M.gguf) | i1-IQ2_M | 40.5 |  |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q2_K.gguf) | i1-Q2_K | 44.6 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 47.3 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ3_XS.gguf) | i1-IQ3_XS | 49.3 |  |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_XS.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_XS.gguf.split-ab) | i1-Q3_K_XS | 49.3 |  |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ3_S.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ3_S.gguf.part2of2) | i1-IQ3_S | 52.1 | beats Q3_K* |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_S.gguf.split-ab) | i1-Q3_K_S | 52.2 | IQ3_XS probably better |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ3_M.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ3_M.gguf.part2of2) | i1-IQ3_M | 53.8 |  |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_M.gguf.split-ab) | i1-Q3_K_M | 58.2 | IQ3_S probably better |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_L.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q3_K_L.gguf.split-ab) | i1-Q3_K_L | 63.4 | IQ3_M probably better |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ4_XS.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-IQ4_XS.gguf.part2of2) | i1-IQ4_XS | 64.3 |  |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q4_0.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q4_0.gguf.part2of2) | i1-Q4_0 | 68.1 | fast, low quality |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q4_K_S.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q4_K_S.gguf.split-ab) | i1-Q4_K_S | 68.7 | optimal size/speed/quality |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q4_K_M.gguf.split-aa) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q4_K_M.gguf.split-ab) | i1-Q4_K_M | 72.6 | fast, recommended |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q5_K_S.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q5_K_S.gguf.part2of2) | i1-Q5_K_S | 82.9 |  |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q5_K_M.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q5_K_M.gguf.part2of2) | i1-Q5_K_M | 85.1 |  |
| [PART 1](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/MegaDolphin-120b-i1-GGUF/resolve/main/MegaDolphin-120b.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 98.8 | practically like static Q6_K |
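If it helps to narrow the list down, the sketch below picks the largest quant whose file size fits a given budget. The sizes are copied from the table above; the function name `largest_quant_within` and the idea of selecting purely by size are assumptions made for illustration. Size is not the same as quality (see the notes column), and actually running a model needs memory beyond the file size.

```python
# (type, size in GB) pairs copied from the table above.
QUANTS = [
    ("i1-IQ1_S", 25.7), ("i1-IQ1_M", 27.8), ("i1-IQ2_XXS", 32.2),
    ("i1-IQ2_XS", 35.8), ("i1-IQ2_S", 37.2), ("i1-IQ2_M", 40.5),
    ("i1-Q2_K", 44.6), ("i1-IQ3_XXS", 47.3), ("i1-IQ3_XS", 49.3),
    ("i1-Q3_K_XS", 49.3), ("i1-IQ3_S", 52.1), ("i1-Q3_K_S", 52.2),
    ("i1-IQ3_M", 53.8), ("i1-Q3_K_M", 58.2), ("i1-Q3_K_L", 63.4),
    ("i1-IQ4_XS", 64.3), ("i1-Q4_0", 68.1), ("i1-Q4_K_S", 68.7),
    ("i1-Q4_K_M", 72.6), ("i1-Q5_K_S", 82.9), ("i1-Q5_K_M", 85.1),
    ("i1-Q6_K", 98.8),
]


def largest_quant_within(budget_gb, quants=QUANTS):
    """Return the largest (type, size) pair that fits the budget, or None.

    Hypothetical helper: it compares file sizes only and ignores quality notes.
    """
    fitting = [q for q in quants if q[1] <= budget_gb]
    # On equal sizes, max() keeps the entry that appears first in the list.
    return max(fitting, key=lambda q: q[1]) if fitting else None
```

Treat the result as a starting point: the notes column may point to a slightly smaller IQ quant that is probably better than the one selected by size alone.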

Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## FAQ / Model Request

See https://huggingface.co/mradermacher/model_requests for answers to
questions you might have, or if you want some other model quantized.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.

<!-- end -->