Quant for 3.0

README.md
---
license: apache-2.0
tags:
- moe
quantized_by: bartowski
pipeline_tag: text-generation
---

# Exllama v2 Quantizations of Beyonder-4x7B-v2 at 3.0 bits per weight

Using <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.0.11">turboderp's ExLlamaV2 v0.0.11</a> for quantization.

## The "main" branch only contains the measurement.json; download one of the other branches for the model (see below)

Each branch contains the model quantized to an individual bits-per-weight value; the main branch contains only the measurement.json used for further conversions.

Conversion was done using the default calibration dataset.

Default arguments were used, except when the bits per weight is above 6.0; at that point the lm_head layer is quantized at 8 bits per weight instead of the default 6.

Original model: https://huggingface.co/mlabonne/Beyonder-4x7B-v2

<a href="https://huggingface.co/bartowski/Beyonder-4x7B-v2-exl2/tree/3_5">3.5 bits per weight</a>

<a href="https://huggingface.co/bartowski/Beyonder-4x7B-v2-exl2/tree/3_75">3.75 bits per weight</a>

<a href="https://huggingface.co/bartowski/Beyonder-4x7B-v2-exl2/tree/4_5">4.5 bits per weight</a>

<a href="https://huggingface.co/bartowski/Beyonder-4x7B-v2-exl2/tree/6_5">6.5 bits per weight</a>
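
As a rough guide to choosing a branch, the download size and VRAM footprint scale approximately linearly with bits per weight. A minimal sketch of that arithmetic; the ~24e9 total parameter count for a 4x7B Mixtral-style MoE is an assumption for illustration, not a figure from this card:

```python
def approx_size_gb(n_params: float, bpw: float) -> float:
    """Rough quantized-model size in gigabytes: parameters * bits per weight.

    Ignores format overhead (embeddings, quantization scales, metadata).
    """
    return n_params * bpw / 8 / 1e9

# Assumed figure: a 4x7B Mixtral-style MoE has roughly 24e9 total parameters.
N_PARAMS = 24e9

for branch in ["3_5", "3_75", "4_5", "6_5"]:
    bpw = float(branch.replace("_", "."))  # branch names encode bits per weight
    print(f"{branch}: ~{approx_size_gb(N_PARAMS, bpw):.1f} GB")
```

For example, under that parameter-count assumption the 4.5 bpw branch works out to roughly 13–14 GB, which is why the lower-bpw branches exist for smaller GPUs.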

## Download instructions

With git:

```shell
git clone --single-branch --branch 3_0 https://huggingface.co/bartowski/Beyonder-4x7B-v2-exl2
```

With huggingface hub (credit to TheBloke for instructions):

```shell
pip3 install huggingface-hub
```

To download the `main` branch (only useful if you just want the measurement.json) to a folder called `Beyonder-4x7B-v2-exl2`:

```shell
mkdir Beyonder-4x7B-v2-exl2
huggingface-cli download bartowski/Beyonder-4x7B-v2-exl2 --local-dir Beyonder-4x7B-v2-exl2 --local-dir-use-symlinks False
```

To download from a different branch, add the `--revision` parameter:

```shell
mkdir Beyonder-4x7B-v2-exl2
huggingface-cli download bartowski/Beyonder-4x7B-v2-exl2 --revision 3_0 --local-dir Beyonder-4x7B-v2-exl2 --local-dir-use-symlinks False
```
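
The per-branch command varies only in the `--revision` value, so it can be generated programmatically. A small helper as a sketch; the `3_5`/`4_5`-style branch-name convention is taken from the links above, and the default repo id is this model's:

```python
def download_command(bpw: str, repo: str = "bartowski/Beyonder-4x7B-v2-exl2") -> str:
    """Build the huggingface-cli invocation for one quantization branch."""
    local_dir = repo.split("/")[-1]  # e.g. Beyonder-4x7B-v2-exl2
    return (
        f"huggingface-cli download {repo} --revision {bpw} "
        f"--local-dir {local_dir} --local-dir-use-symlinks False"
    )

print(download_command("6_5"))
```

This only formats the command string; run the output in a shell (after `pip3 install huggingface-hub`) to perform the actual download.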
|