bartowski committed
Commit 7290581
1 Parent(s): 4695e01

Llamacpp quants

.gitattributes CHANGED
@@ -33,3 +33,19 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-IQ3_S.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-IQ4_NL.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-IQ4_XS.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q3_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q3_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q5_0.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ gemma-1.1-2b-it-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ library_name: transformers
+ widget:
+ - messages:
+   - role: user
+     content: How does the brain work?
+ inference:
+   parameters:
+     max_new_tokens: 200
+ extra_gated_heading: Access Gemma on Hugging Face
+ extra_gated_prompt: >-
+   To access Gemma on Hugging Face, you’re required to review and agree to
+   Google’s usage license. To do this, please ensure you’re logged in to Hugging
+   Face and click below. Requests are processed immediately.
+ extra_gated_button_content: Acknowledge license
+ license: gemma
+ quantized_by: bartowski
+ pipeline_tag: text-generation
+ ---
+
+ ## Llamacpp Quantizations of gemma-1.1-2b-it
+
+ Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b2589">b2589</a> for quantization.
+
+ Original model: https://huggingface.co/google/gemma-1.1-2b-it
+
+ Download a single file (not the whole branch) from below:
+
+ | Filename | Quant type | File Size | Description |
+ | -------- | ---------- | --------- | ----------- |
+ | [gemma-1.1-2b-it-Q8_0.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q8_0.gguf) | Q8_0 | 2.66GB | Extremely high quality, generally unneeded but max available quant. |
+ | [gemma-1.1-2b-it-Q6_K.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q6_K.gguf) | Q6_K | 2.06GB | Very high quality, near perfect, *recommended*. |
+ | [gemma-1.1-2b-it-Q5_K_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q5_K_M.gguf) | Q5_K_M | 1.83GB | High quality, *recommended*. |
+ | [gemma-1.1-2b-it-Q5_K_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q5_K_S.gguf) | Q5_K_S | 1.79GB | High quality, *recommended*. |
+ | [gemma-1.1-2b-it-Q5_0.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q5_0.gguf) | Q5_0 | 1.79GB | High quality, older format, generally not recommended. |
+ | [gemma-1.1-2b-it-Q4_K_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q4_K_M.gguf) | Q4_K_M | 1.63GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
+ | [gemma-1.1-2b-it-Q4_K_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q4_K_S.gguf) | Q4_K_S | 1.55GB | Slightly lower quality with small space savings. |
+ | [gemma-1.1-2b-it-IQ4_NL.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ4_NL.gguf) | IQ4_NL | 1.56GB | Decent quality, similar to Q4_K_S, newer quantization method, *recommended*. |
+ | [gemma-1.1-2b-it-IQ4_XS.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ4_XS.gguf) | IQ4_XS | 1.50GB | Decent quality, newer method with performance similar to Q4. |
+ | [gemma-1.1-2b-it-Q4_0.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q4_0.gguf) | Q4_0 | 1.55GB | Decent quality, older format, generally not recommended. |
+ | [gemma-1.1-2b-it-Q3_K_L.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q3_K_L.gguf) | Q3_K_L | 1.46GB | Lower quality but usable, good for low-RAM setups. |
+ | [gemma-1.1-2b-it-Q3_K_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q3_K_M.gguf) | Q3_K_M | 1.38GB | Even lower quality. |
+ | [gemma-1.1-2b-it-IQ3_M.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ3_M.gguf) | IQ3_M | 1.30GB | Medium-low quality, newer method with decent performance. |
+ | [gemma-1.1-2b-it-IQ3_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-IQ3_S.gguf) | IQ3_S | 1.28GB | Lower quality, newer method with decent performance, recommended over the Q3 quants. |
+ | [gemma-1.1-2b-it-Q3_K_S.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q3_K_S.gguf) | Q3_K_S | 1.28GB | Low quality, not recommended. |
+ | [gemma-1.1-2b-it-Q2_K.gguf](https://huggingface.co/bartowski/gemma-1.1-2b-it-GGUF/blob/main/gemma-1.1-2b-it-Q2_K.gguf) | Q2_K | 1.15GB | Extremely low quality, *not* recommended. |
+
+ Want to support my work? Visit my ko-fi page here: https://ko-fi.com/bartowski
gemma-1.1-2b-it-IQ3_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:00a5fbfc4e159681da16464f6faf41b73d1d4589d469d5515289a82ae4d408aa
+ size 1308174048
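Each `.gguf` entry in this commit is a Git LFS pointer file rather than the binary itself: three text lines carrying the spec version, the SHA-256 of the real blob, and its byte size. A minimal sketch of parsing that format (the function name is ours), fed the pointer stored above for `gemma-1.1-2b-it-IQ3_M.gguf`:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

# The pointer stored for gemma-1.1-2b-it-IQ3_M.gguf in this commit:
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:00a5fbfc4e159681da16464f6faf41b73d1d4589d469d5515289a82ae4d408aa
size 1308174048
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # 1308174048 bytes, about 1.30GB, matching the IQ3_M table row
```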
gemma-1.1-2b-it-IQ3_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8d3de552d7fb8d864ef2aed4eaadbc49d1213b16490cfc0cc06b4ae9e5296306
+ size 1289234144
gemma-1.1-2b-it-IQ4_NL.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b86787843372c84d80be69b526e1297bfd3927d3f3caf545d37827002c45542
+ size 1560757984
gemma-1.1-2b-it-IQ4_XS.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c57b8d9ae9a42389c2c272dcd25f5e503804e07f663448e2c793663992f4239a
+ size 1501218528
gemma-1.1-2b-it-Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd565dc7ce9f11dc840895c02506dcf5a7f7f595d7837d5eb7816b9fdfa7679d
+ size 1157924576
gemma-1.1-2b-it-Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ee84ba9d80c8fe63ecbb92882dbfe41b6ae65950c63e410962f93a6a9f2570a
+ size 1465591520
gemma-1.1-2b-it-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d8aaa0dab9fae50249b5bed1ee880669fc0199d551ae757e47e9a52a49810e6d
+ size 1383802592
gemma-1.1-2b-it-Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cf52d95e1f23e8a1f29a85a4bc86accc6ad834e03df7eb2e5a432355db60b1dd
+ size 1287980768
gemma-1.1-2b-it-Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5ec0c6be4f9ccd657518e9304c1912c872939884748f0927a23b0d542aa4043c
+ size 1551189728
gemma-1.1-2b-it-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cc2118e1d780fa33582738d8c99223d62c8734b06ef65076c01618d484d081d4
+ size 1630263008
gemma-1.1-2b-it-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:381472785f7f3a4a72c1f76b3351d9eb2686836b8587d389e7d2afb82f6f48f4
+ size 1559840480
gemma-1.1-2b-it-Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:74b19076a50159b0af56ce92270ce53430d2ec5e00e1cddc1fab3c88683f09f7
+ size 1798915808
gemma-1.1-2b-it-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9c19111998d075a9e9f2241c0ebfdd331be0e74e68633637d8a89af832fb3b4e
+ size 1839650528
gemma-1.1-2b-it-Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b2eefbefa4ee3567eb7cbe891caa6550bae83c6c1d6f1a8c4cfe8aef42f09c52
+ size 1798915808
gemma-1.1-2b-it-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6c1e783072c32fdd56a60eb3dff33dac0126e8657b9d7bd558fad7dbeceefad4
+ size 2062124768
gemma-1.1-2b-it-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:769d716580c94c874864da0991e54a53d27ead0760b419e38a2d56bcfc4d4f8d
+ size 2669070048