Jared Van Bortel committed on
Commit 1a21b22
1 Parent(s): 6d68b9d

add GGUF quants

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.gguf filter=lfs diff=lfs merge=lfs -text
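The added rule routes every `*.gguf` file through Git LFS. It is the kind of line that `git lfs track "*.gguf"` writes. As a minimal sketch that does not need git-lfs installed, the same rule can be appended by hand. The `ATTRS_FILE` override is a hypothetical name used only for illustration.

```shell
# Assumption: run from the repository root. ATTRS_FILE is a hypothetical
# override for illustration; by default the snippet edits .gitattributes.
attrs_file=${ATTRS_FILE:-.gitattributes}
rule='*.gguf filter=lfs diff=lfs merge=lfs -text'

# Append the rule only when an identical line is not already present,
# so running the snippet twice does not duplicate it.
if ! grep -qxF -e "$rule" "$attrs_file" 2>/dev/null; then
  printf '%s\n' "$rule" >> "$attrs_file"
fi
```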
README.md CHANGED
@@ -1,3 +1,75 @@
  ---
+ base_model: nomic-ai/nomic-embed-text-v1.5
+ inference: false
+ language:
+ - en
  license: apache-2.0
+ model_creator: Nomic
+ model_name: nomic-embed-text-v1.5
+ model_type: bert
+ pipeline_tag: sentence-similarity
+ quantized_by: Nomic
+ tags:
+ - feature-extraction
+ - sentence-similarity
  ---
+
+ # nomic-embed-text-v1.5 - GGUF
+
+ Original model: [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5)
+
+
+ ## Description
+
+ This repo contains llama.cpp-compatible files for [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) in GGUF format.
+
+ llama.cpp will default to 2048 tokens of context with these files. To use the full 8192 tokens that Nomic Embed is benchmarked on, you will have to choose a context extension method. The original model uses Dynamic NTK-Aware RoPE scaling, but that is not currently available in llama.cpp. A combination of YaRN and linear scaling is an acceptable substitute.
+
+ These files were converted and quantized with llama.cpp commit [594fca3fe](https://github.com/ggerganov/llama.cpp/commit/594fca3fefe27b8e95cfb1656eb0e160ad15a793).
+
+ ## Example `llama.cpp` Command
+
+ Compute a single embedding:
+ ```shell
+ ./embedding -ngl 99 -m nomic-embed-text-v1.5.f16.gguf -c 8192 -b 8192 --rope-scaling yarn --rope-freq-scale .75 -p 'search_query: What is TSNE?'
+ ```
+
+ You can also submit a batch of texts to embed, as long as the total number of tokens does not exceed the context length. Only the first three embeddings are shown by the `embedding` example.
+
+ texts.txt:
+ ```
+ search_query: What is TSNE?
+ search_query: Who is Laurens Van der Maaten?
+ ```
+
+ Compute multiple embeddings:
+ ```shell
+ ./embedding -ngl 99 -m nomic-embed-text-v1.5.f16.gguf -c 8192 -b 8192 --rope-scaling yarn --rope-freq-scale .75 -f texts.txt
+ ```
+
+
+ ## Compatibility
+
+ These files are compatible with llama.cpp as of commit [ea9c8e114](https://github.com/ggerganov/llama.cpp/commit/ea9c8e11436ad50719987fa23a289c74b7b40d40) from February 13, 2024.
+
+
+ ## Provided Files
+
+ The table below shows the mean squared error of the embeddings produced by these quantizations of Nomic Embed relative to the Sentence Transformers implementation.
+
+ Name | Quant | Size | MSE
+ -----|-------|------|-----
+ [nomic-embed-text-v1.5.Q2\_K.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q2_K.gguf) | Q2\_K | 48 MiB | 2.33e-03
+ [nomic-embed-text-v1.5.Q3\_K\_S.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q3_K_S.gguf) | Q3\_K\_S | 57 MiB | 1.19e-03
+ [nomic-embed-text-v1.5.Q3\_K\_M.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q3_K_M.gguf) | Q3\_K\_M | 65 MiB | 8.26e-04
+ [nomic-embed-text-v1.5.Q3\_K\_L.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q3_K_L.gguf) | Q3\_K\_L | 69 MiB | 7.93e-04
+ [nomic-embed-text-v1.5.Q4\_0.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q4_0.gguf) | Q4\_0 | 75 MiB | 6.32e-04
+ [nomic-embed-text-v1.5.Q4\_K\_S.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q4_K_S.gguf) | Q4\_K\_S | 75 MiB | 6.71e-04
+ [nomic-embed-text-v1.5.Q4\_K\_M.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q4_K_M.gguf) | Q4\_K\_M | 81 MiB | 2.42e-04
+ [nomic-embed-text-v1.5.Q5\_0.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q5_0.gguf) | Q5\_0 | 91 MiB | 2.35e-04
+ [nomic-embed-text-v1.5.Q5\_K\_S.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q5_K_S.gguf) | Q5\_K\_S | 91 MiB | 2.00e-04
+ [nomic-embed-text-v1.5.Q5\_K\_M.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q5_K_M.gguf) | Q5\_K\_M | 95 MiB | 6.55e-05
+ [nomic-embed-text-v1.5.Q6\_K.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q6_K.gguf) | Q6\_K | 108 MiB | 5.58e-05
+ [nomic-embed-text-v1.5.Q8\_0.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.Q8_0.gguf) | Q8\_0 | 140 MiB | 5.79e-06
+ [nomic-embed-text-v1.5.f16.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.f16.gguf) | F16 | 262 MiB | 4.21e-10
+ [nomic-embed-text-v1.5.f32.gguf](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/blob/main/nomic-embed-text-v1.5.f32.gguf) | F32 | 523 MiB | 6.08e-11
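To relate the Size column to the byte counts in the Git LFS pointers added by this commit, the following sketch converts bytes to whole MiB. Rounding up is an assumption inferred from rows such as Q2\_K (49361088 bytes listed as 48 MiB), and `bytes_to_mib` is a hypothetical helper name, not part of any tool.

```shell
# bytes_to_mib BYTES: print the size in MiB (1 MiB = 1048576 bytes),
# rounded up to the next whole number (assumed rounding convention).
bytes_to_mib() {
  echo $(( ($1 + 1048575) / 1048576 ))
}

bytes_to_mib 49361088    # Q2_K pointer size -> 48
bytes_to_mib 547664768   # f32 pointer size  -> 523
```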
nomic-embed-text-v1.5.Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:758227679b93e653eb712c442a16fcdf7d65b15a6eefd6ce8053edd7f42d82a4
+ size 49361088
nomic-embed-text-v1.5.Q3_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:139edcebb55846c0c7685b57e25c9758c67f9749a1276a0e886031b9a5bc63ff
+ size 71593088
nomic-embed-text-v1.5.Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7140b9170efe9912ec4be3f19d4492050f7be8dc9492ac99a1892bcadca30e9c
+ size 67169408
nomic-embed-text-v1.5.Q3_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5b13eb7c4ed0ec144435f2a47f49acc04ce79c3081b5bc72f5a717da0905c7a9
+ size 59649152
nomic-embed-text-v1.5.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:01e3f351b0c697201d6901d38b29c9e0affa06c65b040359546c822198aa8f64
+ size 77802880
nomic-embed-text-v1.5.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e3b6f681b81772c5d29b3b4f6b01c0e71ee64febba159ae182d3a238ccb63fb
+ size 84106624
nomic-embed-text-v1.5.Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1acc092631aeb6fd52009c3d2b494fbee5c474ae6275e22bebb7c7d032a71b15
+ size 78097792
nomic-embed-text-v1.5.Q5_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ca07f2eba2354436bb0be187c6df15a1e754975f1f10f299c6513914d95a0b71
+ size 94888768
nomic-embed-text-v1.5.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5b4795c9f018653bea1626d8f95cf09d0ee0fabcbc14dfdca3f7f9320b352bd3
+ size 99588928
nomic-embed-text-v1.5.Q5_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c2c341f67341b9093a409710fb578b3f972f3c18ad2c8a0303cacf99529394cd
+ size 94888768
nomic-embed-text-v1.5.Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e6a20b714d35f2e2f9a1e418f08826ac9303ac998961c26c8634196658a13867
+ size 113042528
nomic-embed-text-v1.5.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:880522894afda99389f2c979aed97c085d0f905eba61ef196307f95245754114
+ size 146146432
nomic-embed-text-v1.5.f16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e4060a62a157f92458ac71dc610a8909a9ad7ba1f1b4aaa52f121e207c1f9c63
+ size 274290560
nomic-embed-text-v1.5.f32.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab33b0bd5d1d3e993c37f9f580c8a17bbb17309997096c5d88308da2c6a4fa63
+ size 547664768
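Each pointer above records the SHA-256 digest (`oid`) and byte size of the real file, so a downloaded `.gguf` can be checked against it. The following is a minimal sketch, assuming GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` would be the usual substitute). `verify_lfs_object` is a hypothetical helper name, not a Git LFS command.

```shell
# verify_lfs_object FILE SHA256_HEX SIZE_BYTES
# Succeeds only if FILE has the byte size and SHA-256 digest recorded
# in its Git LFS pointer. Assumes GNU coreutils `sha256sum`.
verify_lfs_object() {
  file=$1
  want_oid=$2
  want_size=$3
  # wc -c may pad its output with spaces on some systems, so strip them.
  have_size=$(wc -c < "$file" | tr -d '[:space:]') || return 1
  # sha256sum prints "<digest>  <name>"; keep only the digest field.
  have_oid=$(sha256sum -- "$file" | cut -d ' ' -f 1) || return 1
  [ "$have_size" = "$want_size" ] && [ "$have_oid" = "$want_oid" ]
}
```

For example, `verify_lfs_object nomic-embed-text-v1.5.Q2_K.gguf 758227679b93e653eb712c442a16fcdf7d65b15a6eefd6ce8053edd7f42d82a4 49361088 && echo OK` checks the Q2\_K file against the pointer listed above.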