---
tags:
- sparse
- sparsity
- quantized
- onnx
- embeddings
- int8
- mteb
model-index:
- name: gte-large-quant
  results:
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 90.27260027646717
    - type: cos_sim_spearman
      value: 87.97790825077952
    - type: euclidean_pearson
      value: 88.42832241523092
    - type: euclidean_spearman
      value: 87.97248644049293
    - type: manhattan_pearson
      value: 88.13802465778512
    - type: manhattan_spearman
      value: 87.43391995202266
  - task:
      type: STS
    dataset:
      type: mteb/sickr-sts
      name: MTEB SICK-R
      config: default
      split: test
      revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
    metrics:
    - type: cos_sim_pearson
      value: 85.1416039713116
    - type: cos_sim_spearman
      value: 79.13359419669726
    - type: euclidean_pearson
      value: 83.08042050989465
    - type: euclidean_spearman
      value: 79.31565112619433
    - type: manhattan_pearson
      value: 83.10376638254372
    - type: manhattan_spearman
      value: 79.30772376012946
  - task:
      type: STS
    dataset:
      type: mteb/sts12-sts
      name: MTEB STS12
      config: default
      split: test
      revision: a0d554a64d88156834ff5ae9920b964011b16384
    metrics:
    - type: cos_sim_pearson
      value: 84.93030439955828
    - type: cos_sim_spearman
      value: 75.98104622572393
    - type: euclidean_pearson
      value: 81.20791722502764
    - type: euclidean_spearman
      value: 75.74595761987686
    - type: manhattan_pearson
      value: 81.23169425598003
    - type: manhattan_spearman
      value: 75.73065403644094
  - task:
      type: STS
    dataset:
      type: mteb/sts13-sts
      name: MTEB STS13
      config: default
      split: test
      revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
    metrics:
    - type: cos_sim_pearson
      value: 85.6693892097855
    - type: cos_sim_spearman
      value: 87.54973524492165
    - type: euclidean_pearson
      value: 86.55642466103943
    - type: euclidean_spearman
      value: 87.47921340148683
    - type: manhattan_pearson
      value: 86.52043275063926
    - type: manhattan_spearman
      value: 87.43869426658489
  - task:
      type: STS
    dataset:
      type: mteb/sts14-sts
      name: MTEB STS14
      config: default
      split: test
      revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
    metrics:
    - type: cos_sim_pearson
      value: 84.37393784507647
    - type: cos_sim_spearman
      value: 81.98702164762233
    - type: euclidean_pearson
      value: 84.22038158338351
    - type: euclidean_spearman
      value: 81.9872746771322
    - type: manhattan_pearson
      value: 84.21915949674062
    - type: manhattan_spearman
      value: 81.97923386273747
  - task:
      type: STS
    dataset:
      type: mteb/sts15-sts
      name: MTEB STS15
      config: default
      split: test
      revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
    metrics:
    - type: cos_sim_pearson
      value: 87.34477744314285
    - type: cos_sim_spearman
      value: 88.92669309789463
    - type: euclidean_pearson
      value: 88.20128441166663
    - type: euclidean_spearman
      value: 88.91524205114627
    - type: manhattan_pearson
      value: 88.24425729639415
    - type: manhattan_spearman
      value: 88.97457451709523
  - task:
      type: STS
    dataset:
      type: mteb/sts16-sts
      name: MTEB STS16
      config: default
      split: test
      revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
    metrics:
    - type: cos_sim_pearson
      value: 82.11827015492467
    - type: cos_sim_spearman
      value: 83.59397157586835
    - type: euclidean_pearson
      value: 82.97284591328044
    - type: euclidean_spearman
      value: 83.74509747941255
    - type: manhattan_pearson
      value: 82.974440264842
    - type: manhattan_spearman
      value: 83.72260506292083
  - task:
      type: STS
    dataset:
      type: mteb/sts17-crosslingual-sts
      name: MTEB STS17 (en-en)
      config: en-en
      split: test
      revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
    metrics:
    - type: cos_sim_pearson
      value: 88.29744487677577
    - type: cos_sim_spearman
      value: 88.50799779856109
    - type: euclidean_pearson
      value: 89.0149154609955
    - type: euclidean_spearman
      value: 88.72798794474068
    - type: manhattan_pearson
      value: 89.14318227078863
    - type: manhattan_spearman
      value: 88.98372697017017
  - task:
      type: STS
    dataset:
      type: mteb/sts22-crosslingual-sts
      name: MTEB STS22 (en)
      config: en
      split: test
      revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
    metrics:
    - type: cos_sim_pearson
      value: 70.114540107077
    - type: cos_sim_spearman
      value: 69.72244488054433
    - type: euclidean_pearson
      value: 70.03658853094686
    - type: euclidean_spearman
      value: 68.96035610557085
    - type: manhattan_pearson
      value: 69.83707789686764
    - type: manhattan_spearman
      value: 68.71831797289812
  - task:
      type: STS
    dataset:
      type: mteb/stsbenchmark-sts
      name: MTEB STSBenchmark
      config: default
      split: test
      revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
    metrics:
    - type: cos_sim_pearson
      value: 84.86664469775837
    - type: cos_sim_spearman
      value: 85.39649452953681
    - type: euclidean_pearson
      value: 85.68509956626748
    - type: euclidean_spearman
      value: 85.50984027606854
    - type: manhattan_pearson
      value: 85.6688745008871
    - type: manhattan_spearman
      value: 85.465201888803
  - task:
      type: PairClassification
    dataset:
      type: mteb/sprintduplicatequestions-pairclassification
      name: MTEB SprintDuplicateQuestions
      config: default
      split: test
      revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
    metrics:
    - type: cos_sim_accuracy
      value: 99.8079207920792
    - type: cos_sim_ap
      value: 95.62897445718106
    - type: cos_sim_f1
      value: 90.03083247687564
    - type: cos_sim_precision
      value: 92.60042283298098
    - type: cos_sim_recall
      value: 87.6
    - type: dot_accuracy
      value: 99.67029702970297
    - type: dot_ap
      value: 90.20258347721159
    - type: dot_f1
      value: 83.06172839506172
    - type: dot_precision
      value: 82.04878048780488
    - type: dot_recall
      value: 84.1
    - type: euclidean_accuracy
      value: 99.80594059405941
    - type: euclidean_ap
      value: 95.53963697283662
    - type: euclidean_f1
      value: 89.92405063291139
    - type: euclidean_precision
      value: 91.07692307692308
    - type: euclidean_recall
      value: 88.8
    - type: manhattan_accuracy
      value: 99.80594059405941
    - type: manhattan_ap
      value: 95.55714505339634
    - type: manhattan_f1
      value: 90.06085192697769
    - type: manhattan_precision
      value: 91.35802469135803
    - type: manhattan_recall
      value: 88.8
    - type: max_accuracy
      value: 99.8079207920792
    - type: max_ap
      value: 95.62897445718106
    - type: max_f1
      value: 90.06085192697769
  - task:
      type: PairClassification
    dataset:
      type: mteb/twittersemeval2015-pairclassification
      name: MTEB TwitterSemEval2015
      config: default
      split: test
      revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
    metrics:
    - type: cos_sim_accuracy
      value: 85.87351731537224
    - type: cos_sim_ap
      value: 72.87360532701162
    - type: cos_sim_f1
      value: 67.8826895565093
    - type: cos_sim_precision
      value: 61.918225315354505
    - type: cos_sim_recall
      value: 75.11873350923483
    - type: dot_accuracy
      value: 80.15139774691542
    - type: dot_ap
      value: 53.5201503222712
    - type: dot_f1
      value: 53.42203179614388
    - type: dot_precision
      value: 46.64303996849773
    - type: dot_recall
      value: 62.50659630606861
    - type: euclidean_accuracy
      value: 85.87351731537224
    - type: euclidean_ap
      value: 73.10465263888227
    - type: euclidean_f1
      value: 68.38209376101516
    - type: euclidean_precision
      value: 61.63948316034739
    - type: euclidean_recall
      value: 76.78100263852242
    - type: manhattan_accuracy
      value: 85.83775406806939
    - type: manhattan_ap
      value: 73.08358693248583
    - type: manhattan_f1
      value: 68.34053485927829
    - type: manhattan_precision
      value: 61.303163628745025
    - type: manhattan_recall
      value: 77.20316622691293
    - type: max_accuracy
      value: 85.87351731537224
    - type: max_ap
      value: 73.10465263888227
    - type: max_f1
      value: 68.38209376101516
  - task:
      type: PairClassification
    dataset:
      type: mteb/twitterurlcorpus-pairclassification
      name: MTEB TwitterURLCorpus
      config: default
      split: test
      revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
    metrics:
    - type: cos_sim_accuracy
      value: 88.85202002561415
    - type: cos_sim_ap
      value: 85.58170945333845
    - type: cos_sim_f1
      value: 77.87783280804442
    - type: cos_sim_precision
      value: 75.95140515222482
    - type: cos_sim_recall
      value: 79.90452725592854
    - type: dot_accuracy
      value: 85.29902588582296
    - type: dot_ap
      value: 76.95795800483633
    - type: dot_f1
      value: 71.30231900452489
    - type: dot_precision
      value: 65.91503267973856
    - type: dot_recall
      value: 77.6485987064983
    - type: euclidean_accuracy
      value: 88.80738929638684
    - type: euclidean_ap
      value: 85.5344499509856
    - type: euclidean_f1
      value: 77.9805854353285
    - type: euclidean_precision
      value: 75.97312495435624
    - type: euclidean_recall
      value: 80.09701262704034
    - type: manhattan_accuracy
      value: 88.7782822990647
    - type: manhattan_ap
      value: 85.52577812395661
    - type: manhattan_f1
      value: 77.97958958110746
    - type: manhattan_precision
      value: 74.76510067114094
    - type: manhattan_recall
      value: 81.48290729904527
    - type: max_accuracy
      value: 88.85202002561415
    - type: max_ap
      value: 85.58170945333845
    - type: max_f1
      value: 77.9805854353285
license: mit
language:
- en
---

# gte-large-quant

This is the quantized (INT8) ONNX variant of the [gte-large](https://huggingface.co/thenlper/gte-large) embeddings model. It was created with [DeepSparse Optimum](https://github.com/neuralmagic/optimum-deepsparse) for ONNX export and inference, and with Neural Magic's [Sparsify](https://github.com/neuralmagic/sparsify) for one-shot quantization.

Current list of sparse and quantized gte ONNX models:

| Model | Sparsification Method |
| ----- | --------------------- |
| [zeroshot/gte-large-sparse](https://huggingface.co/zeroshot/gte-large-sparse) | Quantization (INT8) & 50% Pruning |
| [zeroshot/gte-large-quant](https://huggingface.co/zeroshot/gte-large-quant) | Quantization (INT8) |
| [zeroshot/gte-base-sparse](https://huggingface.co/zeroshot/gte-base-sparse) | Quantization (INT8) & 50% Pruning |
| [zeroshot/gte-base-quant](https://huggingface.co/zeroshot/gte-base-quant) | Quantization (INT8) |
| [zeroshot/gte-small-sparse](https://huggingface.co/zeroshot/gte-small-sparse) | Quantization (INT8) & 50% Pruning |
| [zeroshot/gte-small-quant](https://huggingface.co/zeroshot/gte-small-quant) | Quantization (INT8) |

Install DeepSparse with the Sentence Transformers extra:

```bash
pip install -U "deepsparse-nightly[sentence_transformers]"
```

```python
from deepsparse.sentence_transformers import SentenceTransformer

# The repository already ships an ONNX model, so no export is needed.
model = SentenceTransformer('zeroshot/gte-large-quant', export=False)

# Sentences to encode
sentences = [
    'This framework generates embeddings for each input sentence',
    'Sentences are passed as a list of strings.',
    'The quick brown fox jumps over the lazy dog.',
]

# Encode all sentences in one call to model.encode()
embeddings = model.encode(sentences)

# Print each sentence with the shape of its embedding
for sentence, embedding in zip(sentences, embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding.shape)
    print("")
```
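
The benchmark results in this card are largely cosine-similarity based, so a common next step is to compare the embeddings pairwise. The following is a minimal sketch using only the standard library. The helper names `cosine_similarity` and `similarity_matrix` and the toy vectors are illustrative assumptions, not part of DeepSparse. In practice you would pass the rows of `embeddings` from the example above in place of the toy vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities as a list of rows."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical stand-in vectors; replace with the rows of `embeddings`.
toy_vectors = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]

for row in similarity_matrix(toy_vectors):
    print(["%.3f" % score for score in row])
```

Values close to 1 indicate sentences the model considers similar, and values near 0 indicate unrelated ones. The sketch does not guard against all-zero vectors, which would cause a division by zero.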

For further details on the DeepSparse and Sentence Transformers integration, refer to the [DeepSparse README](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/sentence_transformers).

For general questions on these models and sparsification methods, reach out to the engineering team on our [community Slack](https://join.slack.com/t/discuss-neuralmagic/shared_invite/zt-q1a1cnvo-YBoICSIw3L1dmQpjBeDurQ).

![;)](https://media.giphy.com/media/bYg33GbNbNIVzSrr84/giphy-downsized-large.gif)