davidmezzetti committed on
Commit
5a4d956
·
1 Parent(s): bde2a8d

Upload model

1_Dense/config.json ADDED
@@ -0,0 +1 @@
+ {"in_features": 128, "out_features": 128, "bias": false, "activation_function": "torch.nn.modules.linear.Identity"}
1_Dense/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:84ba720a1b4d7ce9fb1aaf6fc056db332fa2ee90456b5d6cdeaf0528d9187283
+ size 65624
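
As a rough sanity check, the pointer's `size` can be compared against the shape in `1_Dense/config.json`. The sketch below assumes the weights are stored as 32-bit floats, which the commit does not state. The remaining bytes would then be the safetensors header (an 8-byte length prefix followed by a JSON header).

```python
# Values taken from the commit: Dense layer shape and LFS pointer size
IN_FEATURES = 128
OUT_FEATURES = 128
FILE_SIZE = 65_624

# Assumption (not stated in the commit): weights are stored as float32
BYTES_PER_VALUE = 4

# A bias-free Dense layer holds a single in_features x out_features matrix
weight_bytes = IN_FEATURES * OUT_FEATURES * BYTES_PER_VALUE

# Whatever is left over would be the safetensors header
overhead_bytes = FILE_SIZE - weight_bytes

print(f"weights: {weight_bytes} bytes, header overhead: {overhead_bytes} bytes")
```

Under that assumption the numbers line up: a single 128 x 128 matrix accounts for nearly the whole file, which matches `"bias": false` in the config.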
README.md ADDED
@@ -0,0 +1,1086 @@
+ ---
2
+ tags:
3
+ - ColBERT
4
+ - PyLate
5
+ - sentence-transformers
6
+ - sentence-similarity
7
+ - feature-extraction
8
+ - generated_from_trainer
9
+ - dataset_size:640000
10
+ - loss:Distillation
11
+ base_model: google/bert_uncased_L-2_H-128_A-2
12
+ datasets:
13
+ - lightonai/ms-marco-en-bge-gemma-unnormalized
14
+ pipeline_tag: sentence-similarity
15
+ library_name: PyLate
16
+ license: apache-2.0
17
+ metrics:
18
+ - MaxSim_accuracy@1
19
+ - MaxSim_accuracy@3
20
+ - MaxSim_accuracy@5
21
+ - MaxSim_accuracy@10
22
+ - MaxSim_precision@1
23
+ - MaxSim_precision@3
24
+ - MaxSim_precision@5
25
+ - MaxSim_precision@10
26
+ - MaxSim_recall@1
27
+ - MaxSim_recall@3
28
+ - MaxSim_recall@5
29
+ - MaxSim_recall@10
30
+ - MaxSim_ndcg@10
31
+ - MaxSim_mrr@10
32
+ - MaxSim_map@100
33
+ model-index:
34
+ - name: ColBERT MUVERA Micro
35
+ results:
36
+ - task:
37
+ type: py-late-information-retrieval
38
+ name: Py Late Information Retrieval
39
+ dataset:
40
+ name: NanoClimateFEVER
41
+ type: NanoClimateFEVER
42
+ metrics:
43
+ - type: MaxSim_accuracy@1
44
+ value: 0.26
45
+ name: Maxsim Accuracy@1
46
+ - type: MaxSim_accuracy@3
47
+ value: 0.36
48
+ name: Maxsim Accuracy@3
49
+ - type: MaxSim_accuracy@5
50
+ value: 0.4
51
+ name: Maxsim Accuracy@5
52
+ - type: MaxSim_accuracy@10
53
+ value: 0.58
54
+ name: Maxsim Accuracy@10
55
+ - type: MaxSim_precision@1
56
+ value: 0.26
57
+ name: Maxsim Precision@1
58
+ - type: MaxSim_precision@3
59
+ value: 0.12666666666666665
60
+ name: Maxsim Precision@3
61
+ - type: MaxSim_precision@5
62
+ value: 0.092
63
+ name: Maxsim Precision@5
64
+ - type: MaxSim_precision@10
65
+ value: 0.07800000000000001
66
+ name: Maxsim Precision@10
67
+ - type: MaxSim_recall@1
68
+ value: 0.11233333333333333
69
+ name: Maxsim Recall@1
70
+ - type: MaxSim_recall@3
71
+ value: 0.16066666666666665
72
+ name: Maxsim Recall@3
73
+ - type: MaxSim_recall@5
74
+ value: 0.184
75
+ name: Maxsim Recall@5
76
+ - type: MaxSim_recall@10
77
+ value: 0.3206666666666667
78
+ name: Maxsim Recall@10
79
+ - type: MaxSim_ndcg@10
80
+ value: 0.24408616743142095
81
+ name: Maxsim Ndcg@10
82
+ - type: MaxSim_mrr@10
83
+ value: 0.33196825396825397
84
+ name: Maxsim Mrr@10
85
+ - type: MaxSim_map@100
86
+ value: 0.18128382432733356
87
+ name: Maxsim Map@100
88
+ - task:
89
+ type: py-late-information-retrieval
90
+ name: Py Late Information Retrieval
91
+ dataset:
92
+ name: NanoDBPedia
93
+ type: NanoDBPedia
94
+ metrics:
95
+ - type: MaxSim_accuracy@1
96
+ value: 0.68
97
+ name: Maxsim Accuracy@1
98
+ - type: MaxSim_accuracy@3
99
+ value: 0.86
100
+ name: Maxsim Accuracy@3
101
+ - type: MaxSim_accuracy@5
102
+ value: 0.92
103
+ name: Maxsim Accuracy@5
104
+ - type: MaxSim_accuracy@10
105
+ value: 0.94
106
+ name: Maxsim Accuracy@10
107
+ - type: MaxSim_precision@1
108
+ value: 0.68
109
+ name: Maxsim Precision@1
110
+ - type: MaxSim_precision@3
111
+ value: 0.6066666666666667
112
+ name: Maxsim Precision@3
113
+ - type: MaxSim_precision@5
114
+ value: 0.56
115
+ name: Maxsim Precision@5
116
+ - type: MaxSim_precision@10
117
+ value: 0.502
118
+ name: Maxsim Precision@10
119
+ - type: MaxSim_recall@1
120
+ value: 0.05322585293904511
121
+ name: Maxsim Recall@1
122
+ - type: MaxSim_recall@3
123
+ value: 0.16789568954347403
124
+ name: Maxsim Recall@3
125
+ - type: MaxSim_recall@5
126
+ value: 0.22988072374930787
127
+ name: Maxsim Recall@5
128
+ - type: MaxSim_recall@10
129
+ value: 0.35043982767195947
130
+ name: Maxsim Recall@10
131
+ - type: MaxSim_ndcg@10
132
+ value: 0.6003406576207015
133
+ name: Maxsim Ndcg@10
134
+ - type: MaxSim_mrr@10
135
+ value: 0.7850000000000001
136
+ name: Maxsim Mrr@10
137
+ - type: MaxSim_map@100
138
+ value: 0.4687280514608297
139
+ name: Maxsim Map@100
140
+ - task:
141
+ type: py-late-information-retrieval
142
+ name: Py Late Information Retrieval
143
+ dataset:
144
+ name: NanoFEVER
145
+ type: NanoFEVER
146
+ metrics:
147
+ - type: MaxSim_accuracy@1
148
+ value: 0.72
149
+ name: Maxsim Accuracy@1
150
+ - type: MaxSim_accuracy@3
151
+ value: 0.78
152
+ name: Maxsim Accuracy@3
153
+ - type: MaxSim_accuracy@5
154
+ value: 0.84
155
+ name: Maxsim Accuracy@5
156
+ - type: MaxSim_accuracy@10
157
+ value: 0.9
158
+ name: Maxsim Accuracy@10
159
+ - type: MaxSim_precision@1
160
+ value: 0.72
161
+ name: Maxsim Precision@1
162
+ - type: MaxSim_precision@3
163
+ value: 0.2733333333333333
164
+ name: Maxsim Precision@3
165
+ - type: MaxSim_precision@5
166
+ value: 0.18
167
+ name: Maxsim Precision@5
168
+ - type: MaxSim_precision@10
169
+ value: 0.1
170
+ name: Maxsim Precision@10
171
+ - type: MaxSim_recall@1
172
+ value: 0.6866666666666668
173
+ name: Maxsim Recall@1
174
+ - type: MaxSim_recall@3
175
+ value: 0.7633333333333333
176
+ name: Maxsim Recall@3
177
+ - type: MaxSim_recall@5
178
+ value: 0.82
179
+ name: Maxsim Recall@5
180
+ - type: MaxSim_recall@10
181
+ value: 0.89
182
+ name: Maxsim Recall@10
183
+ - type: MaxSim_ndcg@10
184
+ value: 0.7955242043086649
185
+ name: Maxsim Ndcg@10
186
+ - type: MaxSim_mrr@10
187
+ value: 0.7731666666666667
188
+ name: Maxsim Mrr@10
189
+ - type: MaxSim_map@100
190
+ value: 0.7676133768765347
191
+ name: Maxsim Map@100
192
+ - task:
193
+ type: py-late-information-retrieval
194
+ name: Py Late Information Retrieval
195
+ dataset:
196
+ name: NanoFiQA2018
197
+ type: NanoFiQA2018
198
+ metrics:
199
+ - type: MaxSim_accuracy@1
200
+ value: 0.3
201
+ name: Maxsim Accuracy@1
202
+ - type: MaxSim_accuracy@3
203
+ value: 0.54
204
+ name: Maxsim Accuracy@3
205
+ - type: MaxSim_accuracy@5
206
+ value: 0.58
207
+ name: Maxsim Accuracy@5
208
+ - type: MaxSim_accuracy@10
209
+ value: 0.66
210
+ name: Maxsim Accuracy@10
211
+ - type: MaxSim_precision@1
212
+ value: 0.3
213
+ name: Maxsim Precision@1
214
+ - type: MaxSim_precision@3
215
+ value: 0.2333333333333333
216
+ name: Maxsim Precision@3
217
+ - type: MaxSim_precision@5
218
+ value: 0.17200000000000004
219
+ name: Maxsim Precision@5
220
+ - type: MaxSim_precision@10
221
+ value: 0.10800000000000001
222
+ name: Maxsim Precision@10
223
+ - type: MaxSim_recall@1
224
+ value: 0.1770793650793651
225
+ name: Maxsim Recall@1
226
+ - type: MaxSim_recall@3
227
+ value: 0.3453492063492064
228
+ name: Maxsim Recall@3
229
+ - type: MaxSim_recall@5
230
+ value: 0.4009047619047619
231
+ name: Maxsim Recall@5
232
+ - type: MaxSim_recall@10
233
+ value: 0.4740952380952381
234
+ name: Maxsim Recall@10
235
+ - type: MaxSim_ndcg@10
236
+ value: 0.38709436118795515
237
+ name: Maxsim Ndcg@10
238
+ - type: MaxSim_mrr@10
239
+ value: 0.4288015873015872
240
+ name: Maxsim Mrr@10
241
+ - type: MaxSim_map@100
242
+ value: 0.3297000135708943
243
+ name: Maxsim Map@100
244
+ - task:
245
+ type: py-late-information-retrieval
246
+ name: Py Late Information Retrieval
247
+ dataset:
248
+ name: NanoHotpotQA
249
+ type: NanoHotpotQA
250
+ metrics:
251
+ - type: MaxSim_accuracy@1
252
+ value: 0.94
253
+ name: Maxsim Accuracy@1
254
+ - type: MaxSim_accuracy@3
255
+ value: 0.94
256
+ name: Maxsim Accuracy@3
257
+ - type: MaxSim_accuracy@5
258
+ value: 0.98
259
+ name: Maxsim Accuracy@5
260
+ - type: MaxSim_accuracy@10
261
+ value: 1.0
262
+ name: Maxsim Accuracy@10
263
+ - type: MaxSim_precision@1
264
+ value: 0.94
265
+ name: Maxsim Precision@1
266
+ - type: MaxSim_precision@3
267
+ value: 0.5
268
+ name: Maxsim Precision@3
269
+ - type: MaxSim_precision@5
270
+ value: 0.31200000000000006
271
+ name: Maxsim Precision@5
272
+ - type: MaxSim_precision@10
273
+ value: 0.16599999999999995
274
+ name: Maxsim Precision@10
275
+ - type: MaxSim_recall@1
276
+ value: 0.47
277
+ name: Maxsim Recall@1
278
+ - type: MaxSim_recall@3
279
+ value: 0.75
280
+ name: Maxsim Recall@3
281
+ - type: MaxSim_recall@5
282
+ value: 0.78
283
+ name: Maxsim Recall@5
284
+ - type: MaxSim_recall@10
285
+ value: 0.83
286
+ name: Maxsim Recall@10
287
+ - type: MaxSim_ndcg@10
288
+ value: 0.8179728241272247
289
+ name: Maxsim Ndcg@10
290
+ - type: MaxSim_mrr@10
291
+ value: 0.9512222222222222
292
+ name: Maxsim Mrr@10
293
+ - type: MaxSim_map@100
294
+ value: 0.7611883462001594
295
+ name: Maxsim Map@100
296
+ - task:
297
+ type: py-late-information-retrieval
298
+ name: Py Late Information Retrieval
299
+ dataset:
300
+ name: NanoMSMARCO
301
+ type: NanoMSMARCO
302
+ metrics:
303
+ - type: MaxSim_accuracy@1
304
+ value: 0.42
305
+ name: Maxsim Accuracy@1
306
+ - type: MaxSim_accuracy@3
307
+ value: 0.66
308
+ name: Maxsim Accuracy@3
309
+ - type: MaxSim_accuracy@5
310
+ value: 0.68
311
+ name: Maxsim Accuracy@5
312
+ - type: MaxSim_accuracy@10
313
+ value: 0.78
314
+ name: Maxsim Accuracy@10
315
+ - type: MaxSim_precision@1
316
+ value: 0.42
317
+ name: Maxsim Precision@1
318
+ - type: MaxSim_precision@3
319
+ value: 0.22
320
+ name: Maxsim Precision@3
321
+ - type: MaxSim_precision@5
322
+ value: 0.136
323
+ name: Maxsim Precision@5
324
+ - type: MaxSim_precision@10
325
+ value: 0.07800000000000001
326
+ name: Maxsim Precision@10
327
+ - type: MaxSim_recall@1
328
+ value: 0.42
329
+ name: Maxsim Recall@1
330
+ - type: MaxSim_recall@3
331
+ value: 0.66
332
+ name: Maxsim Recall@3
333
+ - type: MaxSim_recall@5
334
+ value: 0.68
335
+ name: Maxsim Recall@5
336
+ - type: MaxSim_recall@10
337
+ value: 0.78
338
+ name: Maxsim Recall@10
339
+ - type: MaxSim_ndcg@10
340
+ value: 0.5976880189340548
341
+ name: Maxsim Ndcg@10
342
+ - type: MaxSim_mrr@10
343
+ value: 0.5393809523809523
344
+ name: Maxsim Mrr@10
345
+ - type: MaxSim_map@100
346
+ value: 0.5531015913611822
347
+ name: Maxsim Map@100
348
+ - task:
349
+ type: py-late-information-retrieval
350
+ name: Py Late Information Retrieval
351
+ dataset:
352
+ name: NanoNFCorpus
353
+ type: NanoNFCorpus
354
+ metrics:
355
+ - type: MaxSim_accuracy@1
356
+ value: 0.46
357
+ name: Maxsim Accuracy@1
358
+ - type: MaxSim_accuracy@3
359
+ value: 0.58
360
+ name: Maxsim Accuracy@3
361
+ - type: MaxSim_accuracy@5
362
+ value: 0.62
363
+ name: Maxsim Accuracy@5
364
+ - type: MaxSim_accuracy@10
365
+ value: 0.68
366
+ name: Maxsim Accuracy@10
367
+ - type: MaxSim_precision@1
368
+ value: 0.46
369
+ name: Maxsim Precision@1
370
+ - type: MaxSim_precision@3
371
+ value: 0.38
372
+ name: Maxsim Precision@3
373
+ - type: MaxSim_precision@5
374
+ value: 0.324
375
+ name: Maxsim Precision@5
376
+ - type: MaxSim_precision@10
377
+ value: 0.272
378
+ name: Maxsim Precision@10
379
+ - type: MaxSim_recall@1
380
+ value: 0.04276439372638386
381
+ name: Maxsim Recall@1
382
+ - type: MaxSim_recall@3
383
+ value: 0.07977851865112022
384
+ name: Maxsim Recall@3
385
+ - type: MaxSim_recall@5
386
+ value: 0.11439841040272719
387
+ name: Maxsim Recall@5
388
+ - type: MaxSim_recall@10
389
+ value: 0.1391695106171535
390
+ name: Maxsim Recall@10
391
+ - type: MaxSim_ndcg@10
392
+ value: 0.34241148621124995
393
+ name: Maxsim Ndcg@10
394
+ - type: MaxSim_mrr@10
395
+ value: 0.5320000000000001
396
+ name: Maxsim Mrr@10
397
+ - type: MaxSim_map@100
398
+ value: 0.14897381866568696
399
+ name: Maxsim Map@100
400
+ - task:
401
+ type: py-late-information-retrieval
402
+ name: Py Late Information Retrieval
403
+ dataset:
404
+ name: NanoNQ
405
+ type: NanoNQ
406
+ metrics:
407
+ - type: MaxSim_accuracy@1
408
+ value: 0.42
409
+ name: Maxsim Accuracy@1
410
+ - type: MaxSim_accuracy@3
411
+ value: 0.68
412
+ name: Maxsim Accuracy@3
413
+ - type: MaxSim_accuracy@5
414
+ value: 0.74
415
+ name: Maxsim Accuracy@5
416
+ - type: MaxSim_accuracy@10
417
+ value: 0.84
418
+ name: Maxsim Accuracy@10
419
+ - type: MaxSim_precision@1
420
+ value: 0.42
421
+ name: Maxsim Precision@1
422
+ - type: MaxSim_precision@3
423
+ value: 0.23333333333333328
424
+ name: Maxsim Precision@3
425
+ - type: MaxSim_precision@5
426
+ value: 0.15200000000000002
427
+ name: Maxsim Precision@5
428
+ - type: MaxSim_precision@10
429
+ value: 0.086
430
+ name: Maxsim Precision@10
431
+ - type: MaxSim_recall@1
432
+ value: 0.4
433
+ name: Maxsim Recall@1
434
+ - type: MaxSim_recall@3
435
+ value: 0.66
436
+ name: Maxsim Recall@3
437
+ - type: MaxSim_recall@5
438
+ value: 0.72
439
+ name: Maxsim Recall@5
440
+ - type: MaxSim_recall@10
441
+ value: 0.79
442
+ name: Maxsim Recall@10
443
+ - type: MaxSim_ndcg@10
444
+ value: 0.6184738987111722
445
+ name: Maxsim Ndcg@10
446
+ - type: MaxSim_mrr@10
447
+ value: 0.5763888888888888
448
+ name: Maxsim Mrr@10
449
+ - type: MaxSim_map@100
450
+ value: 0.5642312927870203
451
+ name: Maxsim Map@100
452
+ - task:
453
+ type: py-late-information-retrieval
454
+ name: Py Late Information Retrieval
455
+ dataset:
456
+ name: NanoQuoraRetrieval
457
+ type: NanoQuoraRetrieval
458
+ metrics:
459
+ - type: MaxSim_accuracy@1
460
+ value: 0.8
461
+ name: Maxsim Accuracy@1
462
+ - type: MaxSim_accuracy@3
463
+ value: 0.92
464
+ name: Maxsim Accuracy@3
465
+ - type: MaxSim_accuracy@5
466
+ value: 0.94
467
+ name: Maxsim Accuracy@5
468
+ - type: MaxSim_accuracy@10
469
+ value: 0.96
470
+ name: Maxsim Accuracy@10
471
+ - type: MaxSim_precision@1
472
+ value: 0.8
473
+ name: Maxsim Precision@1
474
+ - type: MaxSim_precision@3
475
+ value: 0.3399999999999999
476
+ name: Maxsim Precision@3
477
+ - type: MaxSim_precision@5
478
+ value: 0.22399999999999998
479
+ name: Maxsim Precision@5
480
+ - type: MaxSim_precision@10
481
+ value: 0.11999999999999998
482
+ name: Maxsim Precision@10
483
+ - type: MaxSim_recall@1
484
+ value: 0.7239999999999999
485
+ name: Maxsim Recall@1
486
+ - type: MaxSim_recall@3
487
+ value: 0.8473333333333334
488
+ name: Maxsim Recall@3
489
+ - type: MaxSim_recall@5
490
+ value: 0.9006666666666666
491
+ name: Maxsim Recall@5
492
+ - type: MaxSim_recall@10
493
+ value: 0.9373333333333334
494
+ name: Maxsim Recall@10
495
+ - type: MaxSim_ndcg@10
496
+ value: 0.863105292852843
497
+ name: Maxsim Ndcg@10
498
+ - type: MaxSim_mrr@10
499
+ value: 0.8611904761904764
500
+ name: Maxsim Mrr@10
501
+ - type: MaxSim_map@100
502
+ value: 0.8312823701317842
503
+ name: Maxsim Map@100
504
+ - task:
505
+ type: py-late-information-retrieval
506
+ name: Py Late Information Retrieval
507
+ dataset:
508
+ name: NanoSCIDOCS
509
+ type: NanoSCIDOCS
510
+ metrics:
511
+ - type: MaxSim_accuracy@1
512
+ value: 0.42
513
+ name: Maxsim Accuracy@1
514
+ - type: MaxSim_accuracy@3
515
+ value: 0.58
516
+ name: Maxsim Accuracy@3
517
+ - type: MaxSim_accuracy@5
518
+ value: 0.64
519
+ name: Maxsim Accuracy@5
520
+ - type: MaxSim_accuracy@10
521
+ value: 0.7
522
+ name: Maxsim Accuracy@10
523
+ - type: MaxSim_precision@1
524
+ value: 0.42
525
+ name: Maxsim Precision@1
526
+ - type: MaxSim_precision@3
527
+ value: 0.2866666666666667
528
+ name: Maxsim Precision@3
529
+ - type: MaxSim_precision@5
530
+ value: 0.20799999999999996
531
+ name: Maxsim Precision@5
532
+ - type: MaxSim_precision@10
533
+ value: 0.138
534
+ name: Maxsim Precision@10
535
+ - type: MaxSim_recall@1
536
+ value: 0.085
537
+ name: Maxsim Recall@1
538
+ - type: MaxSim_recall@3
539
+ value: 0.17666666666666664
540
+ name: Maxsim Recall@3
541
+ - type: MaxSim_recall@5
542
+ value: 0.21366666666666667
543
+ name: Maxsim Recall@5
544
+ - type: MaxSim_recall@10
545
+ value: 0.2826666666666667
546
+ name: Maxsim Recall@10
547
+ - type: MaxSim_ndcg@10
548
+ value: 0.2889801789850345
549
+ name: Maxsim Ndcg@10
550
+ - type: MaxSim_mrr@10
551
+ value: 0.5005
552
+ name: Maxsim Mrr@10
553
+ - type: MaxSim_map@100
554
+ value: 0.21685607444339383
555
+ name: Maxsim Map@100
556
+ - task:
557
+ type: py-late-information-retrieval
558
+ name: Py Late Information Retrieval
559
+ dataset:
560
+ name: NanoArguAna
561
+ type: NanoArguAna
562
+ metrics:
563
+ - type: MaxSim_accuracy@1
564
+ value: 0.2
565
+ name: Maxsim Accuracy@1
566
+ - type: MaxSim_accuracy@3
567
+ value: 0.44
568
+ name: Maxsim Accuracy@3
569
+ - type: MaxSim_accuracy@5
570
+ value: 0.5
571
+ name: Maxsim Accuracy@5
572
+ - type: MaxSim_accuracy@10
573
+ value: 0.64
574
+ name: Maxsim Accuracy@10
575
+ - type: MaxSim_precision@1
576
+ value: 0.2
577
+ name: Maxsim Precision@1
578
+ - type: MaxSim_precision@3
579
+ value: 0.14666666666666664
580
+ name: Maxsim Precision@3
581
+ - type: MaxSim_precision@5
582
+ value: 0.1
583
+ name: Maxsim Precision@5
584
+ - type: MaxSim_precision@10
585
+ value: 0.064
586
+ name: Maxsim Precision@10
587
+ - type: MaxSim_recall@1
588
+ value: 0.2
589
+ name: Maxsim Recall@1
590
+ - type: MaxSim_recall@3
591
+ value: 0.44
592
+ name: Maxsim Recall@3
593
+ - type: MaxSim_recall@5
594
+ value: 0.5
595
+ name: Maxsim Recall@5
596
+ - type: MaxSim_recall@10
597
+ value: 0.64
598
+ name: Maxsim Recall@10
599
+ - type: MaxSim_ndcg@10
600
+ value: 0.4151392430544827
601
+ name: Maxsim Ndcg@10
602
+ - type: MaxSim_mrr@10
603
+ value: 0.3440555555555555
604
+ name: Maxsim Mrr@10
605
+ - type: MaxSim_map@100
606
+ value: 0.3521906424035335
607
+ name: Maxsim Map@100
608
+ - task:
609
+ type: py-late-information-retrieval
610
+ name: Py Late Information Retrieval
611
+ dataset:
612
+ name: NanoSciFact
613
+ type: NanoSciFact
614
+ metrics:
615
+ - type: MaxSim_accuracy@1
616
+ value: 0.58
617
+ name: Maxsim Accuracy@1
618
+ - type: MaxSim_accuracy@3
619
+ value: 0.76
620
+ name: Maxsim Accuracy@3
621
+ - type: MaxSim_accuracy@5
622
+ value: 0.82
623
+ name: Maxsim Accuracy@5
624
+ - type: MaxSim_accuracy@10
625
+ value: 0.86
626
+ name: Maxsim Accuracy@10
627
+ - type: MaxSim_precision@1
628
+ value: 0.58
629
+ name: Maxsim Precision@1
630
+ - type: MaxSim_precision@3
631
+ value: 0.2733333333333333
632
+ name: Maxsim Precision@3
633
+ - type: MaxSim_precision@5
634
+ value: 0.18
635
+ name: Maxsim Precision@5
636
+ - type: MaxSim_precision@10
637
+ value: 0.09399999999999999
638
+ name: Maxsim Precision@10
639
+ - type: MaxSim_recall@1
640
+ value: 0.555
641
+ name: Maxsim Recall@1
642
+ - type: MaxSim_recall@3
643
+ value: 0.735
644
+ name: Maxsim Recall@3
645
+ - type: MaxSim_recall@5
646
+ value: 0.8
647
+ name: Maxsim Recall@5
648
+ - type: MaxSim_recall@10
649
+ value: 0.84
650
+ name: Maxsim Recall@10
651
+ - type: MaxSim_ndcg@10
652
+ value: 0.7153590631749926
653
+ name: Maxsim Ndcg@10
654
+ - type: MaxSim_mrr@10
655
+ value: 0.6798333333333333
656
+ name: Maxsim Mrr@10
657
+ - type: MaxSim_map@100
658
+ value: 0.6760413640032285
659
+ name: Maxsim Map@100
660
+ - task:
661
+ type: py-late-information-retrieval
662
+ name: Py Late Information Retrieval
663
+ dataset:
664
+ name: NanoTouche2020
665
+ type: NanoTouche2020
666
+ metrics:
667
+ - type: MaxSim_accuracy@1
668
+ value: 0.7551020408163265
669
+ name: Maxsim Accuracy@1
670
+ - type: MaxSim_accuracy@3
671
+ value: 1.0
672
+ name: Maxsim Accuracy@3
673
+ - type: MaxSim_accuracy@5
674
+ value: 1.0
675
+ name: Maxsim Accuracy@5
676
+ - type: MaxSim_accuracy@10
677
+ value: 1.0
678
+ name: Maxsim Accuracy@10
679
+ - type: MaxSim_precision@1
680
+ value: 0.7551020408163265
681
+ name: Maxsim Precision@1
682
+ - type: MaxSim_precision@3
683
+ value: 0.6734693877551019
684
+ name: Maxsim Precision@3
685
+ - type: MaxSim_precision@5
686
+ value: 0.6000000000000001
687
+ name: Maxsim Precision@5
688
+ - type: MaxSim_precision@10
689
+ value: 0.5285714285714286
690
+ name: Maxsim Precision@10
691
+ - type: MaxSim_recall@1
692
+ value: 0.050375728116040484
693
+ name: Maxsim Recall@1
694
+ - type: MaxSim_recall@3
695
+ value: 0.13379303377518686
696
+ name: Maxsim Recall@3
697
+ - type: MaxSim_recall@5
698
+ value: 0.19744749683082305
699
+ name: Maxsim Recall@5
700
+ - type: MaxSim_recall@10
701
+ value: 0.3328396127707909
702
+ name: Maxsim Recall@10
703
+ - type: MaxSim_ndcg@10
704
+ value: 0.5927407647152685
705
+ name: Maxsim Ndcg@10
706
+ - type: MaxSim_mrr@10
707
+ value: 0.8639455782312924
708
+ name: Maxsim Mrr@10
709
+ - type: MaxSim_map@100
710
+ value: 0.4115661843314275
711
+ name: Maxsim Map@100
712
+ - task:
713
+ type: nano-beir
714
+ name: Nano BEIR
715
+ dataset:
716
+ name: NanoBEIR mean
717
+ type: NanoBEIR_mean
718
+ metrics:
719
+ - type: MaxSim_accuracy@1
720
+ value: 0.5350078492935635
721
+ name: Maxsim Accuracy@1
722
+ - type: MaxSim_accuracy@3
723
+ value: 0.7000000000000001
724
+ name: Maxsim Accuracy@3
725
+ - type: MaxSim_accuracy@5
726
+ value: 0.743076923076923
727
+ name: Maxsim Accuracy@5
728
+ - type: MaxSim_accuracy@10
729
+ value: 0.8107692307692307
730
+ name: Maxsim Accuracy@10
731
+ - type: MaxSim_precision@1
732
+ value: 0.5350078492935635
733
+ name: Maxsim Precision@1
734
+ - type: MaxSim_precision@3
735
+ value: 0.33026687598116167
736
+ name: Maxsim Precision@3
737
+ - type: MaxSim_precision@5
738
+ value: 0.24923076923076928
739
+ name: Maxsim Precision@5
740
+ - type: MaxSim_precision@10
741
+ value: 0.1795824175824176
742
+ name: Maxsim Precision@10
743
+ - type: MaxSim_recall@1
744
+ value: 0.3058804107585258
745
+ name: Maxsim Recall@1
746
+ - type: MaxSim_recall@3
747
+ value: 0.45537049602453755
748
+ name: Maxsim Recall@3
749
+ - type: MaxSim_recall@5
750
+ value: 0.5031511327862271
751
+ name: Maxsim Recall@5
752
+ - type: MaxSim_recall@10
753
+ value: 0.5851700658324468
754
+ name: Maxsim Recall@10
755
+ - type: MaxSim_ndcg@10
756
+ value: 0.5599166277934665
757
+ name: Maxsim Ndcg@10
758
+ - type: MaxSim_mrr@10
759
+ value: 0.6282656549799407
760
+ name: Maxsim Mrr@10
761
+ - type: MaxSim_map@100
762
+ value: 0.4817505346586929
763
+ name: Maxsim Map@100
764
+ ---
765
+
766
+ # ColBERT MUVERA Micro
767
+
768
+ This is a [PyLate](https://github.com/lightonai/pylate) model fine-tuned from [google/bert_uncased_L-2_H-128_A-2](https://huggingface.co/google/bert_uncased_L-2_H-128_A-2) on the [ms-marco-en-bge-gemma-unnormalized](https://huggingface.co/datasets/lightonai/ms-marco-en-bge-gemma-unnormalized) dataset. It maps sentences and paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.
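
For readers new to late interaction, here is a minimal pure-Python sketch of the MaxSim operator. The function name `maxsim` and the toy 2-dimensional vectors are illustrative assumptions, not code from PyLate or txtai. Each query token vector is matched to its best-scoring document token vector, and those maxima are summed.

```python
def maxsim(query_vectors, document_vectors):
    """Sum, over query tokens, of the best dot product against any document token."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return sum(
        max(dot(q, d) for d in document_vectors)
        for q in query_vectors
    )

# Toy 2-dimensional token vectors (the real model emits 128 dimensions per token)
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
doc_b = [[0.1, 0.1], [0.3, 0.2]]

score_a = maxsim(query, doc_a)  # 0.9 + 0.8
score_b = maxsim(query, doc_b)  # 0.3 + 0.2

print(score_a, score_b)
```

Because every query token picks its own best match, a document scores well only when it covers all parts of the query, which is the behavior a single pooled vector cannot express.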
769
+
770
+ This model is trained with un-normalized scores, making it compatible with [MUVERA fixed-dimensional encoding](https://arxiv.org/abs/2405.19504).
771
+
772
+ ## Usage (txtai)
773
+
774
+ This model can be used to build embeddings databases with [txtai](https://github.com/neuml/txtai) for semantic search and/or as a knowledge source for retrieval augmented generation (RAG).
775
+
776
+ _Note: txtai 9.0+ is required for late interaction model support_
777
+
778
+ ```python
+ import txtai
+
+ # Placeholder corpus (illustrative only); replace with your own documents
+ documents = ["first example document", "second example document"]
+
+ embeddings = txtai.Embeddings(
+     sparse="neuml/colbert-muvera-micro",
+     content=True
+ )
+ embeddings.index(documents)
+
+ # Run a query
+ embeddings.search("query to run")
+ ```
790
+
791
+ Late interaction models excel as reranker pipelines.
792
+
793
+ ```python
+ from txtai.pipeline import Reranker, Similarity
+
+ # `embeddings` is the txtai Embeddings instance built in the previous example
+ similarity = Similarity(path="neuml/colbert-muvera-micro", lateencode=True)
+ ranker = Reranker(embeddings, similarity)
+ ranker("query to run")
+ ```
800
+
801
+ ## Usage (PyLate)
802
+
803
+ Alternatively, the model can be loaded with [PyLate](https://github.com/lightonai/pylate).
804
+
805
+ ```python
+ from pylate import rank, models
+
+ queries = [
+     "query A",
+     "query B",
+ ]
+
+ # One list of candidate documents per query
+ documents = [
+     ["document A", "document B"],
+     ["document 1", "document C", "document B"],
+ ]
+
+ # IDs aligned position-by-position with `documents`
+ documents_ids = [
+     [1, 2],
+     [1, 3, 2],
+ ]
+
+ model = models.ColBERT(
+     model_name_or_path="neuml/colbert-muvera-micro",
+ )
+
+ # Encode queries and documents into per-token embeddings
+ queries_embeddings = model.encode(
+     queries,
+     is_query=True,
+ )
+
+ documents_embeddings = model.encode(
+     documents,
+     is_query=False,
+ )
+
+ # Rerank each query's candidate documents
+ reranked_documents = rank.rerank(
+     documents_ids=documents_ids,
+     queries_embeddings=queries_embeddings,
+     documents_embeddings=documents_embeddings,
+ )
+ ```
843
+
844
+ ### Full Model Architecture
845
+
846
+ ```
+ ColBERT(
+   (0): Transformer({'max_seq_length': 299, 'do_lower_case': False}) with Transformer model: BertModel
+   (1): Dense({'in_features': 128, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
+ )
+ ```
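
To make the `Dense` stage concrete, the following is a minimal pure-Python sketch of a bias-free linear projection with an identity activation, applied to one token vector. The helper name `dense_project` and the random weights are illustrative assumptions. The real layer loads its trained weights from `1_Dense/model.safetensors`.

```python
import random

IN_FEATURES = 128   # from 1_Dense/config.json
OUT_FEATURES = 128  # from 1_Dense/config.json

def dense_project(token_vector, weight):
    """Bias-free linear map with identity activation: y = W x."""
    return [sum(w * x for w, x in zip(row, token_vector)) for row in weight]

# Illustrative random weights and input, NOT the trained parameters
random.seed(0)
weight = [
    [random.gauss(0.0, 0.02) for _ in range(IN_FEATURES)]
    for _ in range(OUT_FEATURES)
]
token = [random.gauss(0.0, 1.0) for _ in range(IN_FEATURES)]

projected = dense_project(token, weight)
print(len(projected))
```

Since there is no bias and no non-linearity, the layer is a plain linear map: a zero vector always projects to a zero vector, and the output keeps the 128-dimensional size of the transformer's hidden state.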
852
+
853
+ ## Evaluation
854
+
855
+ ### BEIR Subset
856
+
857
+ The following table shows a subset of BEIR scored with the [txtai benchmarks script](https://github.com/neuml/txtai/blob/master/examples/benchmarks.py).
858
+
859
+ Reported scores are `ndcg@10`, grouped into the following three categories.
860
+
861
+ #### FULL multi-vector maxsim
862
+
+ | Model | Parameters | ArguAna | NFCorpus | SciFact | Average |
+ |:------------------|:-----------|:---------|:---------|:--------|:--------|
+ | [AnswerAI ColBERT Small v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1) | 33M | 0.4440 | 0.3649 | 0.7423 | 0.5171 |
+ | [ColBERT v2](https://huggingface.co/colbert-ir/colbertv2.0) | 110M | 0.4595 | 0.3165 | 0.6456 | 0.4739 |
+ | [**ColBERT MUVERA Micro**](https://huggingface.co/neuml/colbert-muvera-micro) | **4M** | **0.3947** | **0.3235** | **0.6676** | **0.4619** |
+ | [ColBERT MUVERA Small](https://huggingface.co/neuml/colbert-muvera-small) | 33M | 0.4455 | 0.3502 | 0.7145 | 0.5034 |
+ | [GTE ModernColBERT v1](https://huggingface.co/lightonai/GTE-ModernColBERT-v1) | 149M | 0.4946 | 0.3717 | 0.7529 | 0.5397 |
870
+
871
+ #### MUVERA encoding + maxsim re-ranking of the top 100 results per MUVERA paper
872
+
+ | Model | Parameters | ArguAna | NFCorpus | SciFact | Average |
+ |:------------------|:-----------|:---------|:---------|:--------|:--------|
+ | [AnswerAI ColBERT Small v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1) | 33M | 0.0317 | 0.1135 | 0.0836 | 0.0763 |
+ | [ColBERT v2](https://huggingface.co/colbert-ir/colbertv2.0) | 110M | 0.4562 | 0.3025 | 0.6278 | 0.4622 |
+ | [**ColBERT MUVERA Micro**](https://huggingface.co/neuml/colbert-muvera-micro) | **4M** | **0.3849** | **0.3095** | **0.6464** | **0.4469** |
+ | [ColBERT MUVERA Small](https://huggingface.co/neuml/colbert-muvera-small) | 33M | 0.4451 | 0.3537 | 0.7148 | 0.5045 |
+ | [GTE ModernColBERT v1](https://huggingface.co/lightonai/GTE-ModernColBERT-v1) | 149M | 0.0265 | 0.1052 | 0.0556 | 0.0624 |
880
+
881
+ #### MUVERA encoding only
882
+
+ | Model | Parameters | ArguAna | NFCorpus | SciFact | Average |
+ |:------------------|:-----------|:---------|:---------|:--------|:--------|
+ | [AnswerAI ColBERT Small v1](https://huggingface.co/answerdotai/answerai-colbert-small-v1) | 33M | 0.0024 | 0.0201 | 0.0047 | 0.0091 |
+ | [ColBERT v2](https://huggingface.co/colbert-ir/colbertv2.0) | 110M | 0.3463 | 0.2356 | 0.5002 | 0.3607 |
+ | [**ColBERT MUVERA Micro**](https://huggingface.co/neuml/colbert-muvera-micro) | **4M** | **0.2795** | **0.2348** | **0.4875** | **0.3339** |
+ | [ColBERT MUVERA Small](https://huggingface.co/neuml/colbert-muvera-small) | 33M | 0.3850 | 0.2928 | 0.6357 | 0.4378 |
+ | [GTE ModernColBERT v1](https://huggingface.co/lightonai/GTE-ModernColBERT-v1) | 149M | 0.0003 | 0.0203 | 0.0013 | 0.0073 |
890
+
891
+ _Note: The scores reported don't match scores reported in the respective papers due to different default settings in the txtai benchmark scripts._
892
+
893
+ As noted earlier, models trained with min-max score normalization don't perform well with MUVERA encoding. See this [GitHub Issue](https://github.com/lightonai/pylate/issues/142) for more.
894
+
895
+ **This model is surprisingly competitive with the original ColBERT v2 model (average `ndcg@10` of 0.4619 vs. 0.4739 with full MaxSim) at under 4% of the parameter count (4M vs. 110M).**
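
The comparison can be checked directly from the "FULL multi-vector maxsim" table. The sketch below only copies numbers from that table. The parameter counts are the rounded values shown there (4M and 110M), so the size ratio is approximate.

```python
# ndcg@10 scores and rounded parameter counts copied from the full MaxSim table
table = {
    "ColBERT v2": {
        "params_millions": 110,
        "scores": [0.4595, 0.3165, 0.6456],  # ArguAna, NFCorpus, SciFact
    },
    "ColBERT MUVERA Micro": {
        "params_millions": 4,
        "scores": [0.3947, 0.3235, 0.6676],
    },
}

def average(scores):
    return sum(scores) / len(scores)

avg_v2 = average(table["ColBERT v2"]["scores"])
avg_micro = average(table["ColBERT MUVERA Micro"]["scores"])

# Fraction of ColBERT v2's average score retained, and relative model size
score_ratio = avg_micro / avg_v2
size_ratio = (
    table["ColBERT MUVERA Micro"]["params_millions"]
    / table["ColBERT v2"]["params_millions"]
)

print(f"average ndcg@10: micro={avg_micro:.4f}, v2={avg_v2:.4f}")
print(f"score retained: {score_ratio:.1%}, relative size: {size_ratio:.1%}")
```

On these three datasets the smaller model retains roughly 97% of ColBERT v2's average score with a parameter count that is a few percent of its size.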
896
+
897
+ ### Nano BEIR
898
+ * Dataset: `NanoBEIR_mean`
899
+ * Evaluated with <code>pylate.evaluation.nano_beir_evaluator.NanoBEIREvaluator</code>
900
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | MaxSim_accuracy@1 | 0.535 |
+ | MaxSim_accuracy@3 | 0.7 |
+ | MaxSim_accuracy@5 | 0.7431 |
+ | MaxSim_accuracy@10 | 0.8108 |
+ | MaxSim_precision@1 | 0.535 |
+ | MaxSim_precision@3 | 0.3303 |
+ | MaxSim_precision@5 | 0.2492 |
+ | MaxSim_precision@10 | 0.1796 |
+ | MaxSim_recall@1 | 0.3059 |
+ | MaxSim_recall@3 | 0.4554 |
+ | MaxSim_recall@5 | 0.5032 |
+ | MaxSim_recall@10 | 0.5852 |
+ | **MaxSim_ndcg@10** | **0.5599** |
+ | MaxSim_mrr@10 | 0.6283 |
+ | MaxSim_map@100 | 0.4818 |
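
The `NanoBEIR_mean` row appears to be the unweighted mean of the 13 per-dataset results in the `model-index` metadata. A short check for `MaxSim_ndcg@10`, using only values copied from that metadata:

```python
# MaxSim_ndcg@10 per dataset, copied from the model-index front matter
ndcg_at_10 = {
    "NanoClimateFEVER": 0.24408616743142095,
    "NanoDBPedia": 0.6003406576207015,
    "NanoFEVER": 0.7955242043086649,
    "NanoFiQA2018": 0.38709436118795515,
    "NanoHotpotQA": 0.8179728241272247,
    "NanoMSMARCO": 0.5976880189340548,
    "NanoNFCorpus": 0.34241148621124995,
    "NanoNQ": 0.6184738987111722,
    "NanoQuoraRetrieval": 0.863105292852843,
    "NanoSCIDOCS": 0.2889801789850345,
    "NanoArguAna": 0.4151392430544827,
    "NanoSciFact": 0.7153590631749926,
    "NanoTouche2020": 0.5927407647152685,
}

# Unweighted mean across datasets
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)

print(f"{len(ndcg_at_10)} datasets, mean ndcg@10 = {mean_ndcg:.4f}")
```

The result agrees with the reported `MaxSim_ndcg@10` of 0.5599, so each dataset contributes equally regardless of its number of queries.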
918
+
919
+ ## Training Details
920
+
921
+ ### Training Hyperparameters
922
+
923
+ #### Non-Default Hyperparameters
924
+
925
+ - `eval_strategy`: steps
926
+ - `per_device_train_batch_size`: 32
927
+ - `learning_rate`: 0.0003
928
+ - `num_train_epochs`: 1
929
+ - `warmup_ratio`: 0.05
930
+ - `bf16`: True
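
As an illustration of what these settings imply, the sketch below derives the step counts and the shape of a linear warmup followed by linear decay. It assumes a single training device (not stated in this card) and uses `gradient_accumulation_steps` of 1 from the full list. The function `learning_rate_at` is a hypothetical helper, not the trainer's own scheduler code.

```python
import math

DATASET_SIZE = 640_000   # from the `dataset_size:640000` tag
BATCH_SIZE = 32          # per_device_train_batch_size
EPOCHS = 1               # num_train_epochs
LEARNING_RATE = 3e-4     # learning_rate
WARMUP_RATIO = 0.05      # warmup_ratio
NUM_DEVICES = 1          # assumption: the device count is not stated

total_steps = math.ceil(DATASET_SIZE / (BATCH_SIZE * NUM_DEVICES)) * EPOCHS
warmup_steps = round(total_steps * WARMUP_RATIO)

def learning_rate_at(step):
    """Linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - warmup_steps)

print(total_steps, warmup_steps)
for step in (0, 500, 1_000, 10_000, 20_000):
    print(step, learning_rate_at(step))
```

With more than one device the per-step batch grows and the step counts shrink proportionally, so the numbers above should be read as an upper bound on optimizer steps.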
931
+
932
+ #### All Hyperparameters
933
+ <details><summary>Click to expand</summary>
934
+
935
+ - `overwrite_output_dir`: False
936
+ - `do_predict`: False
937
+ - `eval_strategy`: steps
938
+ - `prediction_loss_only`: True
939
+ - `per_device_train_batch_size`: 32
940
+ - `per_device_eval_batch_size`: 8
941
+ - `per_gpu_train_batch_size`: None
942
+ - `per_gpu_eval_batch_size`: None
943
+ - `gradient_accumulation_steps`: 1
944
+ - `eval_accumulation_steps`: None
945
+ - `torch_empty_cache_steps`: None
946
+ - `learning_rate`: 0.0003
947
+ - `weight_decay`: 0.0
948
+ - `adam_beta1`: 0.9
949
+ - `adam_beta2`: 0.999
950
+ - `adam_epsilon`: 1e-08
951
+ - `max_grad_norm`: 1.0
952
+ - `num_train_epochs`: 1
953
+ - `max_steps`: -1
954
+ - `lr_scheduler_type`: linear
955
+ - `lr_scheduler_kwargs`: {}
956
+ - `warmup_ratio`: 0.05
957
+ - `warmup_steps`: 0
958
+ - `log_level`: passive
959
+ - `log_level_replica`: warning
960
+ - `log_on_each_node`: True
961
+ - `logging_nan_inf_filter`: True
962
+ - `save_safetensors`: True
963
+ - `save_on_each_node`: False
964
+ - `save_only_model`: False
965
+ - `restore_callback_states_from_checkpoint`: False
966
+ - `no_cuda`: False
967
+ - `use_cpu`: False
968
+ - `use_mps_device`: False
969
+ - `seed`: 42
970
+ - `data_seed`: None
971
+ - `jit_mode_eval`: False
972
+ - `use_ipex`: False
973
+ - `bf16`: True
974
+ - `fp16`: False
975
+ - `fp16_opt_level`: O1
976
+ - `half_precision_backend`: auto
977
+ - `bf16_full_eval`: False
978
+ - `fp16_full_eval`: False
979
+ - `tf32`: None
980
+ - `local_rank`: 0
981
+ - `ddp_backend`: None
982
+ - `tpu_num_cores`: None
983
+ - `tpu_metrics_debug`: False
984
+ - `debug`: []
985
+ - `dataloader_drop_last`: False
986
+ - `dataloader_num_workers`: 0
987
+ - `dataloader_prefetch_factor`: None
988
+ - `past_index`: -1
989
+ - `disable_tqdm`: False
990
+ - `remove_unused_columns`: True
991
+ - `label_names`: None
992
+ - `load_best_model_at_end`: False
993
+ - `ignore_data_skip`: False
994
+ - `fsdp`: []
995
+ - `fsdp_min_num_params`: 0
996
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
997
+ - `fsdp_transformer_layer_cls_to_wrap`: None
998
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
999
+ - `deepspeed`: None
1000
+ - `label_smoothing_factor`: 0.0
1001
+ - `optim`: adamw_torch
1002
+ - `optim_args`: None
1003
+ - `adafactor`: False
1004
+ - `group_by_length`: False
1005
+ - `length_column_name`: length
1006
+ - `ddp_find_unused_parameters`: None
1007
+ - `ddp_bucket_cap_mb`: None
1008
+ - `ddp_broadcast_buffers`: False
1009
+ - `dataloader_pin_memory`: True
1010
+ - `dataloader_persistent_workers`: False
1011
+ - `skip_memory_metrics`: True
1012
+ - `use_legacy_prediction_loop`: False
1013
+ - `push_to_hub`: False
1014
+ - `resume_from_checkpoint`: None
1015
+ - `hub_model_id`: None
1016
+ - `hub_strategy`: every_save
1017
+ - `hub_private_repo`: None
1018
+ - `hub_always_push`: False
1019
+ - `gradient_checkpointing`: False
1020
+ - `gradient_checkpointing_kwargs`: None
1021
+ - `include_inputs_for_metrics`: False
1022
+ - `include_for_metrics`: []
1023
+ - `eval_do_concat_batches`: True
1024
+ - `fp16_backend`: auto
1025
+ - `push_to_hub_model_id`: None
1026
+ - `push_to_hub_organization`: None
1027
+ - `mp_parameters`:
1028
+ - `auto_find_batch_size`: False
1029
+ - `full_determinism`: False
1030
+ - `torchdynamo`: None
1031
+ - `ray_scope`: last
1032
+ - `ddp_timeout`: 1800
1033
+ - `torch_compile`: False
1034
+ - `torch_compile_backend`: None
1035
+ - `torch_compile_mode`: None
1036
+ - `include_tokens_per_second`: False
1037
+ - `include_num_input_tokens_seen`: False
1038
+ - `neftune_noise_alpha`: None
1039
+ - `optim_target_modules`: None
1040
+ - `batch_eval_metrics`: False
1041
+ - `eval_on_start`: False
1042
+ - `use_liger_kernel`: False
1043
+ - `eval_use_gather_object`: False
1044
+ - `average_tokens_across_devices`: False
1045
+ - `prompts`: None
1046
+ - `batch_sampler`: batch_sampler
1047
+ - `multi_dataset_batch_sampler`: proportional
1048
+
1049
+ </details>
1050
+
1051
+ ### Framework Versions
1052
+ - Python: 3.10.18
1053
+ - Sentence Transformers: 4.0.2
1054
+ - PyLate: 1.3.0
1055
+ - Transformers: 4.52.3
1056
+ - PyTorch: 2.8.0+cu128
1057
+ - Accelerate: 1.10.1
1058
+ - Datasets: 4.0.0
1059
+ - Tokenizers: 0.21.4
1060
+
1061
+ ## Citation
1062
+
1063
+ ### BibTeX
1064
+
1065
+ #### Sentence Transformers
1066
+ ```bibtex
1067
+ @inproceedings{reimers-2019-sentence-bert,
1068
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
1069
+ author = "Reimers, Nils and Gurevych, Iryna",
1070
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
1071
+ month = "11",
1072
+ year = "2019",
1073
+ publisher = "Association for Computational Linguistics",
1074
+ url = "https://arxiv.org/abs/1908.10084"
1075
+ }
1076
+ ```
1077
+
1078
+ #### PyLate
1079
+ ```bibtex
1080
+ @misc{PyLate,
1081
+ title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
1082
+ author={Chaffin, Antoine and Sourty, Raphaël},
1083
+ url={https://github.com/lightonai/pylate},
1084
+ year={2024}
1085
+ }
1086
+ ```
added_tokens.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "[D] ": 30523,
3
+ "[Q] ": 30522
4
+ }
config.json ADDED
@@ -0,0 +1,24 @@
1
+ {
2
+ "architectures": [
3
+ "BertModel"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "classifier_dropout": null,
7
+ "hidden_act": "gelu",
8
+ "hidden_dropout_prob": 0.1,
9
+ "hidden_size": 128,
10
+ "initializer_range": 0.02,
11
+ "intermediate_size": 512,
12
+ "layer_norm_eps": 1e-12,
13
+ "max_position_embeddings": 512,
14
+ "model_type": "bert",
15
+ "num_attention_heads": 2,
16
+ "num_hidden_layers": 2,
17
+ "pad_token_id": 0,
18
+ "position_embedding_type": "absolute",
19
+ "torch_dtype": "float32",
20
+ "transformers_version": "4.52.3",
21
+ "type_vocab_size": 2,
22
+ "use_cache": true,
23
+ "vocab_size": 30524
24
+ }
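As a sanity check on how small this encoder is, the parameter count can be estimated from the values in `config.json`. The sketch below assumes the standard BERT layout (embeddings with LayerNorm, per-layer attention and feed-forward blocks with biases, and a pooler layer); whether the pooler is actually stored in the checkpoint is an assumption.

```python
# Values copied from config.json above.
config = {
    "hidden_size": 128,
    "intermediate_size": 512,
    "max_position_embeddings": 512,
    "num_hidden_layers": 2,
    "type_vocab_size": 2,
    "vocab_size": 30524,
}


def bert_param_count(cfg):
    """Estimate parameters of a standard BERT encoder (assumes a pooler is present)."""
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    layer_norm = 2 * h  # weight + bias

    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + layer_norm
    attention = 4 * (h * h + h) + layer_norm  # Q, K, V, output projections
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    pooler = h * h + h

    return embeddings + cfg["num_hidden_layers"] * (attention + feed_forward) + pooler


params = bert_param_count(config)
print(params)      # parameter count
print(params * 4)  # bytes at float32
```

Under these assumptions the estimate is 4,386,176 parameters, or 17,544,704 bytes at `float32`. That is within a few kilobytes of the 17,548,936-byte `model.safetensors` listed in this commit; the small remainder is presumably the file header.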
config_sentence_transformers.json ADDED
@@ -0,0 +1,50 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "4.0.2",
4
+ "transformers": "4.52.3",
5
+ "pytorch": "2.8.0+cu128"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "MaxSim",
10
+ "query_prefix": "[Q] ",
11
+ "document_prefix": "[D] ",
12
+ "query_length": 32,
13
+ "document_length": 300,
14
+ "attend_to_expansion_tokens": false,
15
+ "skiplist_words": [
16
+ "!",
17
+ "\"",
18
+ "#",
19
+ "$",
20
+ "%",
21
+ "&",
22
+ "'",
23
+ "(",
24
+ ")",
25
+ "*",
26
+ "+",
27
+ ",",
28
+ "-",
29
+ ".",
30
+ "/",
31
+ ":",
32
+ ";",
33
+ "<",
34
+ "=",
35
+ ">",
36
+ "?",
37
+ "@",
38
+ "[",
39
+ "\\",
40
+ "]",
41
+ "^",
42
+ "_",
43
+ "`",
44
+ "{",
45
+ "|",
46
+ "}",
47
+ "~"
48
+ ],
49
+ "do_query_expansion": true
50
+ }
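The `similarity_fn_name` above is `MaxSim`, the late-interaction score used by ColBERT-style models: each query token embedding is matched with its best-scoring document token embedding, and those maxima are summed. The following is a dependency-free sketch of that idea with toy 2-dimensional vectors (the real model emits 128-dimensional token embeddings); it is not PyLate's implementation.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_vecs, doc_vecs):
    """Sum over query tokens of the best dot product against any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy token embeddings, invented for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
document = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]

print(maxsim(query, document))
```

Here the first query token matches the second document token best (score 1.0) and the second query token matches the first document token best (score 0.5), for a total of 1.5.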
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:be7739300d0ea06e2463db6c50f2e790d25d4f629493fc68dd1e1a2a376f04e8
3
+ size 17548936
modules.json ADDED
@@ -0,0 +1,14 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Dense",
12
+ "type": "pylate.models.Dense.Dense"
13
+ }
14
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 299,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
1
+ {
2
+ "cls_token": "[CLS]",
3
+ "mask_token": "[MASK]",
4
+ "pad_token": "[MASK]",
5
+ "sep_token": "[SEP]",
6
+ "unk_token": "[UNK]"
7
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,74 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ },
43
+ "30522": {
44
+ "content": "[Q] ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "30523": {
52
+ "content": "[D] ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ }
59
+ },
60
+ "clean_up_tokenization_spaces": true,
61
+ "cls_token": "[CLS]",
62
+ "do_basic_tokenize": true,
63
+ "do_lower_case": true,
64
+ "extra_special_tokens": {},
65
+ "mask_token": "[MASK]",
66
+ "model_max_length": 1000000000000000019884624838656,
67
+ "never_split": null,
68
+ "pad_token": "[MASK]",
69
+ "sep_token": "[SEP]",
70
+ "strip_accents": null,
71
+ "tokenize_chinese_chars": true,
72
+ "tokenizer_class": "BertTokenizer",
73
+ "unk_token": "[UNK]"
74
+ }
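The tokenizer config sets `pad_token` to `[MASK]` (id 103), and `config_sentence_transformers.json` enables `do_query_expansion` with a `query_length` of 32. In ColBERT-style models this combination means short queries are padded with `[MASK]` tokens so the model can treat the padding as soft expansion terms. The sketch below illustrates only the padding step; the token order, the word id 2054, and the naive truncation are assumptions made for the example, not PyLate's behavior.

```python
MASK_ID = 103      # [MASK], also configured as pad_token in tokenizer_config.json
QUERY_LENGTH = 32  # query_length from config_sentence_transformers.json


def expand_query(token_ids, length=QUERY_LENGTH, pad_id=MASK_ID):
    """Pad (or naively truncate) query token ids to a fixed length with [MASK]."""
    truncated = token_ids[:length]
    return truncated + [pad_id] * (length - len(truncated))


# Illustrative ids: [CLS]=101, "[Q] "=30522, a hypothetical word id 2054, [SEP]=102.
query_ids = [101, 30522, 2054, 102]
expanded = expand_query(query_ids)

print(len(expanded))
print(expanded[:6])
```

A real implementation would also need to keep `[SEP]` when truncating long queries; this sketch skips that detail.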
vocab.txt ADDED
The diff for this file is too large to render. See raw diff