juanpablomesa committed on
Commit 1e3dede · verified · 1 Parent(s): 3fbb590

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
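
Reviewer note: the pooling configuration above enables only `pooling_mode_mean_tokens`, so each sentence embedding should be the average of the token embeddings at non-padding positions. The sketch below illustrates that idea in plain Python on toy 3-dimensional vectors (the real model uses 768). The function name `mean_pool` and the toy data are hypothetical, and this is not the library's own implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Two real tokens followed by one padding token.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padding vector is excluded from both the sum and the count, which is why the mask matters for batches of unequal length.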
README.md ADDED
@@ -0,0 +1,810 @@
1
+ ---
2
+ base_model: sentence-transformers/all-mpnet-base-v2
3
+ datasets: []
4
+ language:
5
+ - en
6
+ library_name: sentence-transformers
7
+ license: apache-2.0
8
+ metrics:
9
+ - cosine_accuracy@1
10
+ - cosine_accuracy@3
11
+ - cosine_accuracy@5
12
+ - cosine_accuracy@10
13
+ - cosine_precision@1
14
+ - cosine_precision@3
15
+ - cosine_precision@5
16
+ - cosine_precision@10
17
+ - cosine_recall@1
18
+ - cosine_recall@3
19
+ - cosine_recall@5
20
+ - cosine_recall@10
21
+ - cosine_ndcg@10
22
+ - cosine_mrr@10
23
+ - cosine_map@100
24
+ pipeline_tag: sentence-similarity
25
+ tags:
26
+ - sentence-transformers
27
+ - sentence-similarity
28
+ - feature-extraction
29
+ - generated_from_trainer
30
+ - dataset_size:4012
31
+ - loss:MatryoshkaLoss
32
+ - loss:MultipleNegativesRankingLoss
33
+ widget:
34
+ - source_sentence: 'Extensive messenger RNA editing generates transcript and protein
35
+ diversity in genes involved in neural excitability, as previously described, as
36
+ well as in genes participating in a broad range of other cellular functions. '
37
+ sentences:
38
+ - Do cephalopods use RNA editing less frequently than other species?
39
+ - GV1001 vaccine targets which enzyme?
40
+ - Which event results in the acetylation of S6K1?
41
+ - source_sentence: Yes, exposure to household furry pets influences the gut microbiota
42
+ of infants.
43
+ sentences:
44
+ - Can pets affect infant microbiomed?
45
+ - What is the mode of action of Thiazovivin?
46
+ - What are the effects of CAMK4 inhibition?
47
+ - source_sentence: "In children with heart failure evidence of the effect of enalapril\
48
+ \ is empirical. Enalapril was clinically safe and effective in 50% to 80% of for\
49
+ \ children with cardiac failure secondary to congenital heart malformations before\
50
+ \ and after cardiac surgery, impaired ventricular function , valvar regurgitation,\
51
+ \ congestive cardiomyopathy, , arterial hypertension, life-threatening arrhythmias\
52
+ \ coexisting with circulatory insufficiency. \nACE inhibitors have shown a transient\
53
+ \ beneficial effect on heart failure due to anticancer drugs and possibly a beneficial\
54
+ \ effect in muscular dystrophy-associated cardiomyopathy, which deserves further\
55
+ \ studies."
56
+ sentences:
57
+ - Which receptors can be evaluated with the [18F]altanserin?
58
+ - In what proportion of children with heart failure has Enalapril been shown to
59
+ be safe and effective?
60
+ - Which major signaling pathways are regulated by RIP1?
61
+ - source_sentence: Cellular senescence-associated heterochromatic foci (SAHFS) are
62
+ a novel type of chromatin condensation involving alterations of linker histone
63
+ H1 and linker DNA-binding proteins. SAHFS can be formed by a variety of cell types,
64
+ but their mechanism of action remains unclear.
65
+ sentences:
66
+ - What is the relationship between the X chromosome and a neutrophil drumstick?
67
+ - Which microRNAs are involved in exercise adaptation?
68
+ - How are SAHFS created?
69
+ - source_sentence: Multicluster Pcdh diversity is required for mouse olfactory neural
70
+ circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins
71
+ are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although
72
+ deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss
73
+ of all three clusters (tricluster deletion) led to a severe axonal arborization
74
+ defect and loss of self-avoidance.
75
+ sentences:
76
+ - What are the effects of the deletion of all three Pcdh clusters (tricluster deletion)
77
+ in mice?
78
+ - what is the role of MEF-2 in cardiomyocyte differentiation?
79
+ - How many periods of regulatory innovation led to the evolution of vertebrates?
80
+ model-index:
81
+ - name: all-mpnet-base-v2-bioasq-matryoshka
82
+ results:
83
+ - task:
84
+ type: information-retrieval
85
+ name: Information Retrieval
86
+ dataset:
87
+ name: dim 768
88
+ type: dim_768
89
+ metrics:
90
+ - type: cosine_accuracy@1
91
+ value: 0.8373408769448374
92
+ name: Cosine Accuracy@1
93
+ - type: cosine_accuracy@3
94
+ value: 0.9306930693069307
95
+ name: Cosine Accuracy@3
96
+ - type: cosine_accuracy@5
97
+ value: 0.9448373408769448
98
+ name: Cosine Accuracy@5
99
+ - type: cosine_accuracy@10
100
+ value: 0.958981612446959
101
+ name: Cosine Accuracy@10
102
+ - type: cosine_precision@1
103
+ value: 0.8373408769448374
104
+ name: Cosine Precision@1
105
+ - type: cosine_precision@3
106
+ value: 0.31023102310231027
107
+ name: Cosine Precision@3
108
+ - type: cosine_precision@5
109
+ value: 0.18896746817538893
110
+ name: Cosine Precision@5
111
+ - type: cosine_precision@10
112
+ value: 0.09589816124469587
113
+ name: Cosine Precision@10
114
+ - type: cosine_recall@1
115
+ value: 0.8373408769448374
116
+ name: Cosine Recall@1
117
+ - type: cosine_recall@3
118
+ value: 0.9306930693069307
119
+ name: Cosine Recall@3
120
+ - type: cosine_recall@5
121
+ value: 0.9448373408769448
122
+ name: Cosine Recall@5
123
+ - type: cosine_recall@10
124
+ value: 0.958981612446959
125
+ name: Cosine Recall@10
126
+ - type: cosine_ndcg@10
127
+ value: 0.9038566618329213
128
+ name: Cosine Ndcg@10
129
+ - type: cosine_mrr@10
130
+ value: 0.8855380436002787
131
+ name: Cosine Mrr@10
132
+ - type: cosine_map@100
133
+ value: 0.8867903631779396
134
+ name: Cosine Map@100
135
+ - task:
136
+ type: information-retrieval
137
+ name: Information Retrieval
138
+ dataset:
139
+ name: dim 512
140
+ type: dim_512
141
+ metrics:
142
+ - type: cosine_accuracy@1
143
+ value: 0.8373408769448374
144
+ name: Cosine Accuracy@1
145
+ - type: cosine_accuracy@3
146
+ value: 0.9335219236209336
147
+ name: Cosine Accuracy@3
148
+ - type: cosine_accuracy@5
149
+ value: 0.9462517680339463
150
+ name: Cosine Accuracy@5
151
+ - type: cosine_accuracy@10
152
+ value: 0.9603960396039604
153
+ name: Cosine Accuracy@10
154
+ - type: cosine_precision@1
155
+ value: 0.8373408769448374
156
+ name: Cosine Precision@1
157
+ - type: cosine_precision@3
158
+ value: 0.31117397454031115
159
+ name: Cosine Precision@3
160
+ - type: cosine_precision@5
161
+ value: 0.18925035360678924
162
+ name: Cosine Precision@5
163
+ - type: cosine_precision@10
164
+ value: 0.09603960396039603
165
+ name: Cosine Precision@10
166
+ - type: cosine_recall@1
167
+ value: 0.8373408769448374
168
+ name: Cosine Recall@1
169
+ - type: cosine_recall@3
170
+ value: 0.9335219236209336
171
+ name: Cosine Recall@3
172
+ - type: cosine_recall@5
173
+ value: 0.9462517680339463
174
+ name: Cosine Recall@5
175
+ - type: cosine_recall@10
176
+ value: 0.9603960396039604
177
+ name: Cosine Recall@10
178
+ - type: cosine_ndcg@10
179
+ value: 0.9045496377971035
180
+ name: Cosine Ndcg@10
181
+ - type: cosine_mrr@10
182
+ value: 0.8860549830493253
183
+ name: Cosine Mrr@10
184
+ - type: cosine_map@100
185
+ value: 0.8870969130410834
186
+ name: Cosine Map@100
187
+ - task:
188
+ type: information-retrieval
189
+ name: Information Retrieval
190
+ dataset:
191
+ name: dim 256
192
+ type: dim_256
193
+ metrics:
194
+ - type: cosine_accuracy@1
195
+ value: 0.8288543140028288
196
+ name: Cosine Accuracy@1
197
+ - type: cosine_accuracy@3
198
+ value: 0.9222065063649222
199
+ name: Cosine Accuracy@3
200
+ - type: cosine_accuracy@5
201
+ value: 0.942008486562942
202
+ name: Cosine Accuracy@5
203
+ - type: cosine_accuracy@10
204
+ value: 0.9533239038189534
205
+ name: Cosine Accuracy@10
206
+ - type: cosine_precision@1
207
+ value: 0.8288543140028288
208
+ name: Cosine Precision@1
209
+ - type: cosine_precision@3
210
+ value: 0.3074021687883074
211
+ name: Cosine Precision@3
212
+ - type: cosine_precision@5
213
+ value: 0.18840169731258838
214
+ name: Cosine Precision@5
215
+ - type: cosine_precision@10
216
+ value: 0.09533239038189532
217
+ name: Cosine Precision@10
218
+ - type: cosine_recall@1
219
+ value: 0.8288543140028288
220
+ name: Cosine Recall@1
221
+ - type: cosine_recall@3
222
+ value: 0.9222065063649222
223
+ name: Cosine Recall@3
224
+ - type: cosine_recall@5
225
+ value: 0.942008486562942
226
+ name: Cosine Recall@5
227
+ - type: cosine_recall@10
228
+ value: 0.9533239038189534
229
+ name: Cosine Recall@10
230
+ - type: cosine_ndcg@10
231
+ value: 0.8963408137245359
232
+ name: Cosine Ndcg@10
233
+ - type: cosine_mrr@10
234
+ value: 0.8774370804427385
235
+ name: Cosine Mrr@10
236
+ - type: cosine_map@100
237
+ value: 0.8786914503856871
238
+ name: Cosine Map@100
239
+ - task:
240
+ type: information-retrieval
241
+ name: Information Retrieval
242
+ dataset:
243
+ name: dim 128
244
+ type: dim_128
245
+ metrics:
246
+ - type: cosine_accuracy@1
247
+ value: 0.809052333804809
248
+ name: Cosine Accuracy@1
249
+ - type: cosine_accuracy@3
250
+ value: 0.8995756718528995
251
+ name: Cosine Accuracy@3
252
+ - type: cosine_accuracy@5
253
+ value: 0.9207920792079208
254
+ name: Cosine Accuracy@5
255
+ - type: cosine_accuracy@10
256
+ value: 0.9405940594059405
257
+ name: Cosine Accuracy@10
258
+ - type: cosine_precision@1
259
+ value: 0.809052333804809
260
+ name: Cosine Precision@1
261
+ - type: cosine_precision@3
262
+ value: 0.29985855728429983
263
+ name: Cosine Precision@3
264
+ - type: cosine_precision@5
265
+ value: 0.18415841584158416
266
+ name: Cosine Precision@5
267
+ - type: cosine_precision@10
268
+ value: 0.09405940594059406
269
+ name: Cosine Precision@10
270
+ - type: cosine_recall@1
271
+ value: 0.809052333804809
272
+ name: Cosine Recall@1
273
+ - type: cosine_recall@3
274
+ value: 0.8995756718528995
275
+ name: Cosine Recall@3
276
+ - type: cosine_recall@5
277
+ value: 0.9207920792079208
278
+ name: Cosine Recall@5
279
+ - type: cosine_recall@10
280
+ value: 0.9405940594059405
281
+ name: Cosine Recall@10
282
+ - type: cosine_ndcg@10
283
+ value: 0.8794609712523561
284
+ name: Cosine Ndcg@10
285
+ - type: cosine_mrr@10
286
+ value: 0.8593930311398488
287
+ name: Cosine Mrr@10
288
+ - type: cosine_map@100
289
+ value: 0.8608652296821839
290
+ name: Cosine Map@100
291
+ - task:
292
+ type: information-retrieval
293
+ name: Information Retrieval
294
+ dataset:
295
+ name: dim 64
296
+ type: dim_64
297
+ metrics:
298
+ - type: cosine_accuracy@1
299
+ value: 0.7694483734087695
300
+ name: Cosine Accuracy@1
301
+ - type: cosine_accuracy@3
302
+ value: 0.8613861386138614
303
+ name: Cosine Accuracy@3
304
+ - type: cosine_accuracy@5
305
+ value: 0.8868458274398868
306
+ name: Cosine Accuracy@5
307
+ - type: cosine_accuracy@10
308
+ value: 0.9080622347949081
309
+ name: Cosine Accuracy@10
310
+ - type: cosine_precision@1
311
+ value: 0.7694483734087695
312
+ name: Cosine Precision@1
313
+ - type: cosine_precision@3
314
+ value: 0.2871287128712871
315
+ name: Cosine Precision@3
316
+ - type: cosine_precision@5
317
+ value: 0.17736916548797735
318
+ name: Cosine Precision@5
319
+ - type: cosine_precision@10
320
+ value: 0.09080622347949079
321
+ name: Cosine Precision@10
322
+ - type: cosine_recall@1
323
+ value: 0.7694483734087695
324
+ name: Cosine Recall@1
325
+ - type: cosine_recall@3
326
+ value: 0.8613861386138614
327
+ name: Cosine Recall@3
328
+ - type: cosine_recall@5
329
+ value: 0.8868458274398868
330
+ name: Cosine Recall@5
331
+ - type: cosine_recall@10
332
+ value: 0.9080622347949081
333
+ name: Cosine Recall@10
334
+ - type: cosine_ndcg@10
335
+ value: 0.841605620432732
336
+ name: Cosine Ndcg@10
337
+ - type: cosine_mrr@10
338
+ value: 0.8200012348173592
339
+ name: Cosine Mrr@10
340
+ - type: cosine_map@100
341
+ value: 0.8223782042287946
342
+ name: Cosine Map@100
343
+ ---
344
+
345
+ # all-mpnet-base-v2-bioasq-matryoshka
346
+
347
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
348
+
349
+ ## Model Details
350
+
351
+ ### Model Description
352
+ - **Model Type:** Sentence Transformer
353
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 84f2bcc00d77236f9e89c8a360a00fb1139bf47d -->
354
+ - **Maximum Sequence Length:** 384 tokens
355
+ - **Output Dimensionality:** 768 dimensions
356
+ - **Similarity Function:** Cosine Similarity
357
+ <!-- - **Training Dataset:** Unknown -->
358
+ - **Language:** en
359
+ - **License:** apache-2.0
360
+
361
+ ### Model Sources
362
+
363
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
364
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
365
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
366
+
367
+ ### Full Model Architecture
368
+
369
+ ```
370
+ SentenceTransformer(
371
+ (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
372
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
373
+ (2): Normalize()
374
+ )
375
+ ```
376
+
377
+ ## Usage
378
+
379
+ ### Direct Usage (Sentence Transformers)
380
+
381
+ First install the Sentence Transformers library:
382
+
383
+ ```bash
384
+ pip install -U sentence-transformers
385
+ ```
386
+
387
+ Then you can load this model and run inference.
388
+ ```python
389
+ from sentence_transformers import SentenceTransformer
390
+
391
+ # Download from the 🤗 Hub
392
+ model = SentenceTransformer("juanpablomesa/all-mpnet-base-v2-bioasq-matryoshka")
393
+ # Run inference
394
+ sentences = [
395
+ 'Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss of all three clusters (tricluster deletion) led to a severe axonal arborization defect and loss of self-avoidance.',
396
+ 'What are the effects of the deletion of all three Pcdh clusters (tricluster deletion) in mice?',
397
+ 'How many periods of regulatory innovation led to the evolution of vertebrates?',
398
+ ]
399
+ embeddings = model.encode(sentences)
400
+ print(embeddings.shape)
401
+ # (3, 768)
402
+
403
+ # Get the similarity scores for the embeddings
404
+ similarities = model.similarity(embeddings, embeddings)
405
+ print(similarities.shape)
406
+ # torch.Size([3, 3])
407
+ ```
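
Reviewer note: since the card reports training with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64, embeddings can presumably be shortened to any of those sizes by keeping the leading components and re-normalizing before computing cosine similarity. The sketch below shows that step in plain Python on toy 4-dimensional vectors. The helpers `truncate_and_normalize` and `cosine` are hypothetical names, not part of the library.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


full_a = [0.6, 0.0, 0.8, 0.0]
full_b = [0.6, 0.8, 0.0, 0.0]
short_a = truncate_and_normalize(full_a, 2)
short_b = truncate_and_normalize(full_b, 2)

print(cosine(full_a, full_b))    # similarity at full size
print(cosine(short_a, short_b))  # similarity after truncation to 2 dims
```

Recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which to my knowledge performs the truncation at encode time; check the installed version before relying on it.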
408
+
409
+ <!--
410
+ ### Direct Usage (Transformers)
411
+
412
+ <details><summary>Click to see the direct usage in Transformers</summary>
413
+
414
+ </details>
415
+ -->
416
+
417
+ <!--
418
+ ### Downstream Usage (Sentence Transformers)
419
+
420
+ You can finetune this model on your own dataset.
421
+
422
+ <details><summary>Click to expand</summary>
423
+
424
+ </details>
425
+ -->
426
+
427
+ <!--
428
+ ### Out-of-Scope Use
429
+
430
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
431
+ -->
432
+
433
+ ## Evaluation
434
+
435
+ ### Metrics
436
+
437
+ #### Information Retrieval
438
+ * Dataset: `dim_768`
439
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
440
+
441
+ | Metric | Value |
442
+ |:--------------------|:-----------|
443
+ | cosine_accuracy@1 | 0.8373 |
444
+ | cosine_accuracy@3 | 0.9307 |
445
+ | cosine_accuracy@5 | 0.9448 |
446
+ | cosine_accuracy@10 | 0.959 |
447
+ | cosine_precision@1 | 0.8373 |
448
+ | cosine_precision@3 | 0.3102 |
449
+ | cosine_precision@5 | 0.189 |
450
+ | cosine_precision@10 | 0.0959 |
451
+ | cosine_recall@1 | 0.8373 |
452
+ | cosine_recall@3 | 0.9307 |
453
+ | cosine_recall@5 | 0.9448 |
454
+ | cosine_recall@10 | 0.959 |
455
+ | cosine_ndcg@10 | 0.9039 |
456
+ | cosine_mrr@10 | 0.8855 |
457
+ | **cosine_map@100** | **0.8868** |
458
+
459
+ #### Information Retrieval
460
+ * Dataset: `dim_512`
461
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
462
+
463
+ | Metric | Value |
464
+ |:--------------------|:-----------|
465
+ | cosine_accuracy@1 | 0.8373 |
466
+ | cosine_accuracy@3 | 0.9335 |
467
+ | cosine_accuracy@5 | 0.9463 |
468
+ | cosine_accuracy@10 | 0.9604 |
469
+ | cosine_precision@1 | 0.8373 |
470
+ | cosine_precision@3 | 0.3112 |
471
+ | cosine_precision@5 | 0.1893 |
472
+ | cosine_precision@10 | 0.096 |
473
+ | cosine_recall@1 | 0.8373 |
474
+ | cosine_recall@3 | 0.9335 |
475
+ | cosine_recall@5 | 0.9463 |
476
+ | cosine_recall@10 | 0.9604 |
477
+ | cosine_ndcg@10 | 0.9045 |
478
+ | cosine_mrr@10 | 0.8861 |
479
+ | **cosine_map@100** | **0.8871** |
480
+
481
+ #### Information Retrieval
482
+ * Dataset: `dim_256`
483
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
484
+
485
+ | Metric | Value |
486
+ |:--------------------|:-----------|
487
+ | cosine_accuracy@1 | 0.8289 |
488
+ | cosine_accuracy@3 | 0.9222 |
489
+ | cosine_accuracy@5 | 0.942 |
490
+ | cosine_accuracy@10 | 0.9533 |
491
+ | cosine_precision@1 | 0.8289 |
492
+ | cosine_precision@3 | 0.3074 |
493
+ | cosine_precision@5 | 0.1884 |
494
+ | cosine_precision@10 | 0.0953 |
495
+ | cosine_recall@1 | 0.8289 |
496
+ | cosine_recall@3 | 0.9222 |
497
+ | cosine_recall@5 | 0.942 |
498
+ | cosine_recall@10 | 0.9533 |
499
+ | cosine_ndcg@10 | 0.8963 |
500
+ | cosine_mrr@10 | 0.8774 |
501
+ | **cosine_map@100** | **0.8787** |
502
+
503
+ #### Information Retrieval
504
+ * Dataset: `dim_128`
505
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
506
+
507
+ | Metric | Value |
508
+ |:--------------------|:-----------|
509
+ | cosine_accuracy@1 | 0.8091 |
510
+ | cosine_accuracy@3 | 0.8996 |
511
+ | cosine_accuracy@5 | 0.9208 |
512
+ | cosine_accuracy@10 | 0.9406 |
513
+ | cosine_precision@1 | 0.8091 |
514
+ | cosine_precision@3 | 0.2999 |
515
+ | cosine_precision@5 | 0.1842 |
516
+ | cosine_precision@10 | 0.0941 |
517
+ | cosine_recall@1 | 0.8091 |
518
+ | cosine_recall@3 | 0.8996 |
519
+ | cosine_recall@5 | 0.9208 |
520
+ | cosine_recall@10 | 0.9406 |
521
+ | cosine_ndcg@10 | 0.8795 |
522
+ | cosine_mrr@10 | 0.8594 |
523
+ | **cosine_map@100** | **0.8609** |
524
+
525
+ #### Information Retrieval
526
+ * Dataset: `dim_64`
527
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
528
+
529
+ | Metric | Value |
530
+ |:--------------------|:-----------|
531
+ | cosine_accuracy@1 | 0.7694 |
532
+ | cosine_accuracy@3 | 0.8614 |
533
+ | cosine_accuracy@5 | 0.8868 |
534
+ | cosine_accuracy@10 | 0.9081 |
535
+ | cosine_precision@1 | 0.7694 |
536
+ | cosine_precision@3 | 0.2871 |
537
+ | cosine_precision@5 | 0.1774 |
538
+ | cosine_precision@10 | 0.0908 |
539
+ | cosine_recall@1 | 0.7694 |
540
+ | cosine_recall@3 | 0.8614 |
541
+ | cosine_recall@5 | 0.8868 |
542
+ | cosine_recall@10 | 0.9081 |
543
+ | cosine_ndcg@10 | 0.8416 |
544
+ | cosine_mrr@10 | 0.82 |
545
+ | **cosine_map@100** | **0.8224** |
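
Reviewer note: in every table above, recall@k equals accuracy@k and precision@k is accuracy@k divided by k (for example 0.9307 / 3 ≈ 0.3102 for `dim_768`). That pattern is what one would expect if each evaluation query has exactly one relevant passage, which is an assumption here rather than something the card states. The sketch below computes the same family of metrics on a toy ranking. It is an illustration of the definitions, not the `InformationRetrievalEvaluator` code, and `ir_metrics` is a hypothetical name.

```python
def ir_metrics(rankings, relevant, k):
    """rankings: query -> ranked doc ids; relevant: query -> set of relevant ids."""
    n = len(rankings)
    hits = precision = recall = mrr = 0.0
    for query, ranked in rankings.items():
        top_k = ranked[:k]
        gold = relevant[query]
        found = [doc for doc in top_k if doc in gold]
        hits += 1.0 if found else 0.0          # accuracy@k: any relevant doc in top k
        precision += len(found) / k            # precision@k
        recall += len(found) / len(gold)       # recall@k
        for rank, doc in enumerate(top_k, start=1):
            if doc in gold:
                mrr += 1.0 / rank              # reciprocal rank of first hit
                break
    return {
        "accuracy": hits / n,
        "precision": precision / n,
        "recall": recall / n,
        "mrr": mrr / n,
    }


rankings = {
    "q1": ["d1", "d7", "d3"],
    "q2": ["d5", "d2", "d9"],
    "q3": ["d4", "d6", "d8"],
}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d0"}}
metrics = ir_metrics(rankings, relevant, k=3)
print(metrics)
```

With one relevant document per query, `recall` matches `accuracy` and `precision` is `accuracy / k`, mirroring the relationship visible in the tables.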
546
+
547
+ <!--
548
+ ## Bias, Risks and Limitations
549
+
550
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
551
+ -->
552
+
553
+ <!--
554
+ ### Recommendations
555
+
556
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
557
+ -->
558
+
559
+ ## Training Details
560
+
561
+ ### Training Dataset
562
+
563
+ #### Unnamed Dataset
564
+
565
+
566
+ * Size: 4,012 training samples
567
+ * Columns: <code>positive</code> and <code>anchor</code>
568
+ * Approximate statistics based on the first 1000 samples:
569
+ | | positive | anchor |
570
+ |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
571
+ | type | string | string |
572
+ | details | <ul><li>min: 3 tokens</li><li>mean: 63.14 tokens</li><li>max: 384 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.13 tokens</li><li>max: 49 tokens</li></ul> |
573
+ * Samples:
574
+ | positive | anchor |
575
+ |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------|
576
+ | <code>Aberrant patterns of H3K4, H3K9, and H3K27 histone lysine methylation were shown to result in histone code alterations, which induce changes in gene expression, and affect the proliferation rate of cells in medulloblastoma.</code> | <code>What is the implication of histone lysine methylation in medulloblastoma?</code> |
577
+ | <code>STAG1/STAG2 proteins are tumour suppressor proteins that suppress cell proliferation and are essential for differentiation.</code> | <code>What is the role of STAG1/STAG2 proteins in differentiation?</code> |
578
+ | <code>The association between cell phone use and incident glioblastoma remains unclear. Some studies have reported that cell phone use was associated with incident glioblastoma, and with reduced survival of patients diagnosed with glioblastoma. However, other studies have repeatedly replicated to find an association between cell phone use and glioblastoma.</code> | <code>What is the association between cell phone use and glioblastoma?</code> |
579
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
580
+ ```json
581
+ {
582
+ "loss": "MultipleNegativesRankingLoss",
583
+ "matryoshka_dims": [
584
+ 768,
585
+ 512,
586
+ 256,
587
+ 128,
588
+ 64
589
+ ],
590
+ "matryoshka_weights": [
591
+ 1,
592
+ 1,
593
+ 1,
594
+ 1,
595
+ 1
596
+ ],
597
+ "n_dims_per_step": -1
598
+ }
599
+ ```
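
Reviewer note: the parameters above describe a wrapper loss that applies `MultipleNegativesRankingLoss` at each of the listed dimensions and sums the results with the given weights. The sketch below is a plain-Python approximation of that idea, assuming cosine similarity scaled by 20 (which I believe is the library default, but treat it as an assumption) and in-batch negatives. `mnr_loss` and `matryoshka_loss` are hypothetical helpers, not the library's classes.

```python
import math


def normalize(vector):
    """Rescale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch candidates: positives[i] is the target for anchors[i]."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(a)


def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss on embeddings truncated to each dimension."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        total += weight * mnr_loss(
            [v[:dim] for v in anchors], [v[:dim] for v in positives]
        )
    return total


# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.4]]
loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

Because every prefix length contributes to the total, the leading components are pushed to carry most of the ranking signal, which is what makes later truncation workable.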
600
+
601
+ ### Training Hyperparameters
602
+ #### Non-Default Hyperparameters
603
+
604
+ - `eval_strategy`: epoch
605
+ - `per_device_train_batch_size`: 32
606
+ - `per_device_eval_batch_size`: 16
607
+ - `gradient_accumulation_steps`: 16
608
+ - `learning_rate`: 2e-05
609
+ - `num_train_epochs`: 4
610
+ - `lr_scheduler_type`: cosine
611
+ - `warmup_ratio`: 0.1
612
+ - `bf16`: True
613
+ - `tf32`: True
614
+ - `load_best_model_at_end`: True
615
+ - `optim`: adamw_torch_fused
616
+ - `batch_sampler`: no_duplicates
617
+
618
+ #### All Hyperparameters
619
+ <details><summary>Click to expand</summary>
620
+
621
+ - `overwrite_output_dir`: False
622
+ - `do_predict`: False
623
+ - `eval_strategy`: epoch
624
+ - `prediction_loss_only`: True
625
+ - `per_device_train_batch_size`: 32
626
+ - `per_device_eval_batch_size`: 16
627
+ - `per_gpu_train_batch_size`: None
628
+ - `per_gpu_eval_batch_size`: None
629
+ - `gradient_accumulation_steps`: 16
630
+ - `eval_accumulation_steps`: None
631
+ - `learning_rate`: 2e-05
632
+ - `weight_decay`: 0.0
633
+ - `adam_beta1`: 0.9
634
+ - `adam_beta2`: 0.999
635
+ - `adam_epsilon`: 1e-08
636
+ - `max_grad_norm`: 1.0
637
+ - `num_train_epochs`: 4
638
+ - `max_steps`: -1
639
+ - `lr_scheduler_type`: cosine
640
+ - `lr_scheduler_kwargs`: {}
641
+ - `warmup_ratio`: 0.1
642
+ - `warmup_steps`: 0
643
+ - `log_level`: passive
644
+ - `log_level_replica`: warning
645
+ - `log_on_each_node`: True
646
+ - `logging_nan_inf_filter`: True
647
+ - `save_safetensors`: True
648
+ - `save_on_each_node`: False
649
+ - `save_only_model`: False
650
+ - `restore_callback_states_from_checkpoint`: False
651
+ - `no_cuda`: False
652
+ - `use_cpu`: False
653
+ - `use_mps_device`: False
654
+ - `seed`: 42
655
+ - `data_seed`: None
656
+ - `jit_mode_eval`: False
657
+ - `use_ipex`: False
658
+ - `bf16`: True
659
+ - `fp16`: False
660
+ - `fp16_opt_level`: O1
661
+ - `half_precision_backend`: auto
662
+ - `bf16_full_eval`: False
663
+ - `fp16_full_eval`: False
664
+ - `tf32`: True
665
+ - `local_rank`: 0
666
+ - `ddp_backend`: None
667
+ - `tpu_num_cores`: None
668
+ - `tpu_metrics_debug`: False
669
+ - `debug`: []
670
+ - `dataloader_drop_last`: False
671
+ - `dataloader_num_workers`: 0
672
+ - `dataloader_prefetch_factor`: None
673
+ - `past_index`: -1
674
+ - `disable_tqdm`: False
675
+ - `remove_unused_columns`: True
676
+ - `label_names`: None
677
+ - `load_best_model_at_end`: True
678
+ - `ignore_data_skip`: False
679
+ - `fsdp`: []
680
+ - `fsdp_min_num_params`: 0
681
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
682
+ - `fsdp_transformer_layer_cls_to_wrap`: None
683
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
684
+ - `deepspeed`: None
685
+ - `label_smoothing_factor`: 0.0
686
+ - `optim`: adamw_torch_fused
687
+ - `optim_args`: None
688
+ - `adafactor`: False
689
+ - `group_by_length`: False
690
+ - `length_column_name`: length
691
+ - `ddp_find_unused_parameters`: None
692
+ - `ddp_bucket_cap_mb`: None
693
+ - `ddp_broadcast_buffers`: False
694
+ - `dataloader_pin_memory`: True
695
+ - `dataloader_persistent_workers`: False
696
+ - `skip_memory_metrics`: True
697
+ - `use_legacy_prediction_loop`: False
698
+ - `push_to_hub`: False
699
+ - `resume_from_checkpoint`: None
700
+ - `hub_model_id`: None
701
+ - `hub_strategy`: every_save
702
+ - `hub_private_repo`: False
703
+ - `hub_always_push`: False
704
+ - `gradient_checkpointing`: False
705
+ - `gradient_checkpointing_kwargs`: None
706
+ - `include_inputs_for_metrics`: False
707
+ - `eval_do_concat_batches`: True
708
+ - `fp16_backend`: auto
709
+ - `push_to_hub_model_id`: None
710
+ - `push_to_hub_organization`: None
711
+ - `mp_parameters`:
712
+ - `auto_find_batch_size`: False
713
+ - `full_determinism`: False
714
+ - `torchdynamo`: None
715
+ - `ray_scope`: last
716
+ - `ddp_timeout`: 1800
717
+ - `torch_compile`: False
718
+ - `torch_compile_backend`: None
719
+ - `torch_compile_mode`: None
720
+ - `dispatch_batches`: None
721
+ - `split_batches`: None
722
+ - `include_tokens_per_second`: False
723
+ - `include_num_input_tokens_seen`: False
724
+ - `neftune_noise_alpha`: None
725
+ - `optim_target_modules`: None
726
+ - `batch_eval_metrics`: False
727
+ - `batch_sampler`: no_duplicates
728
+ - `multi_dataset_batch_sampler`: proportional
729
+
730
+ </details>
731
+
732
+ ### Training Logs
733
+ | Epoch | Step | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
734
+ |:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
735
+ | 0.8889 | 7 | - | 0.8540 | 0.8752 | 0.8825 | 0.8050 | 0.8864 |
736
+ | 1.2698 | 10 | 1.2032 | - | - | - | - | - |
737
+ | 1.9048 | 15 | - | 0.8569 | 0.8775 | 0.8850 | 0.8169 | 0.8840 |
738
+ | 2.5397 | 20 | 0.5051 | - | - | - | - | - |
739
+ | **2.9206** | **23** | **-** | **0.861** | **0.8794** | **0.8866** | **0.8242** | **0.8858** |
740
+ | 3.5556 | 28 | - | 0.8609 | 0.8787 | 0.8871 | 0.8224 | 0.8868 |
741
+
742
+ * The bold row denotes the saved checkpoint.
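
Reviewer note: the fractional epochs in the table above can be reproduced from the reported dataset size and hyperparameters. The arithmetic below assumes a single device and that the final partial batch is kept (`dataloader_drop_last` is `False`). The `no_duplicates` sampler could in principle change the batch count, so treat the figure of 126 batches as an inference that happens to match the logged epochs.

```python
import math

train_samples = 4012
per_device_batch = 32
grad_accum = 16

# Mini-batches per epoch, assuming one device and no dropped last batch.
batches_per_epoch = math.ceil(train_samples / per_device_batch)
# Examples contributing to each optimizer update.
effective_batch = per_device_batch * grad_accum
# Optimizer steps per epoch (fractional, because 126 is not a multiple of 16).
steps_per_epoch = batches_per_epoch / grad_accum


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer updates."""
    return round(step / steps_per_epoch, 4)


print(batches_per_epoch, effective_batch, steps_per_epoch)
for step in (7, 10, 15, 20, 23, 28):
    print(step, epoch_at(step))
```

Each optimizer update therefore covers 512 examples, and one epoch corresponds to just under 8 updates, which is why evaluation rows appear at steps 7, 15, 23 and 28.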
743
+
744
+ ### Framework Versions
745
+ - Python: 3.11.5
746
+ - Sentence Transformers: 3.0.1
747
+ - Transformers: 4.41.2
748
+ - PyTorch: 2.1.2+cu121
749
+ - Accelerate: 0.31.0
750
+ - Datasets: 2.19.1
751
+ - Tokenizers: 0.19.1
752
+
753
+ ## Citation
754
+
755
+ ### BibTeX
756
+
757
+ #### Sentence Transformers
758
+ ```bibtex
759
+ @inproceedings{reimers-2019-sentence-bert,
760
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
761
+ author = "Reimers, Nils and Gurevych, Iryna",
762
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
763
+ month = "11",
764
+ year = "2019",
765
+ publisher = "Association for Computational Linguistics",
766
+ url = "https://arxiv.org/abs/1908.10084",
767
+ }
768
+ ```
769
+
770
+ #### MatryoshkaLoss
771
+ ```bibtex
772
+ @misc{kusupati2024matryoshka,
773
+ title={Matryoshka Representation Learning},
774
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
775
+ year={2024},
776
+ eprint={2205.13147},
777
+ archivePrefix={arXiv},
778
+ primaryClass={cs.LG}
779
+ }
780
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0be4e2fccfccd6c364ee4800575351a8094282f2a4b817870acd029cf59d279e
+ size 437967672
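The three lines above are a Git LFS pointer, not the weights themselves. As a rough sanity check, the sketch below parses the pointer and estimates the parameter count from the file size. It assumes every tensor is stored as float32 (matching `torch_dtype` in `config.json`) and ignores the safetensors header, so the result is a slight overestimate. The helper name `parse_lfs_pointer` is hypothetical.

```python
# Pointer text copied from the model.safetensors entry above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0be4e2fccfccd6c364ee4800575351a8094282f2a4b817870acd029cf59d279e
size 437967672
"""

BYTES_PER_FLOAT32 = 4


def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict entry."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
size_bytes = int(pointer["size"])

# Upper bound: the file also contains a small JSON header.
approx_params = size_bytes // BYTES_PER_FLOAT32

print(f"file size: {size_bytes / 1e6:.1f} MB")
print(f"approximate parameters: {approx_params / 1e6:.1f} M")
```

The estimate lands near 109 million parameters, which is in the range expected for a 12-layer, 768-dimensional encoder like the one described in `config.json`.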
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
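`modules.json` chains three stages: a transformer that produces one vector per token, a pooling step, and a normalization step. With `pooling_mode_mean_tokens` set to `true` in `1_Pooling/config.json`, the pooling step averages the token vectors. The following is a simplified pure-Python sketch of the last two stages for a single sentence. The real modules operate on batched tensors, and the function names and toy values here are hypothetical.

```python
import math


def mean_pool(token_embeddings: list[list[float]], attention_mask: list[int]) -> list[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                sums[i] += value
    count = max(count, 1)  # avoid division by zero for a fully masked input
    return [total / count for total in sums]


def l2_normalize(vector: list[float]) -> list[float]:
    """Rescale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Two real tokens followed by one padding position (mask 0).
token_embeddings = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

pooled = mean_pool(token_embeddings, attention_mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

The padding vector is ignored because its mask entry is 0, so the pooled result depends only on the two real tokens. Because the output has unit length, a dot product between two such embeddings equals their cosine similarity.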
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 384,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 384,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
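The tokenizer settings above determine how a single sentence is framed before it reaches the encoder: `<s>` (id 0) in front, `</s>` (id 2) at the end, truncation on the right to `model_max_length` (384), and `<pad>` (id 1) appended on the right. The sketch below mimics that framing on already-tokenized ids. It is not the `MPNetTokenizer` implementation, the function name `build_inputs` is hypothetical, and the word-piece ids in the example are arbitrary placeholders.

```python
from typing import Optional

# Special-token ids taken from added_tokens_decoder above.
BOS_ID = 0  # <s>
PAD_ID = 1  # <pad>
EOS_ID = 2  # </s>
MODEL_MAX_LENGTH = 384  # model_max_length


def build_inputs(
    token_ids: list[int],
    pad_to: Optional[int] = None,
    max_length: int = MODEL_MAX_LENGTH,
) -> tuple[list[int], list[int]]:
    """Frame ids with <s> ... </s>, truncate on the right, pad on the right."""
    # Reserve two positions for the special tokens, then cut from the right.
    body = token_ids[: max_length - 2]
    input_ids = [BOS_ID] + body + [EOS_ID]
    attention_mask = [1] * len(input_ids)
    if pad_to is not None and len(input_ids) < pad_to:
        padding = pad_to - len(input_ids)
        input_ids += [PAD_ID] * padding
        attention_mask += [0] * padding
    return input_ids, attention_mask


# Placeholder ids for a two-token sentence, padded to length 6.
short_ids, short_mask = build_inputs([2023, 2003], pad_to=6)
print(short_ids)
print(short_mask)

# A 500-token input is cut down to the 384-position limit.
long_ids, long_mask = build_inputs(list(range(1000, 1500)))
print(len(long_ids))
```

The attention mask marks padding positions with 0 so that later stages, such as mean pooling, can ignore them.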
vocab.txt ADDED
The diff for this file is too large to render. See raw diff