Commit 183f975 by juanpablomesa
Parent: ec348aa

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
```json
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": true,
    "pooling_mode_mean_tokens": false,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
```
README.md ADDED
@@ -0,0 +1,810 @@
---
base_model: BAAI/bge-base-en-v1.5
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4012
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: 'Extensive messenger RNA editing generates transcript and protein
    diversity in genes involved in neural excitability, as previously described, as
    well as in genes participating in a broad range of other cellular functions.'
  sentences:
  - Do cephalopods use RNA editing less frequently than other species?
  - GV1001 vaccine targets which enzyme?
  - Which event results in the acetylation of S6K1?
- source_sentence: Yes, exposure to household furry pets influences the gut microbiota
    of infants.
  sentences:
  - Can pets affect infant microbiomed?
  - What is the mode of action of Thiazovivin?
  - What are the effects of CAMK4 inhibition?
- source_sentence: 'In children with heart failure, evidence of the effect of enalapril
    is empirical. Enalapril was clinically safe and effective in 50% to 80% of children
    with cardiac failure secondary to congenital heart malformations before and after
    cardiac surgery, impaired ventricular function, valvar regurgitation, congestive
    cardiomyopathy, arterial hypertension, and life-threatening arrhythmias coexisting
    with circulatory insufficiency. ACE inhibitors have shown a transient beneficial
    effect on heart failure due to anticancer drugs and possibly a beneficial effect
    in muscular dystrophy-associated cardiomyopathy, which deserves further studies.'
  sentences:
  - Which receptors can be evaluated with the [18F]altanserin?
  - In what proportion of children with heart failure has Enalapril been shown to
    be safe and effective?
  - Which major signaling pathways are regulated by RIP1?
- source_sentence: Cellular senescence-associated heterochromatic foci (SAHFS) are
    a novel type of chromatin condensation involving alterations of linker histone
    H1 and linker DNA-binding proteins. SAHFS can be formed by a variety of cell types,
    but their mechanism of action remains unclear.
  sentences:
  - What is the relationship between the X chromosome and a neutrophil drumstick?
  - Which microRNAs are involved in exercise adaptation?
  - How are SAHFS created?
- source_sentence: Multicluster Pcdh diversity is required for mouse olfactory neural
    circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins
    are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although
    deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss
    of all three clusters (tricluster deletion) led to a severe axonal arborization
    defect and loss of self-avoidance.
  sentences:
  - What are the effects of the deletion of all three Pcdh clusters (tricluster deletion)
    in mice?
  - what is the role of MEF-2 in cardiomyocyte differentiation?
  - How many periods of regulatory innovation led to the evolution of vertebrates?
model-index:
- name: BGE base BioASQ Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.8528995756718529
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9264497878359265
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9462517680339463
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.958981612446959
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8528995756718529
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3088165959453088
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.18925035360678924
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09589816124469587
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8528995756718529
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9264497878359265
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9462517680339463
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.958981612446959
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9106149406529569
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8946105835073304
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8959864574088351
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.8472418670438473
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9321074964639321
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9476661951909476
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9603960396039604
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8472418670438473
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3107024988213107
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1895332390381895
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09603960396039603
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8472418670438473
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9321074964639321
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9476661951909476
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9603960396039604
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9095270940461391
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8926230888394963
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8939142126576148
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.8359264497878359
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.925035360678925
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9405940594059405
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9533239038189534
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8359264497878359
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30834512022630833
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1881188118811881
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09533239038189532
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8359264497878359
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.925035360678925
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9405940594059405
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9533239038189534
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.9003866854175698
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8828006780269864
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8839707936250328
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.8175388967468176
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.9108910891089109
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9264497878359265
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9434229137199435
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.8175388967468176
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.30363036303630364
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.18528995756718525
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09434229137199433
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.8175388967468176
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.9108910891089109
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9264497878359265
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9434229137199435
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8862907631297875
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.8674047506791496
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8686719824449951
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.7779349363507779
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8868458274398868
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.9066478076379066
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.9207920792079208
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.7779349363507779
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2956152758132956
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1813295615275813
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.09207920792079208
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.7779349363507779
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.8868458274398868
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.9066478076379066
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.9207920792079208
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8570476590886804
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.835792303720168
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.8374166888522218
      name: Cosine Map@100
---

# BGE base BioASQ Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
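
For illustration, this stack (BERT encoder, `[CLS]`-token pooling, L2 normalization) can be reproduced with plain `transformers`. This is a minimal sketch, not part of the original card; the example sentence is arbitrary:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("juanpablomesa/bge-base-bioasq-matryoshka")
encoder = AutoModel.from_pretrained("juanpablomesa/bge-base-bioasq-matryoshka")

batch = tokenizer(["How are SAHFS created?"], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state   # (batch, seq_len, 768)

cls_embedding = token_embeddings[:, 0]                  # pooling_mode_cls_token: keep the [CLS] vector
sentence_embedding = F.normalize(cls_embedding, dim=1)  # the trailing Normalize() module
```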

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("juanpablomesa/bge-base-bioasq-matryoshka")
# Run inference
sentences = [
    'Multicluster Pcdh diversity is required for mouse olfactory neural circuit assembly. The vertebrate clustered protocadherin (Pcdh) cell surface proteins are encoded by three closely linked gene clusters (Pcdhα, Pcdhβ, and Pcdhγ). Although deletion of individual Pcdh clusters had subtle phenotypic consequences, the loss of all three clusters (tricluster deletion) led to a severe axonal arborization defect and loss of self-avoidance.',
    'What are the effects of the deletion of all three Pcdh clusters (tricluster deletion) in mice?',
    'How many periods of regulatory innovation led to the evolution of vertebrates?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
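
Because the model was trained with `MatryoshkaLoss` over the dimensions 768/512/256/128/64, the leading components of each embedding remain useful on their own. A minimal sketch of getting smaller, cheaper vectors, assuming sentence-transformers >= 2.7 where the `truncate_dim` argument is available:

```python
from sentence_transformers import SentenceTransformer

# Load the same checkpoint, but truncate embeddings to their first 256 dimensions;
# 768, 512, 256, 128, and 64 are the dimensions this model was trained to support.
model_256 = SentenceTransformer("juanpablomesa/bge-base-bioasq-matryoshka", truncate_dim=256)

embeddings = model_256.encode([
    "How are SAHFS created?",
    "What are the effects of CAMK4 inhibition?",
])
print(embeddings.shape)
# (2, 256)
```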

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.8529    |
| cosine_accuracy@3   | 0.9264    |
| cosine_accuracy@5   | 0.9463    |
| cosine_accuracy@10  | 0.959     |
| cosine_precision@1  | 0.8529    |
| cosine_precision@3  | 0.3088    |
| cosine_precision@5  | 0.1893    |
| cosine_precision@10 | 0.0959    |
| cosine_recall@1     | 0.8529    |
| cosine_recall@3     | 0.9264    |
| cosine_recall@5     | 0.9463    |
| cosine_recall@10    | 0.959     |
| cosine_ndcg@10      | 0.9106    |
| cosine_mrr@10       | 0.8946    |
| **cosine_map@100**  | **0.896** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8472     |
| cosine_accuracy@3   | 0.9321     |
| cosine_accuracy@5   | 0.9477     |
| cosine_accuracy@10  | 0.9604     |
| cosine_precision@1  | 0.8472     |
| cosine_precision@3  | 0.3107     |
| cosine_precision@5  | 0.1895     |
| cosine_precision@10 | 0.096      |
| cosine_recall@1     | 0.8472     |
| cosine_recall@3     | 0.9321     |
| cosine_recall@5     | 0.9477     |
| cosine_recall@10    | 0.9604     |
| cosine_ndcg@10      | 0.9095     |
| cosine_mrr@10       | 0.8926     |
| **cosine_map@100**  | **0.8939** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value     |
|:--------------------|:----------|
| cosine_accuracy@1   | 0.8359    |
| cosine_accuracy@3   | 0.925     |
| cosine_accuracy@5   | 0.9406    |
| cosine_accuracy@10  | 0.9533    |
| cosine_precision@1  | 0.8359    |
| cosine_precision@3  | 0.3083    |
| cosine_precision@5  | 0.1881    |
| cosine_precision@10 | 0.0953    |
| cosine_recall@1     | 0.8359    |
| cosine_recall@3     | 0.925     |
| cosine_recall@5     | 0.9406    |
| cosine_recall@10    | 0.9533    |
| cosine_ndcg@10      | 0.9004    |
| cosine_mrr@10       | 0.8828    |
| **cosine_map@100**  | **0.884** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.8175     |
| cosine_accuracy@3   | 0.9109     |
| cosine_accuracy@5   | 0.9264     |
| cosine_accuracy@10  | 0.9434     |
| cosine_precision@1  | 0.8175     |
| cosine_precision@3  | 0.3036     |
| cosine_precision@5  | 0.1853     |
| cosine_precision@10 | 0.0943     |
| cosine_recall@1     | 0.8175     |
| cosine_recall@3     | 0.9109     |
| cosine_recall@5     | 0.9264     |
| cosine_recall@10    | 0.9434     |
| cosine_ndcg@10      | 0.8863     |
| cosine_mrr@10       | 0.8674     |
| **cosine_map@100**  | **0.8687** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.7779     |
| cosine_accuracy@3   | 0.8868     |
| cosine_accuracy@5   | 0.9066     |
| cosine_accuracy@10  | 0.9208     |
| cosine_precision@1  | 0.7779     |
| cosine_precision@3  | 0.2956     |
| cosine_precision@5  | 0.1813     |
| cosine_precision@10 | 0.0921     |
| cosine_recall@1     | 0.7779     |
| cosine_recall@3     | 0.8868     |
| cosine_recall@5     | 0.9066     |
| cosine_recall@10    | 0.9208     |
| cosine_ndcg@10      | 0.857      |
| cosine_mrr@10       | 0.8358     |
| **cosine_map@100**  | **0.8374** |
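
For reference, this is roughly how an `InformationRetrievalEvaluator` run like the ones above is wired up. The queries, corpus, and relevance judgments below are a hypothetical toy example standing in for the actual held-out evaluation split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Hypothetical toy data; the real evaluation used a held-out question/passage split.
queries = {"q1": "How are SAHFS created?"}
corpus = {
    "d1": "Cellular senescence-associated heterochromatic foci (SAHFS) are a novel "
          "type of chromatin condensation involving alterations of linker histone H1.",
    "d2": "GV1001 is a peptide vaccine derived from human telomerase reverse transcriptase.",
}
relevant_docs = {"q1": {"d1"}}  # maps each query id to the set of relevant corpus ids

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="dim_768")
model = SentenceTransformer("juanpablomesa/bge-base-bioasq-matryoshka")
results = evaluator(model)  # accuracy/precision/recall/NDCG/MRR/MAP at several cutoffs
print(results)
```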

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,012 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 1000 samples:
  |         | positive | anchor |
  |:--------|:---------|:-------|
  | type    | string   | string |
  | details | <ul><li>min: 3 tokens</li><li>mean: 63.38 tokens</li><li>max: 485 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.13 tokens</li><li>max: 49 tokens</li></ul> |
* Samples:
  | positive | anchor |
  |:---------|:-------|
  | <code>Aberrant patterns of H3K4, H3K9, and H3K27 histone lysine methylation were shown to result in histone code alterations, which induce changes in gene expression, and affect the proliferation rate of cells in medulloblastoma.</code> | <code>What is the implication of histone lysine methylation in medulloblastoma?</code> |
  | <code>STAG1/STAG2 proteins are tumour suppressor proteins that suppress cell proliferation and are essential for differentiation.</code> | <code>What is the role of STAG1/STAG2 proteins in differentiation?</code> |
  | <code>The association between cell phone use and incident glioblastoma remains unclear. Some studies have reported that cell phone use was associated with incident glioblastoma, and with reduced survival of patients diagnosed with glioblastoma. However, other studies have repeatedly failed to find an association between cell phone use and glioblastoma.</code> | <code>What is the association between cell phone use and glioblastoma?</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
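
In sentence-transformers code, these parameters correspond to wrapping `MultipleNegativesRankingLoss` in `MatryoshkaLoss`; a minimal sketch:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# In-batch-negatives ranking loss, computed at each Matryoshka dimension;
# matryoshka_weights defaults to 1 per dimension, matching the parameters above.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(model, inner_loss, matryoshka_dims=[768, 512, 256, 128, 64])
```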

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `bf16`: True
- `tf32`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates
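
These non-default values map directly onto `SentenceTransformerTrainingArguments`. A minimal sketch follows; `output_dir` is a placeholder, and `save_strategy="epoch"` is an assumption added because `load_best_model_at_end` requires the save and eval strategies to match:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-bioasq-matryoshka",    # placeholder: not recorded in this card
    eval_strategy="epoch",
    save_strategy="epoch",                      # assumption: needed for load_best_model_at_end
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    num_train_epochs=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,
    tf32=True,
    load_best_model_at_end=True,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoids duplicate texts within a batch
)
```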

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: True
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 0.8889     | 7      | -             | 0.8674                 | 0.8951                 | 0.8991                 | 0.8236                | 0.8996                 |
| 1.2698     | 10     | 1.6285        | -                      | -                      | -                      | -                     | -                      |
| 1.9048     | 15     | -             | 0.8662                 | 0.8849                 | 0.8951                 | 0.8334                | 0.8945                 |
| 2.5397     | 20     | 0.7273        | -                      | -                      | -                      | -                     | -                      |
| 2.9206     | 23     | -             | 0.8681                 | 0.8849                 | 0.8946                 | 0.8362                | 0.8967                 |
| **3.5556** | **28** | **-**         | **0.8687**             | **0.884**              | **0.8939**             | **0.8374**            | **0.896**              |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.11.5
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
```json
{
    "_name_or_path": "BAAI/bge-base-en-v1.5",
    "architectures": [
        "BertModel"
    ],
    "attention_probs_dropout_prob": 0.1,
    "classifier_dropout": null,
    "gradient_checkpointing": false,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "id2label": {
        "0": "LABEL_0"
    },
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "label2id": {
        "LABEL_0": 0
    },
    "layer_norm_eps": 1e-12,
    "max_position_embeddings": 512,
    "model_type": "bert",
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "pad_token_id": 0,
    "position_embedding_type": "absolute",
    "torch_dtype": "float32",
    "transformers_version": "4.41.2",
    "type_vocab_size": 2,
    "use_cache": true,
    "vocab_size": 30522
}
```
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
```json
{
    "__version__": {
        "sentence_transformers": "3.0.1",
        "transformers": "4.41.2",
        "pytorch": "2.1.2+cu121"
    },
    "prompts": {},
    "default_prompt_name": null,
    "similarity_fn_name": null
}
```
model.safetensors ADDED
@@ -0,0 +1,3 @@
```
version https://git-lfs.github.com/spec/v1
oid sha256:8db46585c6952465fde4e094be39d2c53ceb761138455a17e09105336b9085fb
size 437951328
```
modules.json ADDED
@@ -0,0 +1,20 @@
```json
[
    {
        "idx": 0,
        "name": "0",
        "path": "",
        "type": "sentence_transformers.models.Transformer"
    },
    {
        "idx": 1,
        "name": "1",
        "path": "1_Pooling",
        "type": "sentence_transformers.models.Pooling"
    },
    {
        "idx": 2,
        "name": "2",
        "path": "2_Normalize",
        "type": "sentence_transformers.models.Normalize"
    }
]
```
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
```json
{
    "max_seq_length": 512,
    "do_lower_case": true
}
```
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
```json
{
    "cls_token": {
        "content": "[CLS]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "mask_token": {
        "content": "[MASK]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "pad_token": {
        "content": "[PAD]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "sep_token": {
        "content": "[SEP]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "unk_token": {
        "content": "[UNK]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    }
}
```
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
```json
{
    "added_tokens_decoder": {
        "0": {
            "content": "[PAD]",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "100": {
            "content": "[UNK]",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "101": {
            "content": "[CLS]",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "102": {
            "content": "[SEP]",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "103": {
            "content": "[MASK]",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        }
    },
    "clean_up_tokenization_spaces": true,
    "cls_token": "[CLS]",
    "do_basic_tokenize": true,
    "do_lower_case": true,
    "mask_token": "[MASK]",
    "model_max_length": 512,
    "never_split": null,
    "pad_token": "[PAD]",
    "sep_token": "[SEP]",
    "strip_accents": null,
    "tokenize_chinese_chars": true,
    "tokenizer_class": "BertTokenizer",
    "unk_token": "[UNK]"
}
```
vocab.txt ADDED
The diff for this file is too large to render. See raw diff