---
base_model: dbourget/pb-ds1-48K
datasets: []
language: []
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:106810
- loss:CosineSimilarityLoss
widget:
- source_sentence: In  The Law of Civilization and Decay,  Brooks provides a detailed
    look at the rise and fall of civilizations, offering a critical perspective on
    the impact of capitalism. As societies become prosperous, their pursuit of wealth
    ultimately leads to their own downfall as greed takes over.
  sentences:
  - Patrick Todd's The Open Future argues that all future contingent statements, such
    as 'It will rain tomorrow', are inherently false.
  - If propositions are made true in virtue of corresponding to facts, then what are
    the truth-makers of true negative propositions such as ‘The apple is not red’?
    Russell argued that there must be negative facts to account for what makes true
    negative propositions true and false positive propositions false. Others, more
    parsimonious in their ontological commitments, have attempted to avoid them. Wittgenstein
    rejected them since he was loath to think that the sign for negation referred
    to a negative element in a fact. A contemporary of Russell’s, Raphael Demos, attempted
    to eliminate them by appealing to ‘incompatibility’ facts. More recently, Armstrong
    has appealed to the totality of positive facts as the ground of the truth of true
    negative propositions. Oaklander and Miracchi have suggested that the absence
    or non-existence of the positive fact (which is not itself a further fact) is
    the basis of a positive proposition being false and therefore of the truth of
    its negation.
  - The Law of Civilization and Decay is an overview of history, articulating Brooks'
    critical view of capitalism. A civilization grows wealthy, and then its wealth
    causes it to crumble upon itself due to greed.
- source_sentence: It is generally accepted that the development of the modern sciences
    is rooted in experiment. Yet for a long time, experimentation did not occupy a
    prominent role, neither in philosophy nor in history of science. With the ‘practical
    turn’ in studying the sciences and their history, this has begun to change. This
    paper is concerned with systems and cultures of experimentation and the consistencies
    that are generated within such systems and cultures. The first part of the paper
    exposes the forms of historical and structural coherence that characterize the
    experimental exploration of epistemic objects. In the second part, a particular
    experimental culture in the life sciences is briefly described as an example.
    A survey will be given of what it means and what it takes to analyze biological
    functions in the test tube
  sentences:
  - Experimentation has long been overlooked in the study of science, but with a new
    focus on practical aspects, this is starting to change. This paper explores the
    systems and cultures of experimentation and the patterns that emerge within them.
    The first part discusses the historical and structural coherence of experimental
    exploration. The second part provides a brief overview of an experimental culture
    in the life sciences. The paper concludes with a discussion on analyzing biological
    functions in the test tube.
  - Hintikka and Mutanen have introduced Trial-and-Error machines as a new way to
    think about computation, expanding on the traditional Turing machine model. This
    innovation opens up new possibilities in the field of computation theory.
  - As Allaire and Firsirotu (1984) pointed out over a decade ago, the concept of
    culture seemed to be sliding inexorably into a superficial explanatory pool that
    promised everything and nothing. However, since then, some sophisticated and interesting
    theoretical developments have prevented drowning in the pool of superficiality
    and hence theoretical redundancy. The purpose of this article is to build upon
    such theoretical developments and to introduce an approach that maintains that
    culture can be theorized in the same way as structure, possessing irreducible
    powers and properties that predispose organizational actors towards specific courses
    of action. The morphogenetic approach is the methodological complement of transcendental
    realism, providing explanatory leverage on the conditions that maintain for cultural
    change or stability.
- source_sentence: 'This chapter examines three approaches to applied political and
    legal philosophy: Standard activism is primarily addressed to other philosophers,
    adopts an indirect and coincidental role in creating change, and counts articulating
    sound arguments as success. Extreme activism, in contrast, is a form of applied
    philosophy directly addressed to policy-makers, with the goal of bringing about
    a particular outcome, and measures success in terms of whether it makes a direct
    causal contribution to that goal. Finally, conceptual activism (like standard
    activism), primarily targets an audience of fellow philosophers, bears a distant,
    non-direct, relation to a desired outcome, and counts success in terms of whether
    it encourages a particular understanding and adoption of the concepts under examination.'
  sentences:
  - John Rawls’ resistance to any kind of global egalitarian principle has seemed
    strange and unconvincing to many commentators, including those generally supportive
    of Rawls’ project. His rejection of a global egalitarian principle seems to rely
    on an assumption that states are economically bounded and separate from one another,
    which is not an accurate portrayal of economic relations among states in our globalised
    world. In this article, I examine the implications of the domestic theory of justice
    as fairness to argue that Rawls has good reason to insist on economically bounded
    states. I argue that certain central features of the contemporary global economy,
    particularly the free movement of capital across borders, undermine the distributional
    autonomy required for states to realise Rawls’ principles of justice, and the
    domestic theory thus requires a certain degree of economic separation among states
    prior to the convening of the international original position. Given this, I defend
    Rawls’ reluctance to endorse a global egalitarian principle and defend a policy
    regime of international capital controls, to restore distributional autonomy and
    make the realisation of the principles of justice as fairness possible.
  - 'Bibliography of the writings by Hilary Putnam: 16 books, 198 articles, 10 translations
    into German (up to 1994).'
  - The jurisprudence under international human rights treaties has had a considerable
    impact across countries. Known for addressing complex agendas, the work of expert
    bodies under the treaties has been credited and relied upon for filling the gaps
    in the realization of several objectives, including the peace and security agenda.  In
    1982, the Human Rights Committee (ICCPR), in a General Comment observed that “states
    have the supreme duty to prevent wars, acts of genocide and other acts of mass
    violence ... Every effort  to avert the danger of war, especially thermonuclear
    war, and to strengthen international peace and security would constitute the most
    important condition and guarantee for the safeguarding of the right to life.”
    Over the years, all treaty bodies have contributed in this direction, endorsing
    peace and security so as “to protect people against direct and structural violence
     as systemic problems and not merely as isolated incidents …”. A closer look
    at the jurisprudence on peace and security, emanating from treaty monitoring mechanisms
    including state periodic reports, interpretive statements, the individual communications
    procedure, and others, reveals its distinctive nature
- source_sentence: Autonomist accounts of cognitive science suggest that cognitive
    model building and theory construction (can or should) proceed independently of
    findings in neuroscience. Common functionalist justifications of autonomy rely
    on there being relatively few constraints between neural structure and cognitive
    function (e.g., Weiskopf, 2011). In contrast, an integrative mechanistic perspective
    stresses the mutual constraining of structure and function (e.g., Piccinini &
    Craver, 2011; Povich, 2015). In this paper, I show how model-based cognitive neuroscience
    (MBCN) epitomizes the integrative mechanistic perspective and concentrates the
    most revolutionary elements of the cognitive neuroscience revolution (Boone &
    Piccinini, 2016). I also show how the prominent subset account of functional realization
    supports the integrative mechanistic perspective I take on MBCN and use it to
    clarify the intralevel and interlevel components of integration.
  sentences:
  - Fictional truth, or truth in fiction/pretense, has been the object of extended
    scrutiny among philosophers and logicians in recent decades. Comparatively little
    attention, however, has been paid to its inferential relationships with time and
    with certain deliberate and contingent human activities, namely, the creation
    of fictional works. The aim of the paper is to contribute to filling the gap.
    Toward this goal, a formal framework is outlined that is consistent with a variety
    of conceptions of fictional truth and based upon a specific formal treatment of
    time and agency, that of so-called stit logics. Moreover, a complete axiomatic
    theory of fiction-making TFM is defined, where fiction-making is understood as
    the exercise of agency and choice in time over what is fictionally true. The language
    \ of TFM is an extension of the language of propositional logic, with the addition
    of temporal and modal operators. A distinctive feature of \ with respect to other
    modal languages is a variety of operators having to do with fictional truth, including
    a ‘fictionality’ operator \ . Some applications of TFM are outlined, and some
    interesting linguistic and inferential phenomena, which are not so easily dealt
    with in other frameworks, are accounted for
  - 'We have structured our response according to five questions arising from the
    commentaries: (i) What is sentience? (ii) Is sentience a necessary or sufficient
    condition for moral standing? (iii) What methods should guide comparative cognitive
    research in general, and specifically in studying invertebrates? (iv) How should
    we balance scientific uncertainty and moral risk? (v) What practical strategies
    can help reduce biases and morally dismissive attitudes toward invertebrates?'
  - 'In 2007, ten world-renowned neuroscientists proposed “A Decade of the Mind Initiative.”
    The contention was that, despite the successes of the Decade of the Brain, “a
    fundamental understanding of how the brain gives rise to the mind [was] still
    lacking” (2007, 1321). The primary aims of the decade of the mind were “to build
    on the progress of the recent Decade of the Brain (1990-99)” by focusing on “four
    broad but intertwined areas” of research, including: healing and protecting, understanding,
    enriching, and modeling the mind. These four aims were to be the result of “transdisciplinary
    and multiagency” research spanning “across disparate fields, such as cognitive
    science, medicine, neuroscience, psychology, mathematics, engineering, and computer
    science.” The proposal for a decade of the mind prompted many questions (See Spitzer
    2008). In this chapter, I address three of them: (1) How do proponents of this
    new decade conceive of the mind? (2) Why should a decade be devoted to understanding
    it? (3) What should this decade look like?'
- source_sentence: This essay explores the historical and modern perspectives on the
    Gettier problem, highlighting the connections between this issue, skepticism,
    and relevance. Through methods such as historical analysis, induction, and deduction,
    it is found that while contextual theories and varying definitions of knowledge
    do not fully address skeptical challenges, they can help clarify our understanding
    of knowledge. Ultimately, embracing subjectivity and intuition can provide insight
    into what it truly means to claim knowledge.
  sentences:
  - In this article I present and analyze three popular moral justifications for hunting.
    My purpose is to expose the moral terrain of this issue and facilitate more fruitful,
    philosophically relevant discussions about the ethics of hunting.
  - Teaching competency in bioethics has been a concern since the field's inception.
    The first report on the teaching of contemporary bioethics was published in 1976
    by The Hastings Center, which concluded that graduate programs were not necessary
    at the time. However, the report speculated that future developments may require
    new academic structures for graduate education in bioethics. The creation of a
    terminal degree in bioethics has its critics, with scholars debating whether bioethics
    is a discipline with its own methods and theoretical grounding, a multidisciplinary
    field, or something else entirely. Despite these debates, new bioethics training
    programs have emerged at all postsecondary levels in the U.S. This essay examines
    the number and types of programs and degrees in this growing field.
  - 'Objective: In this essay,  I will try to track some historical and modern stages
    of the discussion on the Gettier problem, and point out the interrelations of
    the questions that this problem raises for epistemologists, with sceptical arguments,
    and a so-called problem of relevance. Methods: historical analysis, induction,
    generalization, deduction, discourse, intuition results: Albeit the contextual
    theories of knowledge, the use of different definitions of knowledge, and the
    different ways of the uses of knowledge do not resolve all the issues that the
    sceptic can put forward, but they can be productive in giving clarity to a concept
    of knowledge for us. On the other hand, our knowledge will always have an element
    of intuition and subjectivity, however not equating to epistemic luck and probability.  Significance
    novelty: the approach to the context in general, not giving up being a Subject
    may give us a clarity about the sense of what it means to say – “I know”.'
model-index:
- name: SentenceTransformer based on dbourget/pb-ds1-48K
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts dev
      type: sts-dev
    metrics:
    - type: pearson_cosine
      value: 0.9378177365442741
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8943299298202461
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.9709949018414847
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.8969442622028955
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.9711044669329696
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.8966133108746955
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.9419649751470724
      name: Pearson Dot
    - type: spearman_dot
      value: 0.8551487313582053
      name: Spearman Dot
    - type: pearson_max
      value: 0.9711044669329696
      name: Pearson Max
    - type: spearman_max
      value: 0.8969442622028955
      name: Spearman Max
---

# SentenceTransformer based on dbourget/pb-ds1-48K

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [dbourget/pb-ds1-48K](https://huggingface.co/dbourget/pb-ds1-48K). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [dbourget/pb-ds1-48K](https://huggingface.co/dbourget/pb-ds1-48K) <!-- at revision fcd4aeedcdc3ad836820d47fd28ffd2529914647 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
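
The `Pooling` module above uses mean pooling (`pooling_mode_mean_tokens: True`): the sentence embedding is the average of the token embeddings produced by the `BertModel`, with padding positions masked out. As a rough illustration, here is a minimal standalone sketch of what mean pooling computes; this is a reimplementation for clarity, not the library's internal code:

```python
import torch

def mean_pooling(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings, ignoring padding positions.

    token_embeddings: (batch, seq_len, 768) output of the transformer
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)                   # (batch, 768)
    counts = mask.sum(dim=1).clamp(min=1e-9)                        # avoid divide-by-zero
    return summed / counts                                          # (batch, 768)
```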

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("dbourget/pb-ds1-48K-philsim")
# Run inference
sentences = [
    'This essay explores the historical and modern perspectives on the Gettier problem, highlighting the connections between this issue, skepticism, and relevance. Through methods such as historical analysis, induction, and deduction, it is found that while contextual theories and varying definitions of knowledge do not fully address skeptical challenges, they can help clarify our understanding of knowledge. Ultimately, embracing subjectivity and intuition can provide insight into what it truly means to claim knowledge.',
    'Objective: In this essay,  I will try to track some historical and modern stages of the discussion on the Gettier problem, and point out the interrelations of the questions that this problem raises for epistemologists, with sceptical arguments, and a so-called problem of relevance. Methods: historical analysis, induction, generalization, deduction, discourse, intuition results: Albeit the contextual theories of knowledge, the use of different definitions of knowledge, and the different ways of the uses of knowledge do not resolve all the issues that the sceptic can put forward, but they can be productive in giving clarity to a concept of knowledge for us. On the other hand, our knowledge will always have an element of intuition and subjectivity, however not equating to epistemic luck and probability.  Significance novelty: the approach to the context in general, not giving up being a Subject may give us a clarity about the sense of what it means to say – “I know”.',
    "Teaching competency in bioethics has been a concern since the field's inception. The first report on the teaching of contemporary bioethics was published in 1976 by The Hastings Center, which concluded that graduate programs were not necessary at the time. However, the report speculated that future developments may require new academic structures for graduate education in bioethics. The creation of a terminal degree in bioethics has its critics, with scholars debating whether bioethics is a discipline with its own methods and theoretical grounding, a multidisciplinary field, or something else entirely. Despite these debates, new bioethics training programs have emerged at all postsecondary levels in the U.S. This essay examines the number and types of programs and degrees in this growing field.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
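
Since the model targets semantic search as well, you can pair it with the library's `util.semantic_search` helper. The corpus and query strings below are illustrative placeholders; substitute your own texts:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("dbourget/pb-ds1-48K-philsim")

# Illustrative corpus and query
corpus = [
    "Russell argued that there must be negative facts.",
    "The Gettier problem challenges the analysis of knowledge as justified true belief.",
]
query = "What makes true negative propositions true?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# For each query, returns the top_k corpus entries ranked by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], hit["score"])
```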

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Semantic Similarity
* Dataset: `sts-dev`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.9378     |
| **spearman_cosine** | **0.8943** |
| pearson_manhattan   | 0.9710     |
| spearman_manhattan  | 0.8969     |
| pearson_euclidean   | 0.9711     |
| spearman_euclidean  | 0.8966     |
| pearson_dot         | 0.942      |
| spearman_dot        | 0.8551     |
| pearson_max         | 0.9711     |
| spearman_max        | 0.8969     |
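
You can run the same evaluator on your own labeled pairs. The dev split used for the scores above is not included in this card, so the pairs and gold scores below are hypothetical stand-ins:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("dbourget/pb-ds1-48K-philsim")

# Hypothetical labeled pairs; replace with your own dev set
sentences1 = [
    "A civilization grows wealthy, and then its wealth causes it to crumble.",
    "Experimentation long occupied no prominent role in history of science.",
]
sentences2 = [
    "Prosperity ultimately leads societies to their own downfall through greed.",
    "This article defends a regime of international capital controls.",
]
gold_scores = [0.9, 0.1]  # similarity labels in [0, 1]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores, name="sts-dev")
results = evaluator(model)  # dict of Pearson/Spearman scores per similarity function
print(results)
```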

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 190
- `per_device_eval_batch_size`: 190
- `learning_rate`: 5e-06
- `num_train_epochs`: 2
- `warmup_ratio`: 0.1
- `bf16`: True
- `batch_sampler`: no_duplicates
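
For reference, these non-default values map onto the Sentence Transformers v3 training API roughly as follows. This is a hedged sketch: the actual training data, column names, and output path are not documented in this card, so the dataset and `output_dir` below are placeholders:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("dbourget/pb-ds1-48K")

# Hypothetical stand-in: the real training set has 106,810 labeled pairs
pairs = {
    "sentence1": ["An abstract about the Gettier problem."],
    "sentence2": ["A paraphrase of that abstract."],
    "score": [0.95],  # column named "score" is treated as the label
}
train_dataset = Dataset.from_dict(pairs)
eval_dataset = Dataset.from_dict(pairs)  # stand-in for the held-out sts-dev split

args = SentenceTransformerTrainingArguments(
    output_dir="pb-ds1-48K-philsim",  # hypothetical output path
    eval_strategy="steps",
    per_device_train_batch_size=190,
    per_device_eval_batch_size=190,
    learning_rate=5e-6,
    num_train_epochs=2,
    warmup_ratio=0.1,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=CosineSimilarityLoss(model),
)
trainer.train()
```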

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 190
- `per_device_eval_batch_size`: 190
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-06
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 2
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
<details><summary>Click to expand</summary>

| Epoch  | Step | Training Loss | Validation Loss | sts-dev_spearman_cosine |
|:------:|:----:|:-------------:|:------:|:-----------------------:|
| 0      | 0    | -             | -      | 0.8229                  |
| 0.0178 | 10   | 0.0545        | -      | -                       |
| 0.0355 | 20   | 0.0556        | -      | -                       |
| 0.0533 | 30   | 0.0502        | -      | -                       |
| 0.0710 | 40   | 0.0497        | -      | -                       |
| 0.0888 | 50   | 0.0413        | -      | -                       |
| 0.1066 | 60   | 0.0334        | -      | -                       |
| 0.1243 | 70   | 0.0238        | -      | -                       |
| 0.1421 | 80   | 0.0206        | -      | -                       |
| 0.1599 | 90   | 0.0167        | -      | -                       |
| 0.1776 | 100  | 0.0146        | 0.0725 | 0.8788                  |
| 0.1954 | 110  | 0.0127        | -      | -                       |
| 0.2131 | 120  | 0.0125        | -      | -                       |
| 0.2309 | 130  | 0.0115        | -      | -                       |
| 0.2487 | 140  | 0.0116        | -      | -                       |
| 0.2664 | 150  | 0.0111        | -      | -                       |
| 0.2842 | 160  | 0.0107        | -      | -                       |
| 0.3020 | 170  | 0.0113        | -      | -                       |
| 0.3197 | 180  | 0.0106        | -      | -                       |
| 0.3375 | 190  | 0.0099        | -      | -                       |
| 0.3552 | 200  | 0.0092        | 0.0207 | 0.8856                  |
| 0.3730 | 210  | 0.0097        | -      | -                       |
| 0.3908 | 220  | 0.0099        | -      | -                       |
| 0.4085 | 230  | 0.0087        | -      | -                       |
| 0.4263 | 240  | 0.0087        | -      | -                       |
| 0.4440 | 250  | 0.0082        | -      | -                       |
| 0.4618 | 260  | 0.0083        | -      | -                       |
| 0.4796 | 270  | 0.0089        | -      | -                       |
| 0.4973 | 280  | 0.0082        | -      | -                       |
| 0.5151 | 290  | 0.0078        | -      | -                       |
| 0.5329 | 300  | 0.0081        | 0.0078 | 0.8891                  |
| 0.5506 | 310  | 0.0081        | -      | -                       |
| 0.5684 | 320  | 0.0072        | -      | -                       |
| 0.5861 | 330  | 0.0084        | -      | -                       |
| 0.6039 | 340  | 0.0083        | -      | -                       |
| 0.6217 | 350  | 0.0078        | -      | -                       |
| 0.6394 | 360  | 0.0077        | -      | -                       |
| 0.6572 | 370  | 0.008         | -      | -                       |
| 0.6750 | 380  | 0.0073        | -      | -                       |
| 0.6927 | 390  | 0.008         | -      | -                       |
| 0.7105 | 400  | 0.0073        | 0.0058 | 0.8890                  |
| 0.7282 | 410  | 0.0075        | -      | -                       |
| 0.7460 | 420  | 0.0077        | -      | -                       |
| 0.7638 | 430  | 0.0074        | -      | -                       |
| 0.7815 | 440  | 0.0073        | -      | -                       |
| 0.7993 | 450  | 0.007         | -      | -                       |
| 0.8171 | 460  | 0.0043        | -      | -                       |
| 0.8348 | 470  | 0.0052        | -      | -                       |
| 0.8526 | 480  | 0.0046        | -      | -                       |
| 0.8703 | 490  | 0.0073        | -      | -                       |
| 0.8881 | 500  | 0.0056        | 0.0069 | 0.8922                  |
| 0.9059 | 510  | 0.0059        | -      | -                       |
| 0.9236 | 520  | 0.0045        | -      | -                       |
| 0.9414 | 530  | 0.0033        | -      | -                       |
| 0.9591 | 540  | 0.0058        | -      | -                       |
| 0.9769 | 550  | 0.0056        | -      | -                       |
| 0.9947 | 560  | 0.0046        | -      | -                       |
| 1.0124 | 570  | 0.003         | -      | -                       |
| 1.0302 | 580  | 0.0039        | -      | -                       |
| 1.0480 | 590  | 0.0032        | -      | -                       |
| 1.0657 | 600  | 0.0031        | 0.0029 | 0.8931                  |
| 1.0835 | 610  | 0.0046        | -      | -                       |
| 1.1012 | 620  | 0.003         | -      | -                       |
| 1.1190 | 630  | 0.0021        | -      | -                       |
| 1.1368 | 640  | 0.0031        | -      | -                       |
| 1.1545 | 650  | 0.0035        | -      | -                       |
| 1.1723 | 660  | 0.0033        | -      | -                       |
| 1.1901 | 670  | 0.0024        | -      | -                       |
| 1.2078 | 680  | 0.0012        | -      | -                       |
| 1.2256 | 690  | 0.0075        | -      | -                       |
| 1.2433 | 700  | 0.0028        | 0.0036 | 0.8945                  |
| 1.2611 | 710  | 0.0033        | -      | -                       |
| 1.2789 | 720  | 0.0023        | -      | -                       |
| 1.2966 | 730  | 0.0034        | -      | -                       |
| 1.3144 | 740  | 0.0018        | -      | -                       |
| 1.3321 | 750  | 0.0016        | -      | -                       |
| 1.3499 | 760  | 0.0025        | -      | -                       |
| 1.3677 | 770  | 0.002         | -      | -                       |
| 1.3854 | 780  | 0.0016        | -      | -                       |
| 1.4032 | 790  | 0.0018        | -      | -                       |
| 1.4210 | 800  | 0.003         | 0.0027 | 0.8944                  |
| 1.4387 | 810  | 0.0018        | -      | -                       |
| 1.4565 | 820  | 0.0008        | -      | -                       |
| 1.4742 | 830  | 0.0014        | -      | -                       |
| 1.4920 | 840  | 0.0025        | -      | -                       |
| 1.5098 | 850  | 0.0026        | -      | -                       |
| 1.5275 | 860  | 0.0012        | -      | -                       |
| 1.5453 | 870  | 0.001         | -      | -                       |
| 1.5631 | 880  | 0.001         | -      | -                       |
| 1.5808 | 890  | 0.0012        | -      | -                       |
| 1.5986 | 900  | 0.0021        | 0.0021 | 0.8952                  |
| 1.6163 | 910  | 0.0016        | -      | -                       |
| 1.6341 | 920  | 0.0008        | -      | -                       |
| 1.6519 | 930  | 0.0008        | -      | -                       |
| 1.6696 | 940  | 0.0009        | -      | -                       |
| 1.6874 | 950  | 0.0004        | -      | -                       |
| 1.7052 | 960  | 0.0003        | -      | -                       |
| 1.7229 | 970  | 0.0007        | -      | -                       |
| 1.7407 | 980  | 0.0007        | -      | -                       |
| 1.7584 | 990  | 0.0011        | -      | -                       |
| 1.7762 | 1000 | 0.0007        | 0.0029 | 0.8952                  |
| 1.7940 | 1010 | 0.0008        | -      | -                       |
| 1.8117 | 1020 | 0.001         | -      | -                       |
| 1.8295 | 1030 | 0.0006        | -      | -                       |
| 1.8472 | 1040 | 0.0006        | -      | -                       |
| 1.8650 | 1050 | 0.0015        | -      | -                       |
| 1.8828 | 1060 | 0.0009        | -      | -                       |
| 1.9005 | 1070 | 0.0005        | -      | -                       |
| 1.9183 | 1080 | 0.0006        | -      | -                       |
| 1.9361 | 1090 | 0.0021        | -      | -                       |
| 1.9538 | 1100 | 0.0009        | 0.0023 | 0.8943                  |
| 1.9716 | 1110 | 0.0007        | -      | -                       |
| 1.9893 | 1120 | 0.0003        | -      | -                       |

</details>

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.42.3
- PyTorch: 2.2.0+cu121
- Accelerate: 0.31.0
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->