sileod committed (verified)
Commit 0a60567 · 1 Parent(s): fbb9fc4

Add new SentenceTransformer model

Files changed (2):
  1. README.md +560 -163
  2. model.safetensors +1 -1
README.md CHANGED
@@ -6,96 +6,121 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:4731012
  - loss:MultipleNegativesRankingLoss
  - loss:CachedMultipleNegativesRankingLoss
  - loss:SoftmaxLoss
  - loss:CosineSimilarityLoss
  base_model: tasksource/ModernBERT-base-nli
  widget:
- - source_sentence: Christa McAuliffe taught social studies at Concord High School.
  sentences:
- - The Football League play-offs for the 1994 -- 95 season were held in May 1995
- , with the finals taking place at Wembley Stadium in London .. Football League
- play-offs. Football League play-offs. 1994 Football League play-offs. Wembley
- Stadium. Wembley Stadium ( 1923 ). London. London. The play-off semi-finals were
- played over two legs and were contested by the teams who finished in 2nd , 3rd
- , 4th and 5th place in the Football League First Division and Football League
- Second Division and the 3rd , 4th , 5th , and 6th placed teams in the Football
- League Third Division table .. Football League First Division. 1994–95 Football
- League First Division. Football League Second Division. 1994–95 Football League
- Second Division. Football League Third Division. 1994–95 Football League Third
- Division. The winners of the semi-finals progressed through to the finals , with
- the winner of these matches gaining promotion for the following season .. following
- season. 1995-96 in English football
- - Sir Alexander Mackenzie Elementary is a public elementary school in Vancouver
- , British Columbia part of School District 39 Vancouver .. Vancouver. Vancouver,
- British Columbia. British Columbia. British Columbia. School District 39 Vancouver.
- School District 39 Vancouver. elementary school. elementary school
- - 'Help Wanted -LRB- Hataraku Hito : Hard Working People in Japan , Job Island :
- Hard Working People in Europe -RRB- is a game that features a collection of various
- , Wii Remote-based minigames .. Wii. Wii. Wii Remote. Wii Remote. The game is
- developed and published by Hudson Soft and was released in Japan for Nintendo
- ''s Wii on November 27 , 2008 , in Europe on March 13 , 2009 , in Australia on
- March 27 , 2009 , and in North America on May 12 , 2009 .. Hudson Soft. Hudson
- Soft. Wii. Wii. Nintendo. Nintendo'
- - source_sentence: The researchers asked children of different ages to use words to
- form semantic correspondence. For example, when children see the words eagle,
- bear and robin, they combine them best according to their meaning. The results
- showed that older participants were more likely to develop different types of
- false memory than younger participants. Because there are many forms of classification
- in their minds. For example, young children classify eagles and robins as birds,
- while older children classify eagles and bears as predators. Compared with children,
- they have a concept of predators in their minds.
- sentences:
- - Extractive Industries Transparency Initiative is an organization
- - Mason heard a pun
- - Older children are more likely to have false memories than younger ones conforms
- to the context.
- - source_sentence: 'Version 0.5 is released today. The biggest change is that this
- version finally has upload progress.
-
- Download it here:
-
- Or go to for more information about this project.
-
- Changelog:
-
- * Refactored the authentication_controller
-
- * Put before_filter :authorize in ApplicationController (and using skip_before_filter
- in other controllers if necessary)

- * Using ''unless'' instead of ''if not''

- * Using find_by() instead of find(:first)

- * Upload progress (yay!)

- Forums |
-
- Admin'
  sentences:
- - This example wikipedia comment contains an insult.
- - 'This text is about: hardware update'
- - The example summary is factually consistent with the full article.
- - source_sentence: 'Make sure to make it to the Brew House in Pella, IA tomorrow @
- 3 to meet with @user supporters! #SemST'
  sentences:
- - This example is ANT.
- - This example is valid question.
- - This example is favor.
- - source_sentence: Also at increased risk are those whose immune systems suppressed
- by medications or by diseases such as cancer, diabetes and AIDS.
  sentences:
- - In 1995, the last survey, those numbers were equal.
- - Also at increased risk are those with suppressed immune systems due to illness
- or medicines.
- - Singapore stocks close 0.54 pct higher
  datasets:
  - tomaarsen/natural-questions-hard-negatives
  - tomaarsen/gooaq-hard-negatives
  - bclavie/msmarco-500k-triplets
  - sentence-transformers/gooaq
  - sentence-transformers/natural-questions
  - tasksource/merged-2l-nli
@@ -112,7 +137,7 @@ library_name: sentence-transformers
 
  # SentenceTransformer based on tasksource/ModernBERT-base-nli
 
- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [tasksource/ModernBERT-base-nli](https://huggingface.co/tasksource/ModernBERT-base-nli) on the [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives), [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives), [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets), [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq), [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli), [merged-3l-nli](https://huggingface.co/datasets/tasksource/merged-3l-nli), [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli), [dataset_train_nli](https://huggingface.co/datasets/MoritzLaurer/dataset_train_nli), [paws/labeled_final](https://huggingface.co/datasets/paws), [glue/mrpc](https://huggingface.co/datasets/glue), [glue/qqp](https://huggingface.co/datasets/glue), [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related), [glue/stsb](https://huggingface.co/datasets/glue), sick/relatedness and [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
  ## Model Details
 
@@ -126,6 +151,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [t
  - [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives)
  - [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives)
  - [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets)
  - [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
  - [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
  - [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli)
@@ -175,9 +201,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("tasksource/ModernBERT-base-embed")
  # Run inference
  sentences = [
- 'Also at increased risk are those whose immune systems suppressed by medications or by diseases such as cancer, diabetes and AIDS.',
- 'Also at increased risk are those with suppressed immune systems due to illness or medicines.',
- 'In 1995, the last survey, those numbers were equal.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -256,7 +282,7 @@ You can finetune this model on your own dataset.
  #### tomaarsen/gooaq-hard-negatives
 
  * Dataset: [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives) at [87594a1](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives/tree/87594a1e6c58e88b5843afa9da3a97ffd75d01c2)
- * Size: 200,000 training samples
  * Columns: <code>question</code>, <code>answer</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, and <code>negative_5</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 |
@@ -280,7 +306,7 @@ You can finetune this model on your own dataset.
  #### bclavie/msmarco-500k-triplets
 
  * Dataset: [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets) at [cb1a85c](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets/tree/cb1a85c1261fa7c65f4ea43f94e50f8b467c372f)
- * Size: 200,000 training samples
  * Columns: <code>query</code>, <code>positive</code>, and <code>negative</code>
  * Approximate statistics based on the first 1000 samples:
  | | query | positive | negative |
@@ -301,10 +327,34 @@ You can finetune this model on your own dataset.
  }
  ```
 
  #### sentence-transformers/gooaq
 
  * Dataset: [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
- * Size: 200,000 training samples
  * Columns: <code>question</code> and <code>answer</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer |
@@ -355,16 +405,16 @@ You can finetune this model on your own dataset.
  * Size: 425,243 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2 | label |
- |:---|:---|:---|:---|
- | type | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 83.27 tokens</li><li>max: 1202 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.7 tokens</li><li>max: 126 tokens</li></ul> | <ul><li>0: ~52.90%</li><li>1: ~47.10%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:---|:---|:---|
- | <code>In 1783 , the Sunni Al-Khalifa family captured Bahrain from the Persians .</code> | <code>The is a geographical/political entity</code> | <code>0</code> |
- | <code>::stage Egg:: Newt eggs are encased in a gel-like substance rather than a hard shell. Adult females release eggs one at a time and store them in clusters ranging from a handful to several dozen in size. Adults often take an active role in defending their eggs after depositing them. Mothers may curl their body around the eggs to provide protection. Some newt species even wrap leaves around each egg individually to camouflage them, according to San Diego Zoo. Newt eggs are small: some measure only a millimeter or two in diameter. Mom usually anchors her eggs to underwater plants and other structures to keep them safe. ::stage Tadpole:: Newts that hatch from submerged eggs usually emerge as aquatic larvae with fishlike tails and gills that allow them to breathe beneath the water's surface. Not all newt species have an aquatic or 'tadpole' phase. This tadpole stage tends to be short, except in fully aquatic species. Eastern newt (Notophthalmus viridescens) larvae spend only a few months as...</code> | <code>Tadpole thing is a newt's terrestrial larval phase known as.</code> | <code>0</code> |
- | <code>Target <br><br>You are now a valid target, you nasty little shit! 86.176.169.49</code> | <code>This example wikipedia comment contains an insult.</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### merged-3l-nli
@@ -376,13 +426,13 @@ You can finetune this model on your own dataset.
  | | sentence1 | sentence2 | label |
  |:---|:---|:---|:---|
  | type | string | string | int |
- | details | <ul><li>min: 5 tokens</li><li>mean: 110.76 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 28.37 tokens</li><li>max: 485 tokens</li></ul> | <ul><li>0: ~36.00%</li><li>1: ~32.70%</li><li>2: ~31.30%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:---|:---|:---|
- | <code>Iceland does not have a high latitude.</code> | <code>Iceland . Iceland is warmed by the Gulf Stream and has a temperate climate , despite a high latitude just outside the Arctic Circle . Its high latitude and marine influence still keeps summers chilly , with most of the archipelago having a tundra climate .</code> | <code>2</code> |
- | <code>The populist, by contrast, panders to his audience, figuring out what it likes and then delivering it in heaps.</code> | <code>Populists hate the audience and antagonizes them; so their support, as a tyrant's, is similarly lacking.</code> | <code>2</code> |
- | <code>The prison sentence of that convict will end after 2 months.</code> | <code>Before 212 days, the prison sentence of that convict will end.</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### zero-shot-label-nli
@@ -394,13 +444,13 @@ You can finetune this model on your own dataset.
  | | label | sentence1 | sentence2 |
  |:---|:---|:---|:---|
  | type | int | string | string |
- | details | <ul><li>0: ~50.20%</li><li>2: ~49.80%</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 66.27 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 8.07 tokens</li><li>max: 17 tokens</li></ul> |
  * Samples:
- | label | sentence1 | sentence2 |
- |:---|:---|:---|
- | <code>0</code> | <code>The LAY MAN! Just to let you know you are missed and thought off. Do have a great day. And if you can send me bimbo and ugo's numbers, ill appreciate. Safe<br></code> | <code>This example is ham.</code> |
- | <code>2</code> | <code>Crisp: oh really!!!!</code> | <code>This example is Automotive.</code> |
- | <code>2</code> | <code>Insurance policies should be simple .</code> | <code>This example is negative.</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### dataset_train_nli
@@ -463,16 +513,16 @@ You can finetune this model on your own dataset.
  * Size: 363,846 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2 | label |
- |:---|:---|:---|:---|
- | type | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 15.63 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.55 tokens</li><li>max: 76 tokens</li></ul> | <ul><li>0: ~62.70%</li><li>1: ~37.30%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:---|:---|:---|
- | <code>How can I stop my laptop from hibernating in windows 10?</code> | <code>How do I shutdown windows 10 instead of hibernating it?</code> | <code>0</code> |
- | <code>Is it worth the cost if ever I fix my gap teeth?</code> | <code>Is it worth it to fix teeth gap?</code> | <code>1</code> |
- | <code>Why is USA the biggest threat to the global economy and Germany is not?</code> | <code>What is the biggest threat to the global economy over the next year (in 2011)?</code> | <code>0</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### fever-evidence-related
@@ -481,16 +531,16 @@ You can finetune this model on your own dataset.
  * Size: 403,218 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2 | label |
- |:---|:---|:---|:---|
- | type | string | string | int |
- | details | <ul><li>min: 7 tokens</li><li>mean: 13.58 tokens</li><li>max: 59 tokens</li></ul> | <ul><li>min: 33 tokens</li><li>mean: 344.03 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>0: ~32.00%</li><li>1: ~68.00%</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:---|:---|:---|
- | <code>The Last of Us Part II had the developer Naughty Dog.</code> | <code>Bishop Asbury Cottage is a 17th-century cottage on Newton Road , Great Barr , England , known for being the boyhood home of Francis Asbury -LRB- 1745 -- 1816 -RRB- , one of the first two bishops of the Methodist Episcopal Church -LRB- now The United Methodist Church -RRB- in the United States .. Cottage. Cottage. Great Barr. Great Barr. England. England. Francis Asbury. Francis Asbury. bishops. Bishop ( Methodism ). Methodist Episcopal Church. Methodist Episcopal Church. The United Methodist Church. The United Methodist Church. It is now a museum in his memory .</code> | <code>1</code> |
- | <code>Boomerang (1992 film) was released on July.</code> | <code>Petr Alekseyevich Bezobrazov -LRB- 29 January 1845 -- 17 July 1906 -RRB- was an admiral in the Imperial Russian Navy .. Imperial Russian Navy. Imperial Russian Navy</code> | <code>1</code> |
- | <code>G-Dragon was the first Korean solo artist to a type of tour.</code> | <code>The Scott Viking 2 was the first British high performance two seat sailplane , flying a few days before the outbreak of World War II .. World War II. World War II. Only one was built ; it was used in radar station trials in the Summer of 1940 .</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### glue/stsb
@@ -499,16 +549,16 @@ You can finetune this model on your own dataset.
  * Size: 5,749 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2 | label |
- |:---|:---|:---|:---|
- | type | string | string | float |
- | details | <ul><li>min: 6 tokens</li><li>mean: 15.0 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.02 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.73</li><li>max: 5.0</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:---|:---|:---|
- | <code>Syria peace plan conditions “unacceptable,” opposition says</code> | <code>Syria peace dashed as deadline passes</code> | <code>2.0</code> |
- | <code>Romney picks Ryan as vice presidential running mate: source</code> | <code>Romney to tap Ryan as vice presidential running mate</code> | <code>5.0</code> |
- | <code>Death toll rises to 6 as Storm Xaver batters northern Europe</code> | <code>Storm death toll rises as wind, rain batters north. Europe</code> | <code>3.200000047683716</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -522,16 +572,16 @@ You can finetune this model on your own dataset.
  * Size: 4,439 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2 | label |
- |:---|:---|:---|:---|
- | type | string | string | float |
- | details | <ul><li>min: 6 tokens</li><li>mean: 12.08 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 11.86 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.5</li><li>max: 5.0</li></ul> |
  * Samples:
- | sentence1 | sentence2 | label |
- |:---|:---|:---|
- | <code>The man is standing on a rocky mountain and gray clouds are in the background</code> | <code>A black topless person is packing a pile of rocks and a front of clouds are in the background</code> | <code>2.9000000953674316</code> |
- | <code>A man is standing on a dirt hill next to a black jeep</code> | <code>A man in a hat is standing outside of a green vehicle</code> | <code>2.5999999046325684</code> |
- | <code>A man is talking on a cell phone</code> | <code>A man is making a phone call</code> | <code>4.300000190734863</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -548,13 +598,13 @@ You can finetune this model on your own dataset.
  | | label | sentence1 | sentence2 |
  |:---|:---|:---|:---|
  | type | float | string | string |
- | details | <ul><li>min: 0.0</li><li>mean: 3.02</li><li>max: 5.0</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 17.69 tokens</li><li>max: 60 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 16.26 tokens</li><li>max: 51 tokens</li></ul> |
  * Samples:
- | label | sentence1 | sentence2 |
- |:---|:---|:---|
- | <code>4.25</code> | <code>It then appointed a task force to formulate the necessary changes in tax and spending policies.</code> | <code>He has appointed a working party to make the necessary changes to the policies of public spending and fiscal policies.</code> |
- | <code>4.25</code> | <code>festive social event, celebration</code> | <code>an occasion on which people can assemble for social interaction and entertainment.</code> |
- | <code>3.6</code> | <code>Who'd have thought an American hero could be a Canadian? NYT: Man Who Sheltered Americans in Tehran, Dies at 88</code> | <code>John Sheardown, Canadian Who Sheltered Americans in Tehran, Dies at 88</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -781,7 +831,7 @@ You can finetune this model on your own dataset.
  #### Non-Default Hyperparameters
 
  - `per_device_train_batch_size`: 24
- - `learning_rate`: 2e-05
  - `weight_decay`: 1e-06
  - `num_train_epochs`: 1
  - `warmup_ratio`: 0.1
@@ -801,7 +851,7 @@ You can finetune this model on your own dataset.
  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
- - `learning_rate`: 2e-05
  - `weight_decay`: 1e-06
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
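
For context, these non-default values correspond one-to-one to fields on the trainer's argument object. A minimal sketch, assuming sentence-transformers v3+; `output_dir` is invented for illustration, and `learning_rate` shows the pre-commit value from the removed line above, since the replacement value is outside this hunk:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Non-default hyperparameters from this card; output_dir is hypothetical,
# and learning_rate reflects the old value visible in the diff.
args = SentenceTransformerTrainingArguments(
    output_dir="output/ModernBERT-base-embed",  # hypothetical path
    per_device_train_batch_size=24,
    learning_rate=2e-5,
    weight_decay=1e-6,
    num_train_epochs=1,
    warmup_ratio=0.1,
)
```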
@@ -909,26 +959,373 @@ You can finetune this model on your own dataset.
  </details>
 
  ### Training Logs
- | Epoch | Step | Training Loss |
- |:---:|:---:|:---:|
- | 0.0067 | 500 | 10.6192 |
- | 0.0134 | 1000 | 1.9196 |
- | 0.0202 | 1500 | 1.0304 |
- | 0.0269 | 2000 | 0.9269 |
- | 0.0336 | 2500 | 0.7738 |
- | 0.0403 | 3000 | 0.7092 |
- | 0.0471 | 3500 | 0.6571 |
- | 0.0538 | 4000 | 0.6408 |
- | 0.0605 | 4500 | 0.6348 |
- | 0.0672 | 5000 | 0.5927 |
- | 0.0739 | 5500 | 0.5848 |
- | 0.0807 | 6000 | 0.5542 |
- | 0.0874 | 6500 | 0.558 |
- | 0.0941 | 7000 | 0.5394 |
- | 0.1008 | 7500 | 0.5632 |
- | 0.1076 | 8000 | 0.5037 |
- | 0.1143 | 8500 | 0.5278 |
 
  ### Framework Versions
  - Python: 3.11.4
 
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:6131012
  - loss:MultipleNegativesRankingLoss
  - loss:CachedMultipleNegativesRankingLoss
  - loss:SoftmaxLoss
  - loss:CosineSimilarityLoss
  base_model: tasksource/ModernBERT-base-nli
  widget:
+ - source_sentence: Daniel went to the kitchen. Sandra went back to the kitchen. Daniel
+ moved to the garden. Sandra grabbed the apple. Sandra went back to the office.
+ Sandra dropped the apple. Sandra went to the garden. Sandra went back to the bedroom.
+ Sandra went back to the office. Mary went back to the office. Daniel moved to
+ the bathroom. Sandra grabbed the apple. Sandra travelled to the garden. Sandra
+ put down the apple there. Mary went back to the bathroom. Daniel travelled to
+ the garden. Mary took the milk. Sandra grabbed the apple. Mary left the milk there.
+ Sandra journeyed to the bedroom. John travelled to the office. John went back
+ to the garden. Sandra journeyed to the garden. Mary grabbed the milk. Mary left
+ the milk. Mary grabbed the milk. Mary went to the hallway. John moved to the hallway.
+ Mary picked up the football. Sandra journeyed to the kitchen. Sandra left the
+ apple. Mary discarded the milk. John journeyed to the garden. Mary dropped the
+ football. Daniel moved to the bathroom. Daniel journeyed to the kitchen. Mary
+ travelled to the bathroom. Daniel went to the bedroom. Mary went to the hallway.
+ Sandra got the apple. Sandra went back to the hallway. Mary moved to the kitchen.
+ Sandra dropped the apple there. Sandra grabbed the milk. Sandra journeyed to the
+ bathroom. John went back to the kitchen. Sandra went to the kitchen. Sandra travelled
+ to the bathroom. Daniel went to the garden. Daniel moved to the kitchen. Sandra
+ dropped the milk. Sandra got the milk. Sandra put down the milk. John journeyed
+ to the garden. Sandra went back to the hallway. Sandra picked up the apple. Sandra
+ got the football. Sandra moved to the garden. Daniel moved to the bathroom. Daniel
+ travelled to the garden. Sandra went back to the bathroom. Sandra discarded the
+ football.
  sentences:
+ - In the adulthood stage, it can jump, walk, run
+ - The chocolate is bigger than the container.
+ - The football before the bathroom was in the garden.
+ - source_sentence: 'Context: I am devasted.

+ Speaker 1: I am very devastated these days.

+ Speaker 2: That seems bad and I am sorry to hear that. What happened?

+ Speaker 1: My father day 3 weeks ago.I still can''t believe.

+ Speaker 2: I am truly sorry to hear that. Please accept my apologies for your
+ loss. May he rest in peace'
+ sentences:
+ - 'The main emotion of this example dialogue is: content'
+ - 'This text is about: genealogy'
+ - The intent of this example is to be offensive/disrespectful.
+ - source_sentence: in three distinguish’d parts, with three distinguish’d guides
  sentences:
+ - This example is paraphrase.
+ - This example is neutral.
+ - This example is negative.
+ - source_sentence: A boy is playing a piano.
  sentences:
+ - Nine killed in Syrian-linked clashes in Lebanon
+ - A man is singing and playing a guitar.
+ - My opinion is to wait until the child itself expresses a desire for this.
+ - source_sentence: Francis I of France was a king.
  sentences:
+ - The Apple QuickTake -LRB- codenamed Venus , Mars , Neptune -RRB- is one of the
+ first consumer digital camera lines .. digital camera. digital camera. It was
+ launched in 1994 by Apple Computer and was marketed for three years before being
+ discontinued in 1997 .. Apple Computer. Apple Computer. Three models of the product
+ were built including the 100 and 150 , both built by Kodak ; and the 200 , built
+ by Fujifilm .. Kodak. Kodak. Fujifilm. Fujifilm. The QuickTake cameras had a resolution
+ of 640 x 480 pixels maximum -LRB- 0.3 Mpx -RRB- .. resolution. Display resolution.
+ The 200 model is only officially compatible with the Apple Macintosh for direct
+ connections , while the 100 and 150 model are compatible with both the Apple Macintosh
+ and Microsoft Windows .. Apple Macintosh. Apple Macintosh. Microsoft Windows.
+ Microsoft Windows. Because the QuickTake 200 is almost identical to the Fuji DS-7
+ or to Samsung 's Kenox SSC-350N , Fuji 's software for that camera can be used
+ to gain Windows compatibility for the QuickTake 200 .. Some other software replacements
+ also exist as well as using an external reader for the removable media of the
+ QuickTake 200 .. Time Magazine profiled QuickTake as `` the first consumer digital
+ camera '' and ranked it among its `` 100 greatest and most influential gadgets
+ from 1923 to the present '' list .. digital camera. digital camera. Time Magazine.
+ Time Magazine. While the QuickTake was probably the first digicam to have wide
+ success , technically this is not true as the greyscale Dycam Model 1 -LRB- also
+ marketed as the Logitech FotoMan -RRB- was the first consumer digital camera to
+ be sold in the US in November 1990 .. digital camera. digital camera. greyscale.
+ greyscale. At least one other camera , the Fuji DS-X , was sold in Japan even
+ earlier , in late 1989 .
+ - The ganglion cell layer -LRB- ganglionic layer -RRB- is a layer of the retina
+ that consists of retinal ganglion cells and displaced amacrine cells .. retina.
+ retina. In the macula lutea , the layer forms several strata .. macula lutea.
+ macula lutea. The cells are somewhat flask-shaped ; the rounded internal surface
+ of each resting on the stratum opticum , and sending off an axon which is prolonged
+ into it .. flask. Laboratory flask. stratum opticum. stratum opticum. axon. axon.
+ From the opposite end numerous dendrites extend into the inner plexiform layer
+ , where they branch and form flattened arborizations at different levels .. inner
+ plexiform layer. inner plexiform layer. arborizations. arborizations. dendrites.
+ dendrites. The ganglion cells vary much in size , and the dendrites of the smaller
+ ones as a rule arborize in the inner plexiform layer as soon as they enter it
+ ; while those of the larger cells ramify close to the inner nuclear layer .. inner
+ plexiform layer. inner plexiform layer. dendrites. dendrites. inner nuclear layer.
+ inner nuclear layer
+ - Coyote was a brand of racing chassis designed and built for the use of A. J. Foyt
+ 's race team in USAC Championship car racing including the Indianapolis 500 ..
+ A. J. Foyt. A. J. Foyt. USAC. United States Auto Club. Championship car. American
+ Championship car racing. Indianapolis 500. Indianapolis 500. It was used from
+ 1966 to 1983 with Foyt himself making 141 starts in the car , winning 25 times
+ .. George Snider had the second most starts with 24 .. George Snider. George Snider.
+ Jim McElreath has the only other win with a Coyote chassis .. Jim McElreath. Jim
+ McElreath. Foyt drove a Coyote to victory in the Indy 500 in 1967 and 1977 ..
+ With Foyt 's permission , fellow Indy 500 champion Eddie Cheever 's Cheever Racing
+ began using the Coyote name for his new Daytona Prototype chassis , derived from
+ the Fabcar chassis design that he had purchased the rights to in 2007 .. Eddie
+ Cheever. Eddie Cheever. Cheever Racing. Cheever Racing. Daytona Prototype. Daytona
+ Prototype
  datasets:
  - tomaarsen/natural-questions-hard-negatives
  - tomaarsen/gooaq-hard-negatives
  - bclavie/msmarco-500k-triplets
+ - sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1
  - sentence-transformers/gooaq
  - sentence-transformers/natural-questions
  - tasksource/merged-2l-nli
 
  # SentenceTransformer based on tasksource/ModernBERT-base-nli
 
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [tasksource/ModernBERT-base-nli](https://huggingface.co/tasksource/ModernBERT-base-nli) on the [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives), [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives), [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets), [sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1), [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq), [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli), [merged-3l-nli](https://huggingface.co/datasets/tasksource/merged-3l-nli), [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli), [dataset_train_nli](https://huggingface.co/datasets/MoritzLaurer/dataset_train_nli), [paws/labeled_final](https://huggingface.co/datasets/paws), [glue/mrpc](https://huggingface.co/datasets/glue), [glue/qqp](https://huggingface.co/datasets/glue), [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related), [glue/stsb](https://huggingface.co/datasets/glue), sick/relatedness and [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
  ## Model Details
 
  - [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives)
  - [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives)
  - [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets)
+ - [sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1)
  - [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
  - [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
  - [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli)
 
  model = SentenceTransformer("tasksource/ModernBERT-base-embed")
  # Run inference
  sentences = [
+ 'Francis I of France was a king.',
+ "Coyote was a brand of racing chassis designed and built for the use of A. J. Foyt 's race team in USAC Championship car racing including the Indianapolis 500 .. A. J. Foyt. A. J. Foyt. USAC. United States Auto Club. Championship car. American Championship car racing. Indianapolis 500. Indianapolis 500. It was used from 1966 to 1983 with Foyt himself making 141 starts in the car , winning 25 times .. George Snider had the second most starts with 24 .. George Snider. George Snider. Jim McElreath has the only other win with a Coyote chassis .. Jim McElreath. Jim McElreath. Foyt drove a Coyote to victory in the Indy 500 in 1967 and 1977 .. With Foyt 's permission , fellow Indy 500 champion Eddie Cheever 's Cheever Racing began using the Coyote name for his new Daytona Prototype chassis , derived from the Fabcar chassis design that he had purchased the rights to in 2007 .. Eddie Cheever. Eddie Cheever. Cheever Racing. Cheever Racing. Daytona Prototype. Daytona Prototype",
+ "The Apple QuickTake -LRB- codenamed Venus , Mars , Neptune -RRB- is one of the first consumer digital camera lines .. digital camera. digital camera. It was launched in 1994 by Apple Computer and was marketed for three years before being discontinued in 1997 .. Apple Computer. Apple Computer. Three models of the product were built including the 100 and 150 , both built by Kodak ; and the 200 , built by Fujifilm .. Kodak. Kodak. Fujifilm. Fujifilm. The QuickTake cameras had a resolution of 640 x 480 pixels maximum -LRB- 0.3 Mpx -RRB- .. resolution. Display resolution. The 200 model is only officially compatible with the Apple Macintosh for direct connections , while the 100 and 150 model are compatible with both the Apple Macintosh and Microsoft Windows .. Apple Macintosh. Apple Macintosh. Microsoft Windows. Microsoft Windows. Because the QuickTake 200 is almost identical to the Fuji DS-7 or to Samsung 's Kenox SSC-350N , Fuji 's software for that camera can be used to gain Windows compatibility for the QuickTake 200 .. Some other software replacements also exist as well as using an external reader for the removable media of the QuickTake 200 .. Time Magazine profiled QuickTake as `` the first consumer digital camera '' and ranked it among its `` 100 greatest and most influential gadgets from 1923 to the present '' list .. digital camera. digital camera. Time Magazine. Time Magazine. While the QuickTake was probably the first digicam to have wide success , technically this is not true as the greyscale Dycam Model 1 -LRB- also marketed as the Logitech FotoMan -RRB- was the first consumer digital camera to be sold in the US in November 1990 .. digital camera. digital camera. greyscale. greyscale. At least one other camera , the Fuji DS-X , was sold in Japan even earlier , in late 1989 .",
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
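
The snippet stops at the raw embeddings; to score the sentences against each other, one possible follow-up uses the library's cosine-similarity helper (a minimal sketch, assuming `sentence_transformers.util` from a recent release):

```python
from sentence_transformers import util

# Pairwise cosine similarities between the embeddings computed above;
# scores[i][j] is the similarity between sentences[i] and sentences[j].
scores = util.cos_sim(embeddings, embeddings)
print(scores)  # 3x3 similarity matrix
```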
 
  #### tomaarsen/gooaq-hard-negatives
 
  * Dataset: [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives) at [87594a1](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives/tree/87594a1e6c58e88b5843afa9da3a97ffd75d01c2)
+ * Size: 500,000 training samples
  * Columns: <code>question</code>, <code>answer</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, and <code>negative_5</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 |
 
  #### bclavie/msmarco-500k-triplets
 
  * Dataset: [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets) at [cb1a85c](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets/tree/cb1a85c1261fa7c65f4ea43f94e50f8b467c372f)
+ * Size: 500,000 training samples
  * Columns: <code>query</code>, <code>positive</code>, and <code>negative</code>
  * Approximate statistics based on the first 1000 samples:
  | | query | positive | negative |
 
  }
  ```
 
+ #### sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1
+
+ * Dataset: [sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1) at [84ed2d3](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1/tree/84ed2d35626f617d890bd493b4d6db69a741e0e2)
+ * Size: 500,000 training samples
+ * Columns: <code>query</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive | negative |
+ |:---|:---|:---|:---|
+ | type | string | string | string |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 9.87 tokens</li><li>max: 16 tokens</li></ul> | <ul><li>min: 44 tokens</li><li>mean: 85.25 tokens</li><li>max: 211 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 81.18 tokens</li><li>max: 227 tokens</li></ul> |
+ * Samples:
+ | query | positive | negative |
+ |:---|:---|:---|
+ | <code>what are the liberal arts?</code> | <code>liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects.</code> | <code>Rather than preparing students for a specific career, liberal arts programs focus on cultural literacy and hone communication and analytical skills. They often cover various disciplines, ranging from the humanities to social sciences. 1 Program Levels in Liberal Arts: Associate degree, Bachelor's degree, Master's degree.</code> |
+ | <code>what are the liberal arts?</code> | <code>liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects.</code> | <code>Artes Liberales: The historical basis for the modern liberal arts, consisting of the trivium (grammar, logic, and rhetoric) and the quadrivium (arithmetic, geometry, astronomy, and music). General Education: That part of a liberal education curriculum that is shared by all students.</code> |
+ | <code>what are the liberal arts?</code> | <code>liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects.</code> | <code>Liberal Arts. Upon completion of the Liberal Arts degree, students will be able to express ideas in coherent, creative, and appropriate forms, orally and in writing. Students will be able to apply their reading abilities in order to interconnect an understanding of resources to academic, professional, and personal interests.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+     "scale": 20.0,
+     "similarity_fct": "cos_sim"
+ }
+ ```
+
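As a reading aid, a minimal sketch of what `MultipleNegativesRankingLoss` with `scale: 20.0` and `similarity_fct: cos_sim` computes; this is an illustrative reimplementation under those assumptions, not the library's code (`q_emb`/`p_emb` are placeholder names):

```python
import torch
import torch.nn.functional as F

def mnrl_sketch(q_emb: torch.Tensor, p_emb: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
    """Cross-entropy over scaled cosine similarities: query i's positive is
    passage i (the diagonal); every other passage in the batch acts as an
    in-batch negative. Extra hard negatives can be appended to p_emb."""
    q = F.normalize(q_emb, dim=-1)
    p = F.normalize(p_emb, dim=-1)
    scores = q @ p.T * scale
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(scores, labels)

# Toy usage with random 768-d embeddings for a batch of 8 (query, positive) pairs.
loss = mnrl_sketch(torch.randn(8, 768), torch.randn(8, 768))
```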
  #### sentence-transformers/gooaq
 
  * Dataset: [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+ * Size: 500,000 training samples
  * Columns: <code>question</code> and <code>answer</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer |
 
  * Size: 425,243 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:---|:---|:---|:---|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 72.83 tokens</li><li>max: 1219 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.78 tokens</li><li>max: 118 tokens</li></ul> | <ul><li>0: ~55.50%</li><li>1: ~44.50%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:---|:---|:---|
+ | <code>What type of food was cheese considered to be in Rome?</code> | <code>The staple foods were generally consumed around 11 o'clock, and consisted of bread, lettuce, cheese, fruits, nuts, and cold meat left over from the dinner the night before.[citation needed]</code> | <code>1</code> |
+ | <code>No Weapons of Mass Destruction Found in Iraq Yet.</code> | <code>Weapons of Mass Destruction Found in Iraq.</code> | <code>0</code> |
+ | <code>I stuck a pin through a carrot. When I pulled the pin out, it had a hole.</code> | <code>The carrot had a hole.</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
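
Several of the NLI-style sections above and below train with `SoftmaxLoss`. A minimal sketch of the idea, assuming the library's default feature combination of the two sentence embeddings u, v and |u - v|; an illustrative reimplementation, not the library's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftmaxLossSketch(nn.Module):
    """Classify the pair representation (u, v, |u - v|) into the dataset's labels."""

    def __init__(self, dim: int = 768, num_labels: int = 2):
        super().__init__()
        self.classifier = nn.Linear(3 * dim, num_labels)

    def forward(self, u: torch.Tensor, v: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        features = torch.cat([u, v, (u - v).abs()], dim=-1)
        return F.cross_entropy(self.classifier(features), labels)

# Toy usage: a batch of 4 sentence pairs with binary labels.
head = SoftmaxLossSketch()
loss = head(torch.randn(4, 768), torch.randn(4, 768), torch.tensor([0, 1, 0, 1]))
```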
 
  #### merged-3l-nli
 
  | | sentence1 | sentence2 | label |
  |:---|:---|:---|:---|
  | type | string | string | int |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 114.98 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 28.37 tokens</li><li>max: 570 tokens</li></ul> | <ul><li>0: ~36.00%</li><li>1: ~31.50%</li><li>2: ~32.50%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:---|:---|:---|
+ | <code>Over the nave, the two hollow pyramids appear to be designed in the style of chimneys for a castle kitchen.</code> | <code>There are seven pyramids there.</code> | <code>2</code> |
+ | <code>The Catch of the Season is an Edwardian musical comedy by Seymour Hicks and Cosmo Hamilton, with music by Herbert Haines and Evelyn Baker and lyrics by Charles H. Taylor, based on the fairy tale Cinderella. A debutante is engaged to a young aristocrat but loves a page.</code> | <code>Seymour Hicks was alive in 1975.</code> | <code>1</code> |
+ | <code>A 3600 g infant is heavy. A 2400 g infant is light.</code> | <code>A 2220 g bicycle is light.</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### zero-shot-label-nli
 
  | | label | sentence1 | sentence2 |
  |:---|:---|:---|:---|
  | type | int | string | string |
+ | details | <ul><li>0: ~49.30%</li><li>2: ~50.70%</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 77.36 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 8.08 tokens</li><li>max: 17 tokens</li></ul> |
  * Samples:
+ | label | sentence1 | sentence2 |
+ |:---|:---|:---|
+ | <code>2</code> | <code>okay</code> | <code>This example is reply_y.</code> |
+ | <code>2</code> | <code>We retrospectively compared 2 methods that have been proposed to screen for IA [1, 2].</code> | <code>This example is background.</code> |
+ | <code>2</code> | <code>PersonX puts it under PersonX's pillow PersonX then checks it again<br>Person X suffers from obsessive compulsive disorder.</code> | <code>This example is weakener.</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### dataset_train_nli
 
  * Size: 363,846 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:---|:---|:---|:---|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 15.9 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.73 tokens</li><li>max: 72 tokens</li></ul> | <ul><li>0: ~61.90%</li><li>1: ~38.10%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:---|:---|:---|
+ | <code>What are reviews of Big Data University?</code> | <code>What is your review of Big Data University?</code> | <code>1</code> |
+ | <code>What are glass bottles made of?</code> | <code>How is a glass bottle made?</code> | <code>0</code> |
+ | <code>What do you really know about Algeria?</code> | <code>What do you know about Algeria?</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### fever-evidence-related
 
  * Size: 403,218 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 13.63 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 350.02 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>0: ~31.80%</li><li>1: ~68.20%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:--------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
+ | <code>The Bridges of Madison County is a TV series.</code> | <code>Saulsbury is a town in Hardeman County , Tennessee .. Hardeman County. Hardeman County, Tennessee. Tennessee. Tennessee. County. List of counties in Tennessee. Hardeman. Hardeman County, Tennessee. The population was 99 at the 2000 census and 81 at the 2010 census showing a decrease of 18 .. United States Census, 2010. It is located along State Highway 57 in southwest Hardeman County .. Hardeman County. Hardeman County, Tennessee. State. Political divisions of the United States. County. List of counties in Tennessee. Hardeman. Hardeman County, Tennessee. State Highway 57. State Highway 57</code> | <code>1</code> |
+ | <code>Jessica Lange's first film role was in Godzilla.</code> | <code>Haji Ahmadov -LRB- Hacı Əhmədov , born on 23 November 1993 in Baku , Soviet Union -RRB- is an Azerbaijani football defender who plays for AZAL .. Baku. Baku. AZAL. AZAL PFK. Soviet Union. Soviet Union. Azerbaijani. Azerbaijani people. football. football ( soccer ). defender. Defender ( football )</code> | <code>1</code> |
+ | <code>Brad Pitt directed 12 Years a Slave.</code> | <code>The Bronze Bauhinia Star -LRB- , BBS -RRB- is the lowest rank in Order of the Bauhinia Star in Hong Kong , created in 1997 to replace the British honours system of the Order of the British Empire after the transfer of sovereignty to People 's Republic of China and the establishment of the Hong Kong Special Administrative Region -LRB- HKSAR -RRB- .. Order of the Bauhinia Star. Order of the Bauhinia Star. British honours system. British honours system. Order of the British Empire. Order of the British Empire. Special Administrative Region. Special Administrative Region of the People's Republic of China. It is awarded to persons who have given outstanding service over a long period of time , but in a more limited field or way than that required for the Silver Bauhinia Star .. Silver Bauhinia Star. Silver Bauhinia Star</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
 
  #### glue/stsb

  * Size: 5,749 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 15.22 tokens</li><li>max: 74 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.04 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.74</li><li>max: 5.0</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------|
+ | <code>Snowden Hits Hurdles in Search for Asylum</code> | <code>Snowden's hits hurdles in search for asylum</code> | <code>5.0</code> |
+ | <code>Ukrainian protesters back in streets for anti-government rally</code> | <code>Ukraine protesters topple Lenin statue in Kiev</code> | <code>2.5999999046325684</code> |
+ | <code>"Biotech products, if anything, may be safer than conventional products because of all the testing," Fraley said, adding that 18 countries have adopted biotechnology.</code> | <code>"Biotech products, if anything, may be safer than conventional products because of all the testing," said Robert Fraley, Monsanto's executive vice president.</code> | <code>3.200000047683716</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
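
A quick sketch of how a float-labeled dataset like this is typically paired with `CosineSimilarityLoss` (standard sentence-transformers / datasets APIs assumed; this is not necessarily the exact preprocessing behind this card, and the 0-5 STS labels are conventionally rescaled to [0, 1] before regression):

```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, losses

# Sketch only: standard APIs, not necessarily this card's exact pipeline.
model = SentenceTransformer("tasksource/ModernBERT-base-nli")
stsb = load_dataset("glue", "stsb", split="train")

# CosineSimilarityLoss regresses cosine(u, v) onto the gold score, so the
# 0-5 STS labels are conventionally mapped into [0, 1] first.
stsb = stsb.map(lambda row: {"label": row["label"] / 5.0})

loss = losses.CosineSimilarityLoss(model)
```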
 
  * Size: 4,439 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 12.17 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.06 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.53</li><li>max: 5.0</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:-----------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------|:-------------------------------|
+ | <code>The dark skinned male is standing on one hand in front of a yellow building</code> | <code>The dark skinned male is not standing on one hand in front of a yellow building</code> | <code>4.0</code> |
+ | <code>A man is singing and playing a guitar</code> | <code>A boy is skillfully playing a piano</code> | <code>2.299999952316284</code> |
+ | <code>A picture is being drawn by a man</code> | <code>The person is drawing</code> | <code>4.099999904632568</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
 
  | | label | sentence1 | sentence2 |
  |:--------|:---------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type | float | string | string |
+ | details | <ul><li>min: 0.0</li><li>mean: 3.15</li><li>max: 5.0</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.78 tokens</li><li>max: 79 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 16.68 tokens</li><li>max: 71 tokens</li></ul> |
  * Samples:
+ | label | sentence1 | sentence2 |
+ |:-----------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>4.6</code> | <code>As a matter of urgency, therefore, the staff complement of the Interdepartmental Group attached to the Commission Secretariat should be strengthened at the earliest possible opportunity in order to ensure that all proposals for acts which are general in scope are accompanied, when considered by the College of Commissioners and on the basis of Article 299(2), by a simplified sheet outlining their potential impact.</code> | <code>Thus, it is urgent that the inter-service group staff should be strengthened very quickly at the heart of the General Secretariat of the Commission, so that all proposals to act of general scope can be accompanied, during their examination by the college on the basis of Article 299(2), a detailed impact statement.</code> |
+ | <code>4.0</code> | <code>Reiterating the calls made by the European Parliament in its resolution of 16 March 2000, what initiatives does the Presidency of the European Council propose to take with a view to playing a more active role so as to guarantee the full and complete application of the UN peace plan?</code> | <code>As requested by the European Parliament in its resolution of 16 March 2000, that these initiatives the presidency of the European Council is going to take to play a more active role in order to ensure the full implementation of the UN peace plan?</code> |
+ | <code>3.2</code> | <code>Let us, as a Europe of 15 Member States, organise ourselves in order to be able to welcome those countries who are knocking at the door into the fold under respectable conditions.</code> | <code>Let us organise itself to 15 in order to be able to welcome the right conditions for countries which are knocking on our door.</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
 
  #### Non-Default Hyperparameters

  - `per_device_train_batch_size`: 24
+ - `learning_rate`: 3.5e-05
  - `weight_decay`: 1e-06
  - `num_train_epochs`: 1
  - `warmup_ratio`: 0.1

  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
+ - `learning_rate`: 3.5e-05
  - `weight_decay`: 1e-06
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
 
  </details>
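
For reference, a rough sketch of how the non-default hyperparameters listed above map onto `SentenceTransformerTrainingArguments` (every other argument keeps its default; the output directory is illustrative):

```python
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

# Rough sketch of the non-default hyperparameters listed above; all other
# arguments keep their defaults and the output_dir is illustrative.
args = SentenceTransformerTrainingArguments(
    output_dir="output",  # illustrative path
    per_device_train_batch_size=24,
    learning_rate=3.5e-5,
    weight_decay=1e-6,
    num_train_epochs=1,
    warmup_ratio=0.1,
)
```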

  ### Training Logs
+ <details><summary>Click to expand</summary>

+ | Epoch | Step | Training Loss |
+ |:------:|:------:|:-------------:|
+ | 0.0028 | 500 | 5.7412 |
+ | 0.0055 | 1000 | 2.2293 |
+ | 0.0083 | 1500 | 1.1572 |
+ | 0.0111 | 2000 | 0.9386 |
+ | 0.0138 | 2500 | 0.8352 |
+ | 0.0166 | 3000 | 0.7291 |
+ | 0.0194 | 3500 | 0.6555 |
+ | 0.0222 | 4000 | 0.6488 |
+ | 0.0249 | 4500 | 0.6267 |
+ | 0.0277 | 5000 | 0.5527 |
+ | 0.0305 | 5500 | 0.5985 |
+ | 0.0332 | 6000 | 0.5574 |
+ | 0.0360 | 6500 | 0.5642 |
+ | 0.0388 | 7000 | 0.5821 |
+ | 0.0415 | 7500 | 0.5289 |
+ | 0.0443 | 8000 | 0.5374 |
+ | 0.0471 | 8500 | 0.5187 |
+ | 0.0499 | 9000 | 0.5278 |
+ | 0.0526 | 9500 | 0.4983 |
+ | 0.0554 | 10000 | 0.4758 |
+ | 0.0582 | 10500 | 0.4939 |
+ | 0.0609 | 11000 | 0.4944 |
+ | 0.0637 | 11500 | 0.4967 |
+ | 0.0665 | 12000 | 0.4543 |
+ | 0.0692 | 12500 | 0.4649 |
+ | 0.0720 | 13000 | 0.4612 |
+ | 0.0748 | 13500 | 0.4612 |
+ | 0.0776 | 14000 | 0.4684 |
+ | 0.0803 | 14500 | 0.4904 |
+ | 0.0831 | 15000 | 0.4538 |
+ | 0.0859 | 15500 | 0.4388 |
+ | 0.0886 | 16000 | 0.4584 |
+ | 0.0914 | 16500 | 0.4728 |
+ | 0.0942 | 17000 | 0.4236 |
+ | 0.0969 | 17500 | 0.4328 |
+ | 0.0997 | 18000 | 0.4624 |
+ | 0.1025 | 18500 | 0.4732 |
+ | 0.1053 | 19000 | 0.4375 |
+ | 0.1080 | 19500 | 0.4495 |
+ | 0.1108 | 20000 | 0.4296 |
+ | 0.1136 | 20500 | 0.4211 |
+ | 0.1163 | 21000 | 0.4399 |
+ | 0.1191 | 21500 | 0.4353 |
+ | 0.1219 | 22000 | 0.4407 |
+ | 0.1246 | 22500 | 0.3892 |
+ | 0.1274 | 23000 | 0.4121 |
+ | 0.1302 | 23500 | 0.4253 |
+ | 0.1330 | 24000 | 0.4066 |
+ | 0.1357 | 24500 | 0.4168 |
+ | 0.1385 | 25000 | 0.3921 |
+ | 0.1413 | 25500 | 0.4008 |
+ | 0.1440 | 26000 | 0.4164 |
+ | 0.1468 | 26500 | 0.4047 |
+ | 0.1496 | 27000 | 0.4031 |
+ | 0.1523 | 27500 | 0.3955 |
+ | 0.1551 | 28000 | 0.3809 |
+ | 0.1579 | 28500 | 0.3992 |
+ | 0.1606 | 29000 | 0.3686 |
+ | 0.1634 | 29500 | 0.3851 |
+ | 0.1662 | 30000 | 0.3776 |
+ | 0.1690 | 30500 | 0.3919 |
+ | 0.1717 | 31000 | 0.4026 |
+ | 0.1745 | 31500 | 0.38 |
+ | 0.1773 | 32000 | 0.41 |
+ | 0.1800 | 32500 | 0.3731 |
+ | 0.1828 | 33000 | 0.3831 |
+ | 0.1856 | 33500 | 0.3727 |
+ | 0.1883 | 34000 | 0.3664 |
+ | 0.1911 | 34500 | 0.3882 |
+ | 0.1939 | 35000 | 0.3873 |
+ | 0.1967 | 35500 | 0.3529 |
+ | 0.1994 | 36000 | 0.3923 |
+ | 0.2022 | 36500 | 0.4051 |
+ | 0.2050 | 37000 | 0.4134 |
+ | 0.2077 | 37500 | 0.3478 |
+ | 0.2105 | 38000 | 0.3602 |
+ | 0.2133 | 38500 | 0.3547 |
+ | 0.2160 | 39000 | 0.3748 |
+ | 0.2188 | 39500 | 0.3537 |
+ | 0.2216 | 40000 | 0.38 |
+ | 0.2244 | 40500 | 0.3731 |
+ | 0.2271 | 41000 | 0.3537 |
+ | 0.2299 | 41500 | 0.3576 |
+ | 0.2327 | 42000 | 0.3626 |
+ | 0.2354 | 42500 | 0.3587 |
+ | 0.2382 | 43000 | 0.3488 |
+ | 0.2410 | 43500 | 0.3694 |
+ | 0.2437 | 44000 | 0.3508 |
+ | 0.2465 | 44500 | 0.3634 |
+ | 0.2493 | 45000 | 0.3608 |
+ | 0.2521 | 45500 | 0.4007 |
+ | 0.2548 | 46000 | 0.3559 |
+ | 0.2576 | 46500 | 0.3317 |
+ | 0.2604 | 47000 | 0.3518 |
+ | 0.2631 | 47500 | 0.3578 |
+ | 0.2659 | 48000 | 0.3375 |
+ | 0.2687 | 48500 | 0.3229 |
+ | 0.2714 | 49000 | 0.3319 |
+ | 0.2742 | 49500 | 0.3656 |
+ | 0.2770 | 50000 | 0.3598 |
+ | 0.2798 | 50500 | 0.3705 |
+ | 0.2825 | 51000 | 0.3431 |
+ | 0.2853 | 51500 | 0.3587 |
+ | 0.2881 | 52000 | 0.3361 |
+ | 0.2908 | 52500 | 0.3734 |
+ | 0.2936 | 53000 | 0.3361 |
+ | 0.2964 | 53500 | 0.3322 |
+ | 0.2991 | 54000 | 0.347 |
+ | 0.3019 | 54500 | 0.3617 |
+ | 0.3047 | 55000 | 0.3318 |
+ | 0.3074 | 55500 | 0.3401 |
+ | 0.3102 | 56000 | 0.328 |
+ | 0.3130 | 56500 | 0.3553 |
+ | 0.3158 | 57000 | 0.3669 |
+ | 0.3185 | 57500 | 0.4088 |
+ | 0.3213 | 58000 | 0.3636 |
+ | 0.3241 | 58500 | 0.3372 |
+ | 0.3268 | 59000 | 0.3494 |
+ | 0.3296 | 59500 | 0.3504 |
+ | 0.3324 | 60000 | 0.3389 |
+ | 0.3351 | 60500 | 0.3219 |
+ | 0.3379 | 61000 | 0.3283 |
+ | 0.3407 | 61500 | 0.3202 |
+ | 0.3435 | 62000 | 0.3185 |
+ | 0.3462 | 62500 | 0.3449 |
+ | 0.3490 | 63000 | 0.3527 |
+ | 0.3518 | 63500 | 0.3349 |
+ | 0.3545 | 64000 | 0.3225 |
+ | 0.3573 | 64500 | 0.3269 |
+ | 0.3601 | 65000 | 0.3074 |
+ | 0.3628 | 65500 | 0.3513 |
+ | 0.3656 | 66000 | 0.3166 |
+ | 0.3684 | 66500 | 0.3472 |
+ | 0.3712 | 67000 | 0.3395 |
+ | 0.3739 | 67500 | 0.3437 |
+ | 0.3767 | 68000 | 0.3491 |
+ | 0.3795 | 68500 | 0.3181 |
+ | 0.3822 | 69000 | 0.3324 |
+ | 0.3850 | 69500 | 0.3335 |
+ | 0.3878 | 70000 | 0.3401 |
+ | 0.3905 | 70500 | 0.3433 |
+ | 0.3933 | 71000 | 0.3229 |
+ | 0.3961 | 71500 | 0.3264 |
+ | 0.3989 | 72000 | 0.3123 |
+ | 0.4016 | 72500 | 0.3207 |
+ | 0.4044 | 73000 | 0.3008 |
+ | 0.4072 | 73500 | 0.2998 |
+ | 0.4099 | 74000 | 0.2992 |
+ | 0.4127 | 74500 | 0.3134 |
+ | 0.4155 | 75000 | 0.3262 |
+ | 0.4182 | 75500 | 0.2988 |
+ | 0.4210 | 76000 | 0.2936 |
+ | 0.4238 | 76500 | 0.314 |
+ | 0.4266 | 77000 | 0.3083 |
+ | 0.4293 | 77500 | 0.3103 |
+ | 0.4321 | 78000 | 0.3303 |
+ | 0.4349 | 78500 | 0.3282 |
+ | 0.4376 | 79000 | 0.3415 |
+ | 0.4404 | 79500 | 0.3001 |
+ | 0.4432 | 80000 | 0.321 |
+ | 0.4459 | 80500 | 0.3219 |
+ | 0.4487 | 81000 | 0.3477 |
+ | 0.4515 | 81500 | 0.2871 |
+ | 0.4542 | 82000 | 0.2913 |
+ | 0.4570 | 82500 | 0.3121 |
+ | 0.4598 | 83000 | 0.3057 |
+ | 0.4626 | 83500 | 0.32 |
+ | 0.4653 | 84000 | 0.3086 |
+ | 0.4681 | 84500 | 0.3091 |
+ | 0.4709 | 85000 | 0.3243 |
+ | 0.4736 | 85500 | 0.3104 |
+ | 0.4764 | 86000 | 0.3124 |
+ | 0.4792 | 86500 | 0.3134 |
+ | 0.4819 | 87000 | 0.2967 |
+ | 0.4847 | 87500 | 0.3036 |
+ | 0.4875 | 88000 | 0.3079 |
+ | 0.4903 | 88500 | 0.2959 |
+ | 0.4930 | 89000 | 0.3332 |
+ | 0.4958 | 89500 | 0.3151 |
+ | 0.4986 | 90000 | 0.3233 |
+ | 0.5013 | 90500 | 0.3083 |
+ | 0.5041 | 91000 | 0.2913 |
+ | 0.5069 | 91500 | 0.31 |
+ | 0.5096 | 92000 | 0.2962 |
+ | 0.5124 | 92500 | 0.3254 |
+ | 0.5152 | 93000 | 0.312 |
+ | 0.5180 | 93500 | 0.3152 |
+ | 0.5207 | 94000 | 0.3208 |
+ | 0.5235 | 94500 | 0.3039 |
+ | 0.5263 | 95000 | 0.3187 |
+ | 0.5290 | 95500 | 0.3052 |
+ | 0.5318 | 96000 | 0.3114 |
+ | 0.5346 | 96500 | 0.315 |
+ | 0.5373 | 97000 | 0.2862 |
+ | 0.5401 | 97500 | 0.3104 |
+ | 0.5429 | 98000 | 0.3 |
+ | 0.5457 | 98500 | 0.3017 |
+ | 0.5484 | 99000 | 0.3189 |
+ | 0.5512 | 99500 | 0.2919 |
+ | 0.5540 | 100000 | 0.2913 |
+ | 0.5567 | 100500 | 0.2936 |
+ | 0.5595 | 101000 | 0.3044 |
+ | 0.5623 | 101500 | 0.3034 |
+ | 0.5650 | 102000 | 0.2999 |
+ | 0.5678 | 102500 | 0.2961 |
+ | 0.5706 | 103000 | 0.328 |
+ | 0.5734 | 103500 | 0.3061 |
+ | 0.5761 | 104000 | 0.295 |
+ | 0.5789 | 104500 | 0.2997 |
+ | 0.5817 | 105000 | 0.2981 |
+ | 0.5844 | 105500 | 0.2966 |
+ | 0.5872 | 106000 | 0.2798 |
+ | 0.5900 | 106500 | 0.3001 |
+ | 0.5927 | 107000 | 0.3018 |
+ | 0.5955 | 107500 | 0.3076 |
+ | 0.5983 | 108000 | 0.3093 |
+ | 0.6010 | 108500 | 0.3096 |
+ | 0.6038 | 109000 | 0.2914 |
+ | 0.6066 | 109500 | 0.2874 |
+ | 0.6094 | 110000 | 0.2777 |
+ | 0.6121 | 110500 | 0.2854 |
+ | 0.6149 | 111000 | 0.3279 |
+ | 0.6177 | 111500 | 0.2843 |
+ | 0.6204 | 112000 | 0.2956 |
+ | 0.6232 | 112500 | 0.3076 |
+ | 0.6260 | 113000 | 0.314 |
+ | 0.6287 | 113500 | 0.295 |
+ | 0.6315 | 114000 | 0.2914 |
+ | 0.6343 | 114500 | 0.3041 |
+ | 0.6371 | 115000 | 0.2871 |
+ | 0.6398 | 115500 | 0.3004 |
+ | 0.6426 | 116000 | 0.2954 |
+ | 0.6454 | 116500 | 0.2959 |
+ | 0.6481 | 117000 | 0.3214 |
+ | 0.6509 | 117500 | 0.2828 |
+ | 0.6537 | 118000 | 0.3005 |
+ | 0.6564 | 118500 | 0.2918 |
+ | 0.6592 | 119000 | 0.2988 |
+ | 0.6620 | 119500 | 0.2901 |
+ | 0.6648 | 120000 | 0.2796 |
+ | 0.6675 | 120500 | 0.2988 |
+ | 0.6703 | 121000 | 0.2969 |
+ | 0.6731 | 121500 | 0.2892 |
+ | 0.6758 | 122000 | 0.2812 |
+ | 0.6786 | 122500 | 0.2992 |
+ | 0.6814 | 123000 | 0.2691 |
+ | 0.6841 | 123500 | 0.2966 |
+ | 0.6869 | 124000 | 0.2906 |
+ | 0.6897 | 124500 | 0.2807 |
+ | 0.6925 | 125000 | 0.2684 |
+ | 0.6952 | 125500 | 0.2771 |
+ | 0.6980 | 126000 | 0.2992 |
+ | 0.7008 | 126500 | 0.274 |
+ | 0.7035 | 127000 | 0.2846 |
+ | 0.7063 | 127500 | 0.2898 |
+ | 0.7091 | 128000 | 0.2795 |
+ | 0.7118 | 128500 | 0.2758 |
+ | 0.7146 | 129000 | 0.2883 |
+ | 0.7174 | 129500 | 0.2968 |
+ | 0.7201 | 130000 | 0.2756 |
+ | 0.7229 | 130500 | 0.3116 |
+ | 0.7257 | 131000 | 0.2923 |
+ | 0.7285 | 131500 | 0.2758 |
+ | 0.7312 | 132000 | 0.262 |
+ | 0.7340 | 132500 | 0.283 |
+ | 0.7368 | 133000 | 0.2937 |
+ | 0.7395 | 133500 | 0.2891 |
+ | 0.7423 | 134000 | 0.2743 |
+ | 0.7451 | 134500 | 0.3087 |
+ | 0.7478 | 135000 | 0.2855 |
+ | 0.7506 | 135500 | 0.2902 |
+ | 0.7534 | 136000 | 0.278 |
+ | 0.7562 | 136500 | 0.2607 |
+ | 0.7589 | 137000 | 0.2634 |
+ | 0.7617 | 137500 | 0.2807 |
+ | 0.7645 | 138000 | 0.294 |
+ | 0.7672 | 138500 | 0.2837 |
+ | 0.7700 | 139000 | 0.2521 |
+ | 0.7728 | 139500 | 0.2751 |
+ | 0.7755 | 140000 | 0.3012 |
+ | 0.7783 | 140500 | 0.2816 |
+ | 0.7811 | 141000 | 0.2756 |
+ | 0.7839 | 141500 | 0.2661 |
+ | 0.7866 | 142000 | 0.2585 |
+ | 0.7894 | 142500 | 0.2718 |
+ | 0.7922 | 143000 | 0.2724 |
+ | 0.7949 | 143500 | 0.2804 |
+ | 0.7977 | 144000 | 0.2582 |
+ | 0.8005 | 144500 | 0.2636 |
+ | 0.8032 | 145000 | 0.2536 |
+ | 0.8060 | 145500 | 0.2862 |
+ | 0.8088 | 146000 | 0.2842 |
+ | 0.8116 | 146500 | 0.2702 |
+ | 0.8143 | 147000 | 0.2727 |
+ | 0.8171 | 147500 | 0.2591 |
+ | 0.8199 | 148000 | 0.2709 |
+ | 0.8226 | 148500 | 0.2879 |
+ | 0.8254 | 149000 | 0.2669 |
+ | 0.8282 | 149500 | 0.2748 |
+ | 0.8309 | 150000 | 0.2689 |
+ | 0.8337 | 150500 | 0.2414 |
+ | 0.8365 | 151000 | 0.261 |
+ | 0.8393 | 151500 | 0.2967 |
+ | 0.8420 | 152000 | 0.2757 |
+ | 0.8448 | 152500 | 0.2667 |
+ | 0.8476 | 153000 | 0.252 |
+ | 0.8503 | 153500 | 0.2659 |
+ | 0.8531 | 154000 | 0.2799 |
+ | 0.8559 | 154500 | 0.2653 |
+ | 0.8586 | 155000 | 0.275 |
+ | 0.8614 | 155500 | 0.3067 |
+ | 0.8642 | 156000 | 0.2742 |
+ | 0.8669 | 156500 | 0.2616 |
+ | 0.8697 | 157000 | 0.2793 |
+ | 0.8725 | 157500 | 0.2721 |
+ | 0.8753 | 158000 | 0.2623 |
+ | 0.8780 | 158500 | 0.2801 |
+ | 0.8808 | 159000 | 0.2499 |
+ | 0.8836 | 159500 | 0.283 |
+ | 0.8863 | 160000 | 0.2641 |
+ | 0.8891 | 160500 | 0.2642 |
+ | 0.8919 | 161000 | 0.271 |
+ | 0.8946 | 161500 | 0.2624 |
+ | 0.8974 | 162000 | 0.2721 |
+ | 0.9002 | 162500 | 0.2698 |
+ | 0.9030 | 163000 | 0.2519 |
+ | 0.9057 | 163500 | 0.2771 |
+ | 0.9085 | 164000 | 0.2719 |
+ | 0.9113 | 164500 | 0.2747 |
+ | 0.9140 | 165000 | 0.28 |
+ | 0.9168 | 165500 | 0.2618 |
+ | 0.9196 | 166000 | 0.2755 |
+ | 0.9223 | 166500 | 0.3104 |
+ | 0.9251 | 167000 | 0.2671 |
+ | 0.9279 | 167500 | 0.2491 |
+ | 0.9307 | 168000 | 0.262 |
+ | 0.9334 | 168500 | 0.2514 |
+ | 0.9362 | 169000 | 0.2632 |
+ | 0.9390 | 169500 | 0.2834 |
+ | 0.9417 | 170000 | 0.2573 |
+ | 0.9445 | 170500 | 0.2662 |
+ | 0.9473 | 171000 | 0.2631 |
+ | 0.9500 | 171500 | 0.2507 |
+ | 0.9528 | 172000 | 0.2739 |
+ | 0.9556 | 172500 | 0.2567 |
+ | 0.9584 | 173000 | 0.2489 |
+ | 0.9611 | 173500 | 0.2607 |
+ | 0.9639 | 174000 | 0.2627 |
+ | 0.9667 | 174500 | 0.2715 |
+ | 0.9694 | 175000 | 0.2603 |
+ | 0.9722 | 175500 | 0.2533 |
+ | 0.9750 | 176000 | 0.261 |
+ | 0.9777 | 176500 | 0.2485 |
+ | 0.9805 | 177000 | 0.2719 |
+ | 0.9833 | 177500 | 0.2693 |
+ | 0.9861 | 178000 | 0.2825 |
+ | 0.9888 | 178500 | 0.2697 |
+ | 0.9916 | 179000 | 0.2601 |
+ | 0.9944 | 179500 | 0.2459 |
+ | 0.9971 | 180000 | 0.2674 |
+ | 0.9999 | 180500 | 0.2725 |
+
+ </details>
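
To eyeball the convergence trend, the log above can be parsed out of this card and plotted (a quick sketch assuming pandas and matplotlib are available; `README.md` is this card's local path):

```python
import re

import matplotlib.pyplot as plt
import pandas as pd

# Quick sketch: extract the "Epoch | Step | Training Loss" rows from this
# card and plot loss against step. Assumes pandas/matplotlib are installed
# and that README.md is this card's local path.
text = open("README.md", encoding="utf-8").read()
rows = re.findall(r"\|\s*(\d\.\d{4})\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|", text)
df = pd.DataFrame(rows, columns=["epoch", "step", "loss"]).astype(float)
df.plot(x="step", y="loss")
plt.show()
```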
 
  ### Framework Versions
  - Python: 3.11.4
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:402f1e7bb1b384e81e0fd638da38a601bb5ae4f7eecf87cbe60fa0ec5ddc8886
+ oid sha256:4f3e4eaefdf2c3a2062d343e925bad3c10166870ec2854b3733a6381b7f465e8
  size 596070136