Sentence Similarity
sentence-transformers
Safetensors
English
modernbert
feature-extraction
Generated from Trainer
dataset_size:6661966
loss:MultipleNegativesRankingLoss
loss:CachedMultipleNegativesRankingLoss
loss:SoftmaxLoss
loss:AnglELoss
loss:CoSENTLoss
loss:CosineSimilarityLoss
Inference Endpoints
Commit: Add new SentenceTransformer model

Files changed:
- README.md +560 -163
- model.safetensors +1 -1

README.md CHANGED
@@ -6,96 +6,121 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:
  - loss:MultipleNegativesRankingLoss
  - loss:CachedMultipleNegativesRankingLoss
  - loss:SoftmaxLoss
  - loss:CosineSimilarityLoss
  base_model: tasksource/ModernBERT-base-nli
  widget:
- - source_sentence:
  sentences:
- -
-
-
-
- played over two legs and were contested by the teams who finished in 2nd , 3rd
- , 4th and 5th place in the Football League First Division and Football League
- Second Division and the 3rd , 4th , 5th , and 6th placed teams in the Football
- League Third Division table .. Football League First Division. 1994–95 Football
- League First Division. Football League Second Division. 1994–95 Football League
- Second Division. Football League Third Division. 1994–95 Football League Third
- Division. The winners of the semi-finals progressed through to the finals , with
- the winner of these matches gaining promotion for the following season .. following
- season. 1995-96 in English football
- - Sir Alexander Mackenzie Elementary is a public elementary school in Vancouver
- , British Columbia part of School District 39 Vancouver .. Vancouver. Vancouver,
- British Columbia. British Columbia. British Columbia. School District 39 Vancouver.
- School District 39 Vancouver. elementary school. elementary school
- - 'Help Wanted -LRB- Hataraku Hito : Hard Working People in Japan , Job Island :
- Hard Working People in Europe -RRB- is a game that features a collection of various
- , Wii Remote-based minigames .. Wii. Wii. Wii Remote. Wii Remote. The game is
- developed and published by Hudson Soft and was released in Japan for Nintendo
- ''s Wii on November 27 , 2008 , in Europe on March 13 , 2009 , in Australia on
- March 27 , 2009 , and in North America on May 12 , 2009 .. Hudson Soft. Hudson
- Soft. Wii. Wii. Nintendo. Nintendo'
- - source_sentence: The researchers asked children of different ages to use words to
- form semantic correspondence. For example, when children see the words eagle,
- bear and robin, they combine them best according to their meaning. The results
- showed that older participants were more likely to develop different types of
- false memory than younger participants. Because there are many forms of classification
- in their minds. For example, young children classify eagles and robins as birds,
- while older children classify eagles and bears as predators. Compared with children,
- they have a concept of predators in their minds.
- sentences:
- - Extractive Industries Transparency Initiative is an organization
- - Mason heard a pun
- - Older children are more likely to have false memories than younger ones conforms
- to the context.
- - source_sentence: 'Version 0.5 is released today. The biggest change is that this
- version finally has upload progress.
-
- Download it here:
-
- Or go to for more information about this project.
-
- Changelog:
-
- * Refactored the authentication_controller
-
- * Put before_filter :authorize in ApplicationController (and using skip_before_filter
- in other controllers if necessary)

-

-

-

-
-
-
  sentences:
- - This example
- -
- -
- - source_sentence:
- 3 to meet with @user supporters! #SemST'
  sentences:
- -
- -
- -
- - source_sentence:
- by medications or by diseases such as cancer, diabetes and AIDS.
  sentences:
- -
-
-
-
  datasets:
  - tomaarsen/natural-questions-hard-negatives
  - tomaarsen/gooaq-hard-negatives
  - bclavie/msmarco-500k-triplets
  - sentence-transformers/gooaq
  - sentence-transformers/natural-questions
  - tasksource/merged-2l-nli
@@ -112,7 +137,7 @@ library_name: sentence-transformers

  # SentenceTransformer based on tasksource/ModernBERT-base-nli

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [tasksource/ModernBERT-base-nli](https://huggingface.co/tasksource/ModernBERT-base-nli) on the [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives), [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives), [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets), [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq), [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli), [merged-3l-nli](https://huggingface.co/datasets/tasksource/merged-3l-nli), [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli), [dataset_train_nli](https://huggingface.co/datasets/MoritzLaurer/dataset_train_nli), [paws/labeled_final](https://huggingface.co/datasets/paws), [glue/mrpc](https://huggingface.co/datasets/glue), [glue/qqp](https://huggingface.co/datasets/glue), [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related), [glue/stsb](https://huggingface.co/datasets/glue), sick/relatedness and [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

@@ -126,6 +151,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [t

  - [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives)
  - [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives)
  - [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets)
  - [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
  - [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
  - [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli)
@@ -175,9 +201,9 @@ from sentence_transformers import SentenceTransformer

  model = SentenceTransformer("tasksource/ModernBERT-base-embed")
  # Run inference
  sentences = [
- '
- '
-
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -256,7 +282,7 @@ You can finetune this model on your own dataset.

  #### tomaarsen/gooaq-hard-negatives

  * Dataset: [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives) at [87594a1](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives/tree/87594a1e6c58e88b5843afa9da3a97ffd75d01c2)
- * Size:
  * Columns: <code>question</code>, <code>answer</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, and <code>negative_5</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 |
@@ -280,7 +306,7 @@ You can finetune this model on your own dataset.

  #### bclavie/msmarco-500k-triplets

  * Dataset: [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets) at [cb1a85c](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets/tree/cb1a85c1261fa7c65f4ea43f94e50f8b467c372f)
- * Size:
  * Columns: <code>query</code>, <code>positive</code>, and <code>negative</code>
  * Approximate statistics based on the first 1000 samples:
  | | query | positive | negative |
@@ -301,10 +327,34 @@ You can finetune this model on your own dataset.

  }
  ```

  #### sentence-transformers/gooaq

  * Dataset: [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
- * Size:
  * Columns: <code>question</code> and <code>answer</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer |
@@ -355,16 +405,16 @@ You can finetune this model on your own dataset.

  * Size: 425,243 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2
-
- | type | string | string
- | details | <ul><li>min:
  * Samples:
- | sentence1
-
- | <code>
- | <code
- | <code>
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### merged-3l-nli
@@ -376,13 +426,13 @@ You can finetune this model on your own dataset.

  | | sentence1 | sentence2 | label |
  |:--------|:-------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------|
  | type | string | string | int |
- | details | <ul><li>min: 5 tokens</li><li>mean:
  * Samples:
- | sentence1
-
- | <code>
- | <code>The
- | <code>
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### zero-shot-label-nli
@@ -394,13 +444,13 @@ You can finetune this model on your own dataset.

  | | label | sentence1 | sentence2 |
  |:--------|:------------------------------------------------|:------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
  | type | int | string | string |
- | details | <ul><li>0: ~
  * Samples:
- | label | sentence1
-
- | <code>
- | <code>2</code> | <code>
- | <code>2</code> | <code>
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### dataset_train_nli
@@ -463,16 +513,16 @@ You can finetune this model on your own dataset.

  * Size: 363,846 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1
-
- | type | string
- | details | <ul><li>min:
  * Samples:
- | sentence1
-
- | <code>
- | <code>
- | <code>
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### fever-evidence-related
@@ -481,16 +531,16 @@ You can finetune this model on your own dataset.

  * Size: 403,218 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2
-
- | type | string | string
- | details | <ul><li>min:
  * Samples:
- | sentence1
-
- | <code>The
- | <code>
- | <code>
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### glue/stsb
@@ -499,16 +549,16 @@ You can finetune this model on your own dataset.

  * Size: 5,749 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1
-
- | type | string
- | details | <ul><li>min: 6 tokens</li><li>mean: 15.
  * Samples:
- | sentence1
-
- | <code>
- | <code>
- | <code>
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -522,16 +572,16 @@ You can finetune this model on your own dataset.

  * Size: 4,439 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence1 | sentence2 | label
-
- | type | string | string | float
- | details | <ul><li>min: 6 tokens</li><li>mean: 12.
  * Samples:
- | sentence1
-
- | <code>The
- | <code>A man is
- | <code>A
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -548,13 +598,13 @@ You can finetune this model on your own dataset.

  | | label | sentence1 | sentence2 |
  |:--------|:---------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type | float | string | string |
- | details | <ul><li>min: 0.0</li><li>mean: 3.
  * Samples:
- | label
-
- | <code>4.
- | <code>4.
- | <code>3.
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -781,7 +831,7 @@ You can finetune this model on your own dataset.

  #### Non-Default Hyperparameters

  - `per_device_train_batch_size`: 24
- - `learning_rate`:
  - `weight_decay`: 1e-06
  - `num_train_epochs`: 1
  - `warmup_ratio`: 0.1
@@ -801,7 +851,7 @@ You can finetune this model on your own dataset.

  - `gradient_accumulation_steps`: 1
  - `eval_accumulation_steps`: None
  - `torch_empty_cache_steps`: None
- - `learning_rate`:
  - `weight_decay`: 1e-06
  - `adam_beta1`: 0.9
  - `adam_beta2`: 0.999
@@ -909,26 +959,373 @@ You can finetune this model on your own dataset.

  </details>

  ### Training Logs
-
- |:------:|:----:|:-------------:|
- | 0.0067 | 500 | 10.6192 |
- | 0.0134 | 1000 | 1.9196 |
- | 0.0202 | 1500 | 1.0304 |
- | 0.0269 | 2000 | 0.9269 |
- | 0.0336 | 2500 | 0.7738 |
- | 0.0403 | 3000 | 0.7092 |
- | 0.0471 | 3500 | 0.6571 |
- | 0.0538 | 4000 | 0.6408 |
- | 0.0605 | 4500 | 0.6348 |
- | 0.0672 | 5000 | 0.5927 |
- | 0.0739 | 5500 | 0.5848 |
- | 0.0807 | 6000 | 0.5542 |
- | 0.0874 | 6500 | 0.558 |
- | 0.0941 | 7000 | 0.5394 |
- | 0.1008 | 7500 | 0.5632 |
- | 0.1076 | 8000 | 0.5037 |
- | 0.1143 | 8500 | 0.5278 |

  ### Framework Versions
  - Python: 3.11.4

README.md (updated)

  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:6131012
  - loss:MultipleNegativesRankingLoss
  - loss:CachedMultipleNegativesRankingLoss
  - loss:SoftmaxLoss
  - loss:CosineSimilarityLoss
  base_model: tasksource/ModernBERT-base-nli
  widget:
+ - source_sentence: Daniel went to the kitchen. Sandra went back to the kitchen. Daniel
+ moved to the garden. Sandra grabbed the apple. Sandra went back to the office.
+ Sandra dropped the apple. Sandra went to the garden. Sandra went back to the bedroom.
+ Sandra went back to the office. Mary went back to the office. Daniel moved to
+ the bathroom. Sandra grabbed the apple. Sandra travelled to the garden. Sandra
+ put down the apple there. Mary went back to the bathroom. Daniel travelled to
+ the garden. Mary took the milk. Sandra grabbed the apple. Mary left the milk there.
+ Sandra journeyed to the bedroom. John travelled to the office. John went back
+ to the garden. Sandra journeyed to the garden. Mary grabbed the milk. Mary left
+ the milk. Mary grabbed the milk. Mary went to the hallway. John moved to the hallway.
+ Mary picked up the football. Sandra journeyed to the kitchen. Sandra left the
+ apple. Mary discarded the milk. John journeyed to the garden. Mary dropped the
+ football. Daniel moved to the bathroom. Daniel journeyed to the kitchen. Mary
+ travelled to the bathroom. Daniel went to the bedroom. Mary went to the hallway.
+ Sandra got the apple. Sandra went back to the hallway. Mary moved to the kitchen.
+ Sandra dropped the apple there. Sandra grabbed the milk. Sandra journeyed to the
+ bathroom. John went back to the kitchen. Sandra went to the kitchen. Sandra travelled
+ to the bathroom. Daniel went to the garden. Daniel moved to the kitchen. Sandra
+ dropped the milk. Sandra got the milk. Sandra put down the milk. John journeyed
+ to the garden. Sandra went back to the hallway. Sandra picked up the apple. Sandra
+ got the football. Sandra moved to the garden. Daniel moved to the bathroom. Daniel
+ travelled to the garden. Sandra went back to the bathroom. Sandra discarded the
+ football.
  sentences:
+ - In the adulthood stage, it can jump, walk, run
+ - The chocolate is bigger than the container.
+ - The football before the bathroom was in the garden.
+ - source_sentence: 'Context: I am devasted.

+ Speaker 1: I am very devastated these days.

+ Speaker 2: That seems bad and I am sorry to hear that. What happened?

+ Speaker 1: My father day 3 weeks ago.I still can''t believe.

+ Speaker 2: I am truly sorry to hear that. Please accept my apologies for your
+ loss. May he rest in peace'
+ sentences:
+ - 'The main emotion of this example dialogue is: content'
+ - 'This text is about: genealogy'
+ - The intent of this example is to be offensive/disrespectful.
+ - source_sentence: in three distinguish’d parts, with three distinguish’d guides
  sentences:
+ - This example is paraphrase.
+ - This example is neutral.
+ - This example is negative.
+ - source_sentence: A boy is playing a piano.
  sentences:
+ - Nine killed in Syrian-linked clashes in Lebanon
+ - A man is singing and playing a guitar.
+ - My opinion is to wait until the child itself expresses a desire for this.
+ - source_sentence: Francis I of France was a king.
  sentences:
+ - The Apple QuickTake -LRB- codenamed Venus , Mars , Neptune -RRB- is one of the
+ first consumer digital camera lines .. digital camera. digital camera. It was
+ launched in 1994 by Apple Computer and was marketed for three years before being
+ discontinued in 1997 .. Apple Computer. Apple Computer. Three models of the product
+ were built including the 100 and 150 , both built by Kodak ; and the 200 , built
+ by Fujifilm .. Kodak. Kodak. Fujifilm. Fujifilm. The QuickTake cameras had a resolution
+ of 640 x 480 pixels maximum -LRB- 0.3 Mpx -RRB- .. resolution. Display resolution.
+ The 200 model is only officially compatible with the Apple Macintosh for direct
+ connections , while the 100 and 150 model are compatible with both the Apple Macintosh
+ and Microsoft Windows .. Apple Macintosh. Apple Macintosh. Microsoft Windows.
+ Microsoft Windows. Because the QuickTake 200 is almost identical to the Fuji DS-7
+ or to Samsung 's Kenox SSC-350N , Fuji 's software for that camera can be used
+ to gain Windows compatibility for the QuickTake 200 .. Some other software replacements
+ also exist as well as using an external reader for the removable media of the
+ QuickTake 200 .. Time Magazine profiled QuickTake as `` the first consumer digital
+ camera '' and ranked it among its `` 100 greatest and most influential gadgets
+ from 1923 to the present '' list .. digital camera. digital camera. Time Magazine.
+ Time Magazine. While the QuickTake was probably the first digicam to have wide
+ success , technically this is not true as the greyscale Dycam Model 1 -LRB- also
+ marketed as the Logitech FotoMan -RRB- was the first consumer digital camera to
+ be sold in the US in November 1990 .. digital camera. digital camera. greyscale.
+ greyscale. At least one other camera , the Fuji DS-X , was sold in Japan even
+ earlier , in late 1989 .
+ - The ganglion cell layer -LRB- ganglionic layer -RRB- is a layer of the retina
+ that consists of retinal ganglion cells and displaced amacrine cells .. retina.
+ retina. In the macula lutea , the layer forms several strata .. macula lutea.
+ macula lutea. The cells are somewhat flask-shaped ; the rounded internal surface
+ of each resting on the stratum opticum , and sending off an axon which is prolonged
+ into it .. flask. Laboratory flask. stratum opticum. stratum opticum. axon. axon.
+ From the opposite end numerous dendrites extend into the inner plexiform layer
+ , where they branch and form flattened arborizations at different levels .. inner
+ plexiform layer. inner plexiform layer. arborizations. arborizations. dendrites.
+ dendrites. The ganglion cells vary much in size , and the dendrites of the smaller
+ ones as a rule arborize in the inner plexiform layer as soon as they enter it
+ ; while those of the larger cells ramify close to the inner nuclear layer .. inner
+ plexiform layer. inner plexiform layer. dendrites. dendrites. inner nuclear layer.
+ inner nuclear layer
+ - Coyote was a brand of racing chassis designed and built for the use of A. J. Foyt
+ 's race team in USAC Championship car racing including the Indianapolis 500 ..
+ A. J. Foyt. A. J. Foyt. USAC. United States Auto Club. Championship car. American
+ Championship car racing. Indianapolis 500. Indianapolis 500. It was used from
+ 1966 to 1983 with Foyt himself making 141 starts in the car , winning 25 times
+ .. George Snider had the second most starts with 24 .. George Snider. George Snider.
+ Jim McElreath has the only other win with a Coyote chassis .. Jim McElreath. Jim
+ McElreath. Foyt drove a Coyote to victory in the Indy 500 in 1967 and 1977 ..
+ With Foyt 's permission , fellow Indy 500 champion Eddie Cheever 's Cheever Racing
+ began using the Coyote name for his new Daytona Prototype chassis , derived from
+ the Fabcar chassis design that he had purchased the rights to in 2007 .. Eddie
+ Cheever. Eddie Cheever. Cheever Racing. Cheever Racing. Daytona Prototype. Daytona
+ Prototype
  datasets:
  - tomaarsen/natural-questions-hard-negatives
  - tomaarsen/gooaq-hard-negatives
  - bclavie/msmarco-500k-triplets
+ - sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1
  - sentence-transformers/gooaq
  - sentence-transformers/natural-questions
  - tasksource/merged-2l-nli

  # SentenceTransformer based on tasksource/ModernBERT-base-nli

+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [tasksource/ModernBERT-base-nli](https://huggingface.co/tasksource/ModernBERT-base-nli) on the [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives), [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives), [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets), [sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1), [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq), [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions), [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli), [merged-3l-nli](https://huggingface.co/datasets/tasksource/merged-3l-nli), [zero-shot-label-nli](https://huggingface.co/datasets/tasksource/zero-shot-label-nli), [dataset_train_nli](https://huggingface.co/datasets/MoritzLaurer/dataset_train_nli), [paws/labeled_final](https://huggingface.co/datasets/paws), [glue/mrpc](https://huggingface.co/datasets/glue), [glue/qqp](https://huggingface.co/datasets/glue), [fever-evidence-related](https://huggingface.co/datasets/mwong/fever-evidence-related), [glue/stsb](https://huggingface.co/datasets/glue), sick/relatedness and [sts-companion](https://huggingface.co/datasets/tasksource/sts-companion) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  - [tomaarsen/natural-questions-hard-negatives](https://huggingface.co/datasets/tomaarsen/natural-questions-hard-negatives)
  - [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives)
  - [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets)
+ - [sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1)
  - [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq)
  - [sentence-transformers/natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
  - [merged-2l-nli](https://huggingface.co/datasets/tasksource/merged-2l-nli)

  model = SentenceTransformer("tasksource/ModernBERT-base-embed")
  # Run inference
  sentences = [
+ 'Francis I of France was a king.',
+ "Coyote was a brand of racing chassis designed and built for the use of A. J. Foyt 's race team in USAC Championship car racing including the Indianapolis 500 .. A. J. Foyt. A. J. Foyt. USAC. United States Auto Club. Championship car. American Championship car racing. Indianapolis 500. Indianapolis 500. It was used from 1966 to 1983 with Foyt himself making 141 starts in the car , winning 25 times .. George Snider had the second most starts with 24 .. George Snider. George Snider. Jim McElreath has the only other win with a Coyote chassis .. Jim McElreath. Jim McElreath. Foyt drove a Coyote to victory in the Indy 500 in 1967 and 1977 .. With Foyt 's permission , fellow Indy 500 champion Eddie Cheever 's Cheever Racing began using the Coyote name for his new Daytona Prototype chassis , derived from the Fabcar chassis design that he had purchased the rights to in 2007 .. Eddie Cheever. Eddie Cheever. Cheever Racing. Cheever Racing. Daytona Prototype. Daytona Prototype",
+ "The Apple QuickTake -LRB- codenamed Venus , Mars , Neptune -RRB- is one of the first consumer digital camera lines .. digital camera. digital camera. It was launched in 1994 by Apple Computer and was marketed for three years before being discontinued in 1997 .. Apple Computer. Apple Computer. Three models of the product were built including the 100 and 150 , both built by Kodak ; and the 200 , built by Fujifilm .. Kodak. Kodak. Fujifilm. Fujifilm. The QuickTake cameras had a resolution of 640 x 480 pixels maximum -LRB- 0.3 Mpx -RRB- .. resolution. Display resolution. The 200 model is only officially compatible with the Apple Macintosh for direct connections , while the 100 and 150 model are compatible with both the Apple Macintosh and Microsoft Windows .. Apple Macintosh. Apple Macintosh. Microsoft Windows. Microsoft Windows. Because the QuickTake 200 is almost identical to the Fuji DS-7 or to Samsung 's Kenox SSC-350N , Fuji 's software for that camera can be used to gain Windows compatibility for the QuickTake 200 .. Some other software replacements also exist as well as using an external reader for the removable media of the QuickTake 200 .. Time Magazine profiled QuickTake as `` the first consumer digital camera '' and ranked it among its `` 100 greatest and most influential gadgets from 1923 to the present '' list .. digital camera. digital camera. Time Magazine. Time Magazine. While the QuickTake was probably the first digicam to have wide success , technically this is not true as the greyscale Dycam Model 1 -LRB- also marketed as the Logitech FotoMan -RRB- was the first consumer digital camera to be sold in the US in November 1990 .. digital camera. digital camera. greyscale. greyscale. At least one other camera , the Fuji DS-X , was sold in Japan even earlier , in late 1989 .",
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)

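To turn those embeddings into scores, a minimal follow-up sketch (assuming a sentence-transformers v3+ install, where `model.similarity` is available; this snippet is not part of the diff itself):

```python
# Compare the query embedding against the two passage embeddings.
# model.similarity defaults to cosine similarity in sentence-transformers >= 3.0.
similarities = model.similarity(embeddings[:1], embeddings[1:])
print(similarities)  # shape [1, 2]; a higher score means a closer passage
```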
  #### tomaarsen/gooaq-hard-negatives

  * Dataset: [tomaarsen/gooaq-hard-negatives](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives) at [87594a1](https://huggingface.co/datasets/tomaarsen/gooaq-hard-negatives/tree/87594a1e6c58e88b5843afa9da3a97ffd75d01c2)
+ * Size: 500,000 training samples
  * Columns: <code>question</code>, <code>answer</code>, <code>negative_1</code>, <code>negative_2</code>, <code>negative_3</code>, <code>negative_4</code>, and <code>negative_5</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer | negative_1 | negative_2 | negative_3 | negative_4 | negative_5 |

  #### bclavie/msmarco-500k-triplets

  * Dataset: [bclavie/msmarco-500k-triplets](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets) at [cb1a85c](https://huggingface.co/datasets/bclavie/msmarco-500k-triplets/tree/cb1a85c1261fa7c65f4ea43f94e50f8b467c372f)
+ * Size: 500,000 training samples
  * Columns: <code>query</code>, <code>positive</code>, and <code>negative</code>
  * Approximate statistics based on the first 1000 samples:
  | | query | positive | negative |

  }
  ```

+ #### sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1
+
+ * Dataset: [sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1) at [84ed2d3](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1/tree/84ed2d35626f617d890bd493b4d6db69a741e0e2)
+ * Size: 500,000 training samples
+ * Columns: <code>query</code>, <code>positive</code>, and <code>negative</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | query | positive | negative |
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
+ | type | string | string | string |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 9.87 tokens</li><li>max: 16 tokens</li></ul> | <ul><li>min: 44 tokens</li><li>mean: 85.25 tokens</li><li>max: 211 tokens</li></ul> | <ul><li>min: 18 tokens</li><li>mean: 81.18 tokens</li><li>max: 227 tokens</li></ul> |
+ * Samples:
+ | query | positive | negative |
+ |:----------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | <code>what are the liberal arts?</code> | <code>liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects.</code> | <code>Rather than preparing students for a specific career, liberal arts programs focus on cultural literacy and hone communication and analytical skills. They often cover various disciplines, ranging from the humanities to social sciences. 1 Program Levels in Liberal Arts: Associate degree, Bachelor's degree, Master's degree.</code> |
+ | <code>what are the liberal arts?</code> | <code>liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects.</code> | <code>Artes Liberales: The historical basis for the modern liberal arts, consisting of the trivium (grammar, logic, and rhetoric) and the quadrivium (arithmetic, geometry, astronomy, and music). General Education: That part of a liberal education curriculum that is shared by all students.</code> |
+ | <code>what are the liberal arts?</code> | <code>liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects.</code> | <code>Liberal Arts. Upon completion of the Liberal Arts degree, students will be able to express ideas in coherent, creative, and appropriate forms, orally and in writing. Students will be able to apply their reading abilities in order to interconnect an understanding of resources to academic, professional, and personal interests.</code> |
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+     "scale": 20.0,
+     "similarity_fct": "cos_sim"
+ }
+ ```
+
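For context on the `scale`/`similarity_fct` parameters above, a minimal sketch of how this loss is usually constructed in sentence-transformers (the base checkpoint here is only illustrative):

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("tasksource/ModernBERT-base-nli")
# scale=20.0 multiplies the cosine similarities before the cross-entropy step;
# the default similarity function is cosine, matching "cos_sim" above, and every
# other positive in the batch acts as an in-batch negative for each query.
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)
```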
  #### sentence-transformers/gooaq

  * Dataset: [sentence-transformers/gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq) at [b089f72](https://huggingface.co/datasets/sentence-transformers/gooaq/tree/b089f728748a068b7bc5234e5bcf5b25e3c8279c)
+ * Size: 500,000 training samples
  * Columns: <code>question</code> and <code>answer</code>
  * Approximate statistics based on the first 1000 samples:
  | | question | answer |

  * Size: 425,243 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 72.83 tokens</li><li>max: 1219 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 12.78 tokens</li><li>max: 118 tokens</li></ul> | <ul><li>0: ~55.50%</li><li>1: ~44.50%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:---------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
+ | <code>What type of food was cheese considered to be in Rome?</code> | <code>The staple foods were generally consumed around 11 o'clock, and consisted of bread, lettuce, cheese, fruits, nuts, and cold meat left over from the dinner the night before.[citation needed]</code> | <code>1</code> |
+ | <code>No Weapons of Mass Destruction Found in Iraq Yet.</code> | <code>Weapons of Mass Destruction Found in Iraq.</code> | <code>0</code> |
+ | <code>I stuck a pin through a carrot. When I pulled the pin out, it had a hole.</code> | <code>The carrot had a hole.</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
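Several of the NLI datasets in this card train with this objective; a minimal sketch of the usual construction (num_labels=2 matches the binary labels of merged-2l-nli above, and the base checkpoint is illustrative):

```python
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("tasksource/ModernBERT-base-nli")
# SoftmaxLoss trains a classifier head over concatenated (u, v, |u-v|)
# sentence-pair features; merged-2l-nli uses two labels (0/1).
loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=2,
)
```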

  #### merged-3l-nli

  | | sentence1 | sentence2 | label |
  |:--------|:-------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-------------------------------------------------------------------|
  | type | string | string | int |
+ | details | <ul><li>min: 5 tokens</li><li>mean: 114.98 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>min: 2 tokens</li><li>mean: 28.37 tokens</li><li>max: 570 tokens</li></ul> | <ul><li>0: ~36.00%</li><li>1: ~31.50%</li><li>2: ~32.50%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:----------------------------------------------|:---------------|
+ | <code>Over the nave, the two hollow pyramids appear to be designed in the style of chimneys for a castle kitchen.</code> | <code>There are seven pyramids there.</code> | <code>2</code> |
+ | <code>The Catch of the Season is an Edwardian musical comedy by Seymour Hicks and Cosmo Hamilton, with music by Herbert Haines and Evelyn Baker and lyrics by Charles H. Taylor, based on the fairy tale Cinderella. A debutante is engaged to a young aristocrat but loves a page.</code> | <code>Seymour Hicks was alive in 1975.</code> | <code>1</code> |
+ | <code>A 3600 g infant is heavy. A 2400 g infant is light.</code> | <code>A 2220 g bicycle is light.</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### zero-shot-label-nli

  | | label | sentence1 | sentence2 |
  |:--------|:------------------------------------------------|:------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|
  | type | int | string | string |
+ | details | <ul><li>0: ~49.30%</li><li>2: ~50.70%</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 77.36 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 8.08 tokens</li><li>max: 17 tokens</li></ul> |
  * Samples:
+ | label | sentence1 | sentence2 |
+ |:---------------|:-----------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------|
+ | <code>2</code> | <code>okay</code> | <code>This example is reply_y.</code> |
+ | <code>2</code> | <code>We retrospectively compared 2 methods that have been proposed to screen for IA [1, 2].</code> | <code>This example is background.</code> |
+ | <code>2</code> | <code>PersonX puts it under PersonX's pillow PersonX then checks it again<br>Person X suffers from obsessive compulsive disorder.</code> | <code>This example is weakener.</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### dataset_train_nli

  * Size: 363,846 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 15.9 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.73 tokens</li><li>max: 72 tokens</li></ul> | <ul><li>0: ~61.90%</li><li>1: ~38.10%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:------------------------------------------------------|:---------------------------------------------------------|:---------------|
+ | <code>What are reviews of Big Data University?</code> | <code>What is your review of Big Data University?</code> | <code>1</code> |
+ | <code>What are glass bottles made of?</code> | <code>How is a glass bottle made?</code> | <code>0</code> |
+ | <code>What do you really know about Algeria?</code> | <code>What do you know about Algeria?</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### fever-evidence-related

  * Size: 403,218 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 13.63 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 350.02 tokens</li><li>max: 2048 tokens</li></ul> | <ul><li>0: ~31.80%</li><li>1: ~68.20%</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:--------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
+ | <code>The Bridges of Madison County is a TV series.</code> | <code>Saulsbury is a town in Hardeman County , Tennessee .. Hardeman County. Hardeman County, Tennessee. Tennessee. Tennessee. County. List of counties in Tennessee. Hardeman. Hardeman County, Tennessee. The population was 99 at the 2000 census and 81 at the 2010 census showing a decrease of 18 .. United States Census, 2010. It is located along State Highway 57 in southwest Hardeman County .. Hardeman County. Hardeman County, Tennessee. State. Political divisions of the United States. County. List of counties in Tennessee. Hardeman. Hardeman County, Tennessee. State Highway 57. State Highway 57</code> | <code>1</code> |
+ | <code>Jessica Lange's first film role was in Godzilla.</code> | <code>Haji Ahmadov -LRB- Hacı Əhmədov , born on 23 November 1993 in Baku , Soviet Union -RRB- is an Azerbaijani football defender who plays for AZAL .. Baku. Baku. AZAL. AZAL PFK. Soviet Union. Soviet Union. Azerbaijani. Azerbaijani people. football. football ( soccer ). defender. Defender ( football )</code> | <code>1</code> |
+ | <code>Brad Pitt directed 12 Years a Slave.</code> | <code>The Bronze Bauhinia Star -LRB- , BBS -RRB- is the lowest rank in Order of the Bauhinia Star in Hong Kong , created in 1997 to replace the British honours system of the Order of the British Empire after the transfer of sovereignty to People 's Republic of China and the establishment of the Hong Kong Special Administrative Region -LRB- HKSAR -RRB- .. Order of the Bauhinia Star. Order of the Bauhinia Star. British honours system. British honours system. Order of the British Empire. Order of the British Empire. Special Administrative Region. Special Administrative Region of the People's Republic of China. It is awarded to persons who have given outstanding service over a long period of time , but in a more limited field or way than that required for the Silver Bauhinia Star .. Silver Bauhinia Star. Silver Bauhinia Star</code> | <code>1</code> |
  * Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)

  #### glue/stsb

  * Size: 5,749 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 15.22 tokens</li><li>max: 74 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.04 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 2.74</li><li>max: 5.0</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------|
+ | <code>Snowden Hits Hurdles in Search for Asylum</code> | <code>Snowden's hits hurdles in search for asylum</code> | <code>5.0</code> |
+ | <code>Ukrainian protesters back in streets for anti-government rally</code> | <code>Ukraine protesters topple Lenin statue in Kiev</code> | <code>2.5999999046325684</code> |
+ | <code>"Biotech products, if anything, may be safer than conventional products because of all the testing," Fraley said, adding that 18 countries have adopted biotechnology.</code> | <code>"Biotech products, if anything, may be safer than conventional products because of all the testing," said Robert Fraley, Monsanto's executive vice president.</code> | <code>3.200000047683716</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {

  * Size: 4,439 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:---------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 12.17 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.06 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 1.0</li><li>mean: 3.53</li><li>max: 5.0</li></ul> |
  * Samples:
+ | sentence1 | sentence2 | label |
+ |:-----------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------|:-------------------------------|
+ | <code>The dark skinned male is standing on one hand in front of a yellow building</code> | <code>The dark skinned male is not standing on one hand in front of a yellow building</code> | <code>4.0</code> |
+ | <code>A man is singing and playing a guitar</code> | <code>A boy is skillfully playing a piano</code> | <code>2.299999952316284</code> |
+ | <code>A picture is being drawn by a man</code> | <code>The person is drawing</code> | <code>4.099999904632568</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {

598 |
| | label | sentence1 | sentence2 |
|
599 |
|:--------|:---------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
|
600 |
| type | float | string | string |
|
601 |
+
| details | <ul><li>min: 0.0</li><li>mean: 3.15</li><li>max: 5.0</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.78 tokens</li><li>max: 79 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 16.68 tokens</li><li>max: 71 tokens</li></ul> |
|
602 |
* Samples:
  | label            | sentence1                                                                                                                                                                                                                                                                                                                                                                                                                                       | sentence2                                                                                                                                                                                                                                                                                                                                   |
  |:-----------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>4.6</code> | <code>As a matter of urgency, therefore, the staff complement of the Interdepartmental Group attached to the Commission Secretariat should be strengthened at the earliest possible opportunity in order to ensure that all proposals for acts which are general in scope are accompanied, when considered by the College of Commissioners and on the basis of Article 299(2), by a simplified sheet outlining their potential impact.</code> | <code>Thus, it is urgent that the inter-service group staff should be strengthened very quickly at the heart of the General Secretariat of the Commission, so that all proposals to act of general scope can be accompanied, during their examination by the college on the basis of Article 299(2), a detailed impact statement.</code> |
  | <code>4.0</code> | <code>Reiterating the calls made by the European Parliament in its resolution of 16 March 2000, what initiatives does the Presidency of the European Council propose to take with a view to playing a more active role so as to guarantee the full and complete application of the UN peace plan?</code>                                                                                                                                        | <code>As requested by the European Parliament in its resolution of 16 March 2000, that these initiatives the presidency of the European Council is going to take to play a more active role in order to ensure the full implementation of the UN peace plan?</code>                                                                       |
  | <code>3.2</code> | <code>Let us, as a Europe of 15 Member States, organise ourselves in order to be able to welcome those countries who are knocking at the door into the fold under respectable conditions.</code>                                                                                                                                                                                                                                                | <code>Let us organise itself to 15 in order to be able to welcome the right conditions for countries which are knocking on our door.</code>                                                                                                                                                                                                |
* Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```
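
The token figures in the statistics tables above come from the model's own tokenizer. A small sketch for reproducing such min/mean/max counts; whether the reported numbers include special tokens is an assumption:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tasksource/ModernBERT-base-nli")

def token_stats(texts: list[str]) -> tuple[int, float, int]:
    # Token count per text, including special tokens added by the tokenizer.
    lengths = [len(model.tokenizer(t)["input_ids"]) for t in texts]
    return min(lengths), sum(lengths) / len(lengths), max(lengths)

# Example: one sentence1 value from the samples table above.
print(token_stats([
    "Let us, as a Europe of 15 Member States, organise ourselves in order to "
    "be able to welcome those countries who are knocking at the door into the "
    "fold under respectable conditions."
]))
```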
#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 24
- `learning_rate`: 3.5e-05
- `weight_decay`: 1e-06
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3.5e-05
- `weight_decay`: 1e-06
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
</details>
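
The non-default values above map directly onto `SentenceTransformerTrainingArguments`. A minimal sketch, with `output_dir` as a stand-in since it is not recorded in this card; all other settings keep their library defaults:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",              # stand-in: not recorded in this card
    per_device_train_batch_size=24,
    learning_rate=3.5e-5,
    weight_decay=1e-6,
    num_train_epochs=1,
    warmup_ratio=0.1,
)
```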

### Training Logs
<details><summary>Click to expand</summary>

| Epoch | Step | Training Loss |
|:------:|:------:|:-------------:|
| 0.0028 | 500 | 5.7412 |
| 0.0055 | 1000 | 2.2293 |
| 0.0083 | 1500 | 1.1572 |
| 0.0111 | 2000 | 0.9386 |
| 0.0138 | 2500 | 0.8352 |
| 0.0166 | 3000 | 0.7291 |
| 0.0194 | 3500 | 0.6555 |
| 0.0222 | 4000 | 0.6488 |
| 0.0249 | 4500 | 0.6267 |
| 0.0277 | 5000 | 0.5527 |
| 0.0305 | 5500 | 0.5985 |
| 0.0332 | 6000 | 0.5574 |
| 0.0360 | 6500 | 0.5642 |
| 0.0388 | 7000 | 0.5821 |
| 0.0415 | 7500 | 0.5289 |
| 0.0443 | 8000 | 0.5374 |
| 0.0471 | 8500 | 0.5187 |
| 0.0499 | 9000 | 0.5278 |
| 0.0526 | 9500 | 0.4983 |
| 0.0554 | 10000 | 0.4758 |
| 0.0582 | 10500 | 0.4939 |
| 0.0609 | 11000 | 0.4944 |
| 0.0637 | 11500 | 0.4967 |
| 0.0665 | 12000 | 0.4543 |
| 0.0692 | 12500 | 0.4649 |
| 0.0720 | 13000 | 0.4612 |
| 0.0748 | 13500 | 0.4612 |
| 0.0776 | 14000 | 0.4684 |
| 0.0803 | 14500 | 0.4904 |
| 0.0831 | 15000 | 0.4538 |
| 0.0859 | 15500 | 0.4388 |
| 0.0886 | 16000 | 0.4584 |
| 0.0914 | 16500 | 0.4728 |
| 0.0942 | 17000 | 0.4236 |
| 0.0969 | 17500 | 0.4328 |
| 0.0997 | 18000 | 0.4624 |
| 0.1025 | 18500 | 0.4732 |
| 0.1053 | 19000 | 0.4375 |
| 0.1080 | 19500 | 0.4495 |
| 0.1108 | 20000 | 0.4296 |
| 0.1136 | 20500 | 0.4211 |
| 0.1163 | 21000 | 0.4399 |
| 0.1191 | 21500 | 0.4353 |
| 0.1219 | 22000 | 0.4407 |
| 0.1246 | 22500 | 0.3892 |
| 0.1274 | 23000 | 0.4121 |
| 0.1302 | 23500 | 0.4253 |
| 0.1330 | 24000 | 0.4066 |
| 0.1357 | 24500 | 0.4168 |
| 0.1385 | 25000 | 0.3921 |
| 0.1413 | 25500 | 0.4008 |
| 0.1440 | 26000 | 0.4164 |
| 0.1468 | 26500 | 0.4047 |
| 0.1496 | 27000 | 0.4031 |
| 0.1523 | 27500 | 0.3955 |
| 0.1551 | 28000 | 0.3809 |
| 0.1579 | 28500 | 0.3992 |
| 0.1606 | 29000 | 0.3686 |
| 0.1634 | 29500 | 0.3851 |
| 0.1662 | 30000 | 0.3776 |
| 0.1690 | 30500 | 0.3919 |
| 0.1717 | 31000 | 0.4026 |
| 0.1745 | 31500 | 0.38 |
| 0.1773 | 32000 | 0.41 |
| 0.1800 | 32500 | 0.3731 |
| 0.1828 | 33000 | 0.3831 |
| 0.1856 | 33500 | 0.3727 |
| 0.1883 | 34000 | 0.3664 |
| 0.1911 | 34500 | 0.3882 |
| 0.1939 | 35000 | 0.3873 |
| 0.1967 | 35500 | 0.3529 |
| 0.1994 | 36000 | 0.3923 |
| 0.2022 | 36500 | 0.4051 |
| 0.2050 | 37000 | 0.4134 |
| 0.2077 | 37500 | 0.3478 |
| 0.2105 | 38000 | 0.3602 |
| 0.2133 | 38500 | 0.3547 |
| 0.2160 | 39000 | 0.3748 |
| 0.2188 | 39500 | 0.3537 |
| 0.2216 | 40000 | 0.38 |
| 0.2244 | 40500 | 0.3731 |
| 0.2271 | 41000 | 0.3537 |
| 0.2299 | 41500 | 0.3576 |
| 0.2327 | 42000 | 0.3626 |
| 0.2354 | 42500 | 0.3587 |
| 0.2382 | 43000 | 0.3488 |
| 0.2410 | 43500 | 0.3694 |
| 0.2437 | 44000 | 0.3508 |
| 0.2465 | 44500 | 0.3634 |
| 0.2493 | 45000 | 0.3608 |
| 0.2521 | 45500 | 0.4007 |
| 0.2548 | 46000 | 0.3559 |
| 0.2576 | 46500 | 0.3317 |
| 0.2604 | 47000 | 0.3518 |
| 0.2631 | 47500 | 0.3578 |
| 0.2659 | 48000 | 0.3375 |
| 0.2687 | 48500 | 0.3229 |
| 0.2714 | 49000 | 0.3319 |
| 0.2742 | 49500 | 0.3656 |
| 0.2770 | 50000 | 0.3598 |
| 0.2798 | 50500 | 0.3705 |
| 0.2825 | 51000 | 0.3431 |
| 0.2853 | 51500 | 0.3587 |
| 0.2881 | 52000 | 0.3361 |
| 0.2908 | 52500 | 0.3734 |
| 0.2936 | 53000 | 0.3361 |
| 0.2964 | 53500 | 0.3322 |
| 0.2991 | 54000 | 0.347 |
| 0.3019 | 54500 | 0.3617 |
| 0.3047 | 55000 | 0.3318 |
| 0.3074 | 55500 | 0.3401 |
| 0.3102 | 56000 | 0.328 |
| 0.3130 | 56500 | 0.3553 |
| 0.3158 | 57000 | 0.3669 |
| 0.3185 | 57500 | 0.4088 |
| 0.3213 | 58000 | 0.3636 |
| 0.3241 | 58500 | 0.3372 |
| 0.3268 | 59000 | 0.3494 |
| 0.3296 | 59500 | 0.3504 |
| 0.3324 | 60000 | 0.3389 |
| 0.3351 | 60500 | 0.3219 |
| 0.3379 | 61000 | 0.3283 |
| 0.3407 | 61500 | 0.3202 |
| 0.3435 | 62000 | 0.3185 |
| 0.3462 | 62500 | 0.3449 |
| 0.3490 | 63000 | 0.3527 |
| 0.3518 | 63500 | 0.3349 |
| 0.3545 | 64000 | 0.3225 |
| 0.3573 | 64500 | 0.3269 |
| 0.3601 | 65000 | 0.3074 |
| 0.3628 | 65500 | 0.3513 |
| 0.3656 | 66000 | 0.3166 |
| 0.3684 | 66500 | 0.3472 |
| 0.3712 | 67000 | 0.3395 |
| 0.3739 | 67500 | 0.3437 |
| 0.3767 | 68000 | 0.3491 |
| 0.3795 | 68500 | 0.3181 |
| 0.3822 | 69000 | 0.3324 |
| 0.3850 | 69500 | 0.3335 |
| 0.3878 | 70000 | 0.3401 |
| 0.3905 | 70500 | 0.3433 |
| 0.3933 | 71000 | 0.3229 |
| 0.3961 | 71500 | 0.3264 |
| 0.3989 | 72000 | 0.3123 |
| 0.4016 | 72500 | 0.3207 |
| 0.4044 | 73000 | 0.3008 |
| 0.4072 | 73500 | 0.2998 |
| 0.4099 | 74000 | 0.2992 |
| 0.4127 | 74500 | 0.3134 |
| 0.4155 | 75000 | 0.3262 |
| 0.4182 | 75500 | 0.2988 |
| 0.4210 | 76000 | 0.2936 |
| 0.4238 | 76500 | 0.314 |
| 0.4266 | 77000 | 0.3083 |
| 0.4293 | 77500 | 0.3103 |
| 0.4321 | 78000 | 0.3303 |
| 0.4349 | 78500 | 0.3282 |
| 0.4376 | 79000 | 0.3415 |
| 0.4404 | 79500 | 0.3001 |
| 0.4432 | 80000 | 0.321 |
| 0.4459 | 80500 | 0.3219 |
| 0.4487 | 81000 | 0.3477 |
| 0.4515 | 81500 | 0.2871 |
| 0.4542 | 82000 | 0.2913 |
| 0.4570 | 82500 | 0.3121 |
| 0.4598 | 83000 | 0.3057 |
| 0.4626 | 83500 | 0.32 |
| 0.4653 | 84000 | 0.3086 |
| 0.4681 | 84500 | 0.3091 |
| 0.4709 | 85000 | 0.3243 |
| 0.4736 | 85500 | 0.3104 |
| 0.4764 | 86000 | 0.3124 |
| 0.4792 | 86500 | 0.3134 |
| 0.4819 | 87000 | 0.2967 |
| 0.4847 | 87500 | 0.3036 |
| 0.4875 | 88000 | 0.3079 |
| 0.4903 | 88500 | 0.2959 |
| 0.4930 | 89000 | 0.3332 |
| 0.4958 | 89500 | 0.3151 |
| 0.4986 | 90000 | 0.3233 |
| 0.5013 | 90500 | 0.3083 |
| 0.5041 | 91000 | 0.2913 |
| 0.5069 | 91500 | 0.31 |
| 0.5096 | 92000 | 0.2962 |
| 0.5124 | 92500 | 0.3254 |
| 0.5152 | 93000 | 0.312 |
| 0.5180 | 93500 | 0.3152 |
| 0.5207 | 94000 | 0.3208 |
| 0.5235 | 94500 | 0.3039 |
| 0.5263 | 95000 | 0.3187 |
| 0.5290 | 95500 | 0.3052 |
| 0.5318 | 96000 | 0.3114 |
| 0.5346 | 96500 | 0.315 |
| 0.5373 | 97000 | 0.2862 |
| 0.5401 | 97500 | 0.3104 |
| 0.5429 | 98000 | 0.3 |
| 0.5457 | 98500 | 0.3017 |
| 0.5484 | 99000 | 0.3189 |
| 0.5512 | 99500 | 0.2919 |
| 0.5540 | 100000 | 0.2913 |
| 0.5567 | 100500 | 0.2936 |
| 0.5595 | 101000 | 0.3044 |
| 0.5623 | 101500 | 0.3034 |
| 0.5650 | 102000 | 0.2999 |
| 0.5678 | 102500 | 0.2961 |
| 0.5706 | 103000 | 0.328 |
| 0.5734 | 103500 | 0.3061 |
| 0.5761 | 104000 | 0.295 |
| 0.5789 | 104500 | 0.2997 |
| 0.5817 | 105000 | 0.2981 |
| 0.5844 | 105500 | 0.2966 |
| 0.5872 | 106000 | 0.2798 |
| 0.5900 | 106500 | 0.3001 |
| 0.5927 | 107000 | 0.3018 |
| 0.5955 | 107500 | 0.3076 |
| 0.5983 | 108000 | 0.3093 |
| 0.6010 | 108500 | 0.3096 |
| 0.6038 | 109000 | 0.2914 |
| 0.6066 | 109500 | 0.2874 |
| 0.6094 | 110000 | 0.2777 |
| 0.6121 | 110500 | 0.2854 |
| 0.6149 | 111000 | 0.3279 |
| 0.6177 | 111500 | 0.2843 |
| 0.6204 | 112000 | 0.2956 |
| 0.6232 | 112500 | 0.3076 |
| 0.6260 | 113000 | 0.314 |
| 0.6287 | 113500 | 0.295 |
| 0.6315 | 114000 | 0.2914 |
| 0.6343 | 114500 | 0.3041 |
| 0.6371 | 115000 | 0.2871 |
| 0.6398 | 115500 | 0.3004 |
| 0.6426 | 116000 | 0.2954 |
| 0.6454 | 116500 | 0.2959 |
| 0.6481 | 117000 | 0.3214 |
| 0.6509 | 117500 | 0.2828 |
| 0.6537 | 118000 | 0.3005 |
| 0.6564 | 118500 | 0.2918 |
| 0.6592 | 119000 | 0.2988 |
| 0.6620 | 119500 | 0.2901 |
| 0.6648 | 120000 | 0.2796 |
| 0.6675 | 120500 | 0.2988 |
| 0.6703 | 121000 | 0.2969 |
| 0.6731 | 121500 | 0.2892 |
| 0.6758 | 122000 | 0.2812 |
| 0.6786 | 122500 | 0.2992 |
| 0.6814 | 123000 | 0.2691 |
| 0.6841 | 123500 | 0.2966 |
| 0.6869 | 124000 | 0.2906 |
| 0.6897 | 124500 | 0.2807 |
| 0.6925 | 125000 | 0.2684 |
| 0.6952 | 125500 | 0.2771 |
| 0.6980 | 126000 | 0.2992 |
| 0.7008 | 126500 | 0.274 |
| 0.7035 | 127000 | 0.2846 |
| 0.7063 | 127500 | 0.2898 |
| 0.7091 | 128000 | 0.2795 |
| 0.7118 | 128500 | 0.2758 |
| 0.7146 | 129000 | 0.2883 |
| 0.7174 | 129500 | 0.2968 |
| 0.7201 | 130000 | 0.2756 |
| 0.7229 | 130500 | 0.3116 |
| 0.7257 | 131000 | 0.2923 |
| 0.7285 | 131500 | 0.2758 |
| 0.7312 | 132000 | 0.262 |
| 0.7340 | 132500 | 0.283 |
| 0.7368 | 133000 | 0.2937 |
| 0.7395 | 133500 | 0.2891 |
| 0.7423 | 134000 | 0.2743 |
| 0.7451 | 134500 | 0.3087 |
| 0.7478 | 135000 | 0.2855 |
| 0.7506 | 135500 | 0.2902 |
| 0.7534 | 136000 | 0.278 |
| 0.7562 | 136500 | 0.2607 |
| 0.7589 | 137000 | 0.2634 |
| 0.7617 | 137500 | 0.2807 |
| 0.7645 | 138000 | 0.294 |
| 0.7672 | 138500 | 0.2837 |
| 0.7700 | 139000 | 0.2521 |
| 0.7728 | 139500 | 0.2751 |
| 0.7755 | 140000 | 0.3012 |
| 0.7783 | 140500 | 0.2816 |
| 0.7811 | 141000 | 0.2756 |
| 0.7839 | 141500 | 0.2661 |
| 0.7866 | 142000 | 0.2585 |
| 0.7894 | 142500 | 0.2718 |
| 0.7922 | 143000 | 0.2724 |
| 0.7949 | 143500 | 0.2804 |
| 0.7977 | 144000 | 0.2582 |
| 0.8005 | 144500 | 0.2636 |
| 0.8032 | 145000 | 0.2536 |
| 0.8060 | 145500 | 0.2862 |
| 0.8088 | 146000 | 0.2842 |
| 0.8116 | 146500 | 0.2702 |
| 0.8143 | 147000 | 0.2727 |
| 0.8171 | 147500 | 0.2591 |
| 0.8199 | 148000 | 0.2709 |
| 0.8226 | 148500 | 0.2879 |
| 0.8254 | 149000 | 0.2669 |
| 0.8282 | 149500 | 0.2748 |
| 0.8309 | 150000 | 0.2689 |
| 0.8337 | 150500 | 0.2414 |
| 0.8365 | 151000 | 0.261 |
| 0.8393 | 151500 | 0.2967 |
| 0.8420 | 152000 | 0.2757 |
| 0.8448 | 152500 | 0.2667 |
| 0.8476 | 153000 | 0.252 |
| 0.8503 | 153500 | 0.2659 |
| 0.8531 | 154000 | 0.2799 |
| 0.8559 | 154500 | 0.2653 |
| 0.8586 | 155000 | 0.275 |
| 0.8614 | 155500 | 0.3067 |
| 0.8642 | 156000 | 0.2742 |
| 0.8669 | 156500 | 0.2616 |
| 0.8697 | 157000 | 0.2793 |
| 0.8725 | 157500 | 0.2721 |
| 0.8753 | 158000 | 0.2623 |
| 0.8780 | 158500 | 0.2801 |
| 0.8808 | 159000 | 0.2499 |
| 0.8836 | 159500 | 0.283 |
| 0.8863 | 160000 | 0.2641 |
| 0.8891 | 160500 | 0.2642 |
| 0.8919 | 161000 | 0.271 |
| 0.8946 | 161500 | 0.2624 |
| 0.8974 | 162000 | 0.2721 |
| 0.9002 | 162500 | 0.2698 |
| 0.9030 | 163000 | 0.2519 |
| 0.9057 | 163500 | 0.2771 |
| 0.9085 | 164000 | 0.2719 |
| 0.9113 | 164500 | 0.2747 |
| 0.9140 | 165000 | 0.28 |
| 0.9168 | 165500 | 0.2618 |
| 0.9196 | 166000 | 0.2755 |
| 0.9223 | 166500 | 0.3104 |
| 0.9251 | 167000 | 0.2671 |
| 0.9279 | 167500 | 0.2491 |
| 0.9307 | 168000 | 0.262 |
| 0.9334 | 168500 | 0.2514 |
| 0.9362 | 169000 | 0.2632 |
| 0.9390 | 169500 | 0.2834 |
| 0.9417 | 170000 | 0.2573 |
| 0.9445 | 170500 | 0.2662 |
| 0.9473 | 171000 | 0.2631 |
| 0.9500 | 171500 | 0.2507 |
| 0.9528 | 172000 | 0.2739 |
| 0.9556 | 172500 | 0.2567 |
| 0.9584 | 173000 | 0.2489 |
| 0.9611 | 173500 | 0.2607 |
| 0.9639 | 174000 | 0.2627 |
| 0.9667 | 174500 | 0.2715 |
| 0.9694 | 175000 | 0.2603 |
| 0.9722 | 175500 | 0.2533 |
| 0.9750 | 176000 | 0.261 |
| 0.9777 | 176500 | 0.2485 |
| 0.9805 | 177000 | 0.2719 |
| 0.9833 | 177500 | 0.2693 |
| 0.9861 | 178000 | 0.2825 |
| 0.9888 | 178500 | 0.2697 |
| 0.9916 | 179000 | 0.2601 |
| 0.9944 | 179500 | 0.2459 |
| 0.9971 | 180000 | 0.2674 |
| 0.9999 | 180500 | 0.2725 |

</details>
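
Since the log is plain Markdown, the loss curve can be re-plotted offline. A sketch, assuming the table rows above are saved to a local `logs.md` (a hypothetical file name):

```python
import matplotlib.pyplot as plt

epochs, losses = [], []
with open("logs.md") as f:
    for line in f:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Keep only data rows: three cells whose first cell starts with a digit
        # (this skips the header and the alignment separator).
        if len(cells) == 3 and cells[0][:1].isdigit():
            epochs.append(float(cells[0]))
            losses.append(float(cells[2]))

plt.plot(epochs, losses)
plt.xlabel("Epoch")
plt.ylabel("Training loss")
plt.title("Training loss vs. epoch")
plt.show()
```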

### Framework Versions
- Python: 3.11.4
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:4f3e4eaefdf2c3a2062d343e925bad3c10166870ec2854b3733a6381b7f465e8
 size 596070136
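
The Git LFS pointer above records the expected SHA-256 digest and byte size of the new weights, so a download can be verified locally. A sketch, assuming `model.safetensors` sits in the current directory:

```python
import hashlib
import os

PATH = "model.safetensors"
EXPECTED_SHA256 = "4f3e4eaefdf2c3a2062d343e925bad3c10166870ec2854b3733a6381b7f465e8"
EXPECTED_SIZE = 596070136  # bytes, from the pointer file

# Cheap check first: the file size must match the pointer exactly.
assert os.path.getsize(PATH) == EXPECTED_SIZE, "size mismatch"

# Then hash the file in 1 MiB chunks to avoid loading it all into memory.
digest = hashlib.sha256()
with open(PATH, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert digest.hexdigest() == EXPECTED_SHA256, "checksum mismatch"
print("model.safetensors matches the LFS pointer")
```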