LeoChiuu committed on
Commit
68453ee
1 Parent(s): 10337ad

Add new SentenceTransformer model.

Files changed (2):
  1. README.md +142 -38
  2. model.safetensors +1 -1
README.md CHANGED
@@ -6,40 +6,43 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:75
  - loss:CosineSimilarityLoss
  base_model: sentence-transformers/all-MiniLM-L6-v2
  datasets: []
  widget:
- - source_sentence: The Turkish language is an official language of the republic.
  sentences:
- - The problem of head-of-line blocking using Virtual Output Queues was discussed
- in the paper.
- - He had also written Sonnets in Urdu.
- - The Turkish language is an official language of the republic, alongside the Greek
- language.
- - source_sentence: Spinal locks and cervical locks are allowed and mandatory in IBJJF
- Brazilian jiu-jitsu competitions.
  sentences:
- - He was the head baseball coach at Louisiana State University.
- - Spinal locks and cervical locks are forbidden in IBJJF Brazilian jiu-jitsu competitions.
- - It must be taken at the start of the main meal.
- - source_sentence: The blood samples tested positive for sarin.
  sentences:
- - The blood samples tested negative for sarin.
- - It was an inexpensive piece, but I would still have expected better quality.
- - Additionally, a church at San Lazaro in Orange Walk District suffered severe damage.
- - source_sentence: Several small groups claim to continue to practice this faith.
  sentences:
- - He only extensively held a position of Visiting Lecturer at UCLA in 1967.
- - Six other dams were unsuccessful that day, two were small and four were minor
- in size.
- - All tiny groups claim to continue to practice this faith.
- - source_sentence: The wings are diffuse with scales.
  sentences:
- - The wings are pale greenish brown, diffused with blackish scales.
- - The theory has been rejected by other researchers.
- - His views were in a minority at the Westminster Assembly.
  pipeline_tag: sentence-similarity
  ---
@@ -93,9 +96,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("LeoChiuu/all-MiniLM-L6-v2-negations")
  # Run inference
  sentences = [
- 'The wings are diffuse with scales.',
- 'The wings are pale greenish brown, diffused with blackish scales.',
- 'His views were in a minority at the Westminster Assembly.',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -150,19 +153,19 @@ You can finetune this model on your own dataset.
  #### Unnamed Dataset


- * Size: 75 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
- | | sentence_0 | sentence_1 | label |
- |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
- | type | string | string | int |
- | details | <ul><li>min: 9 tokens</li><li>mean: 16.36 tokens</li><li>max: 39 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.55 tokens</li><li>max: 43 tokens</li></ul> | <ul><li>0: ~61.33%</li><li>1: ~38.67%</li></ul> |
  * Samples:
- | sentence_0 | sentence_1 | label |
- |:------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------|:---------------|
- | <code>It is just very clean when burned.</code> | <code>It is also very dirty when burned.</code> | <code>0</code> |
- | <code>The blood samples tested positive for sarin.</code> | <code>The blood samples tested negative for sarin.</code> | <code>0</code> |
- | <code>The species is named in honor of the marriage of Sara Anderson and Malcolm Slaney.</code> | <code>The species is named in honor of the divorce of Sara Anderson and Malcolm Slaney.</code> | <code>0</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
@@ -289,6 +292,107 @@ You can finetune this model on your own dataset.

  </details>

  ### Framework Versions
  - Python: 3.11.9
  - Sentence Transformers: 3.0.1
 
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:77376
  - loss:CosineSimilarityLoss
  base_model: sentence-transformers/all-MiniLM-L6-v2
  datasets: []
  widget:
+ - source_sentence: He has published several books on nutrition, trace metals but not
+ biochemistry imbalances.
  sentences:
+ - This in turn can help in effective communication between healthcare providers
+ and their patients.
+ - He has written several books on nutrition, trace metals, and biochemistry imbalances.
+ - One of the most boring movies I have ever seen.
+ - source_sentence: She was denied the 2011 NSK Neustadt Prize for Children's Literature.
  sentences:
+ - She was the recipient of the 2011 NSK Neustadt Prize for Children's Literature.
+ - The ancient woodland at Dickshills is also located here.
+ - An element (such as a tree) that contributes to evapotranspiration can be called
+ an evapotranspirator.
+ - source_sentence: Viking, after the resemblance the pitchers bear to the prow of
+ a Viking ship.
  sentences:
+ - Viking, after the striking difference the pitchers bear to the prow of a Viking
+ ship.
+ - Honshu is formed from the island arcs.
+ - For instance, even alcohol consumption by a pregnant woman is unable to lead to
+ fetal alcohol syndrome.
+ - source_sentence: Logging has not been undertake near the headwaters of the creek.
  sentences:
+ - Then I had to continue pairing it periodically since it somehow kept dropping.
+ - That's fair, Nance.
+ - Logging has been done near the headwaters of the creek.
+ - source_sentence: He published a history of Cornwall, New York in 1873.
  sentences:
+ - He failed to publish a history of Cornwall, New York in 1873.
+ - Salafis assert that reliance on taqlid has led to Islam 's decline.
+ - 'Lot of holes in the plot: there''s nothing about how he became the emperor; nothing
+ about where he spend 20 years between his childhood and mature age.'
  pipeline_tag: sentence-similarity
  ---
 
 
  model = SentenceTransformer("LeoChiuu/all-MiniLM-L6-v2-negations")
  # Run inference
  sentences = [
+ 'He published a history of Cornwall, New York in 1873.',
+ 'He failed to publish a history of Cornwall, New York in 1873.',
+ "Salafis assert that reliance on taqlid has led to Islam 's decline.",
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
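The snippet encodes three sentences and prints the embedding matrix shape; the embeddings are then compared by cosine similarity. As a minimal sketch of that comparison step, here is cosine similarity over toy placeholder vectors — these are not real model output; with the actual model you would compare rows of `model.encode(sentences)`, or use `sentence_transformers.util.cos_sim`:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 4-d vectors standing in for model.encode(...) rows (illustrative only).
emb_published = [0.1, 0.8, 0.3, 0.2]   # 'He published a history ...'
emb_negated   = [0.1, 0.7, 0.4, 0.1]   # 'He failed to publish ...'
emb_unrelated = [0.9, 0.0, 0.1, 0.5]   # "Salafis assert ..."

# The negated paraphrase shares most surface features with the original,
# so a vanilla encoder scores it higher than the unrelated sentence.
print(cosine_similarity(emb_published, emb_negated))
print(cosine_similarity(emb_published, emb_unrelated))
```

The fine-tuning pairs in this card (label 0 for negated paraphrases) are intended to push such near-duplicate negation pairs apart, which a base encoder tends to score as highly similar.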
 
  #### Unnamed Dataset


+ * Size: 77,376 training samples
  * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
  * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:---------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 16.2 tokens</li><li>max: 57 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 16.32 tokens</li><li>max: 56 tokens</li></ul> | <ul><li>0: ~53.20%</li><li>1: ~46.80%</li></ul> |
  * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:--------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------|:---------------|
+ | <code>The situation in Yemen was already much better than it was in Bahrain.</code> | <code>The situation in Yemen was not much better than Bahrain.</code> | <code>0</code> |
+ | <code>She was a member of the Gamma Theta Upsilon honour society of geography.</code> | <code>She was denied membership of the Gamma Theta Upsilon honour society of mathematics.</code> | <code>0</code> |
+ | <code>Which aren't small and not worth the price.</code> | <code>Which are small and not worth the price.</code> | <code>0</code> |
  * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
  ```json
  {
 

  </details>

+ ### Training Logs
+ | Epoch | Step | Training Loss |
+ |:------:|:-----:|:-------------:|
+ | 0.1034 | 500 | 0.3382 |
+ | 0.2068 | 1000 | 0.2112 |
+ | 0.3102 | 1500 | 0.1649 |
+ | 0.4136 | 2000 | 0.1454 |
+ | 0.5170 | 2500 | 0.1244 |
+ | 0.6203 | 3000 | 0.1081 |
+ | 0.7237 | 3500 | 0.0962 |
+ | 0.8271 | 4000 | 0.0924 |
+ | 0.9305 | 4500 | 0.0852 |
+ | 1.0339 | 5000 | 0.0812 |
+ | 1.1373 | 5500 | 0.0833 |
+ | 1.2407 | 6000 | 0.0736 |
+ | 1.3441 | 6500 | 0.0756 |
+ | 1.4475 | 7000 | 0.0665 |
+ | 1.5509 | 7500 | 0.0661 |
+ | 1.6543 | 8000 | 0.0625 |
+ | 1.7577 | 8500 | 0.0621 |
+ | 1.8610 | 9000 | 0.0593 |
+ | 1.9644 | 9500 | 0.054 |
+ | 2.0678 | 10000 | 0.0569 |
+ | 2.1712 | 10500 | 0.0566 |
+ | 2.2746 | 11000 | 0.0502 |
+ | 2.3780 | 11500 | 0.0516 |
+ | 2.4814 | 12000 | 0.0455 |
+ | 2.5848 | 12500 | 0.0454 |
+ | 2.6882 | 13000 | 0.0424 |
+ | 2.7916 | 13500 | 0.044 |
+ | 2.8950 | 14000 | 0.0376 |
+ | 2.9983 | 14500 | 0.0386 |
+ | 3.1017 | 15000 | 0.0392 |
+ | 3.2051 | 15500 | 0.0344 |
+ | 3.3085 | 16000 | 0.0348 |
+ | 3.4119 | 16500 | 0.0343 |
+ | 3.5153 | 17000 | 0.0322 |
+ | 3.6187 | 17500 | 0.0324 |
+ | 3.7221 | 18000 | 0.0278 |
+ | 3.8255 | 18500 | 0.0294 |
+ | 3.9289 | 19000 | 0.0292 |
+ | 4.0323 | 19500 | 0.0276 |
+ | 4.1356 | 20000 | 0.0285 |
+ | 4.2390 | 20500 | 0.026 |
+ | 4.3424 | 21000 | 0.0271 |
+ | 4.4458 | 21500 | 0.0248 |
+ | 4.5492 | 22000 | 0.0245 |
+ | 4.6526 | 22500 | 0.0253 |
+ | 4.7560 | 23000 | 0.022 |
+ | 4.8594 | 23500 | 0.0219 |
+ | 4.9628 | 24000 | 0.0207 |
+ | 5.0662 | 24500 | 0.0212 |
+ | 5.1696 | 25000 | 0.0218 |
+ | 5.2730 | 25500 | 0.0192 |
+ | 5.3763 | 26000 | 0.0198 |
+ | 5.4797 | 26500 | 0.0183 |
+ | 5.5831 | 27000 | 0.02 |
+ | 5.6865 | 27500 | 0.0176 |
+ | 5.7899 | 28000 | 0.0184 |
+ | 5.8933 | 28500 | 0.0157 |
+ | 5.9967 | 29000 | 0.0175 |
+ | 6.1001 | 29500 | 0.0175 |
+ | 6.2035 | 30000 | 0.0163 |
+ | 6.3069 | 30500 | 0.0173 |
+ | 6.4103 | 31000 | 0.0165 |
+ | 6.5136 | 31500 | 0.0152 |
+ | 6.6170 | 32000 | 0.0155 |
+ | 6.7204 | 32500 | 0.0132 |
+ | 6.8238 | 33000 | 0.0147 |
+ | 6.9272 | 33500 | 0.0145 |
+ | 7.0306 | 34000 | 0.014 |
+ | 7.1340 | 34500 | 0.0147 |
+ | 7.2374 | 35000 | 0.0126 |
+ | 7.3408 | 35500 | 0.0141 |
+ | 7.4442 | 36000 | 0.0127 |
+ | 7.5476 | 36500 | 0.0132 |
+ | 7.6510 | 37000 | 0.0125 |
+ | 7.7543 | 37500 | 0.0111 |
+ | 7.8577 | 38000 | 0.011 |
+ | 7.9611 | 38500 | 0.0125 |
+ | 8.0645 | 39000 | 0.0128 |
+ | 8.1679 | 39500 | 0.013 |
+ | 8.2713 | 40000 | 0.0115 |
+ | 8.3747 | 40500 | 0.0111 |
+ | 8.4781 | 41000 | 0.0108 |
+ | 8.5815 | 41500 | 0.012 |
+ | 8.6849 | 42000 | 0.0108 |
+ | 8.7883 | 42500 | 0.0105 |
+ | 8.8916 | 43000 | 0.0092 |
+ | 8.9950 | 43500 | 0.0115 |
+ | 9.0984 | 44000 | 0.0112 |
+ | 9.2018 | 44500 | 0.0096 |
+ | 9.3052 | 45000 | 0.0106 |
+ | 9.4086 | 45500 | 0.011 |
+ | 9.5120 | 46000 | 0.01 |
+ | 9.6154 | 46500 | 0.011 |
+ | 9.7188 | 47000 | 0.0097 |
+ | 9.8222 | 47500 | 0.0096 |
+ | 9.9256 | 48000 | 0.0102 |
+
+
  ### Framework Versions
  - Python: 3.11.9
  - Sentence Transformers: 3.0.1
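The training log above can be spot-checked for the expected downward trend. A small sketch, using a handful of (step, loss) checkpoints copied from the table — the reduction figure is computed here, not reported by the trainer:

```python
# A few (step, training-loss) checkpoints copied from the log above.
checkpoints = {500: 0.3382, 5000: 0.0812, 14500: 0.0386, 29000: 0.0175, 48000: 0.0102}

steps = sorted(checkpoints)
losses = [checkpoints[s] for s in steps]

# Loss decreases at every sampled checkpoint (the full log has minor bumps).
assert all(earlier > later for earlier, later in zip(losses, losses[1:]))

reduction = (losses[0] - losses[-1]) / losses[0]
print(f"loss reduction from step 500 to 48000: {reduction:.1%}")  # roughly 97%
```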
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d1e850db87326cab13d251334d8aee003b996ef914ddf45950fc06f56a999f81
+ oid sha256:2a07b72204ce2d0731a19bf791dcf150885596166477cc4f82599c32fbf07f63
  size 90864192