LeoChiuu committed on
Commit 9b9fcdb
1 Parent(s): ca4e4e8

Add new SentenceTransformer model.

Files changed (3)
  1. README.md +158 -160
  2. config_sentence_transformers.json +1 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -1,7 +1,5 @@
  ---
  base_model: colorfulscoop/sbert-base-ja
- datasets: []
- language: []
  library_name: sentence-transformers
  metrics:
  - cosine_accuracy
@@ -45,34 +43,34 @@ tags:
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
- - dataset_size:228
- - loss:MultipleNegativesRankingLoss
  widget:
- - source_sentence: 家の外を探そう
  sentences:
- - ベットを調べよう
- - 何を作ったの?
- - 外を見てみよう
- - source_sentence: 物の姿を変える魔法が使える村人を知っている?
  sentences:
- - 中を見てみよう
- - ベッドにある?
- - 物体の形を変えられる魔法使いを知っている?
- - source_sentence: ぬいぐるみが花
  sentences:
- - リリアンはどんな呪文が使えるの?
- - ぬいぐるみ
- - 花がぬいぐるみに変えられている
- - source_sentence: ベッドにスカーフはある?
  sentences:
- - 井戸へ行ったことある?
- - どっちも要らない
- - スカーフはベッドにある?
- - source_sentence: キャンドル頂戴
  sentences:
- - 祭壇の些細な違和感ってなに?
- - やっぱり、キャンドルがいい
- - テーブルを調べよう
  model-index:
  - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
  results:
@@ -80,119 +78,119 @@ model-index:
  type: binary-classification
  name: Binary Classification
  dataset:
- name: custom arc semantics data
- type: custom-arc-semantics-data
  metrics:
  - type: cosine_accuracy
- value: 0.9827586206896551
  name: Cosine Accuracy
  - type: cosine_accuracy_threshold
- value: 0.2341834306716919
  name: Cosine Accuracy Threshold
  - type: cosine_f1
- value: 0.9913043478260869
  name: Cosine F1
  - type: cosine_f1_threshold
- value: 0.2341834306716919
  name: Cosine F1 Threshold
  - type: cosine_precision
- value: 1.0
  name: Cosine Precision
  - type: cosine_recall
- value: 0.9827586206896551
  name: Cosine Recall
  - type: cosine_ap
- value: 1.0
  name: Cosine Ap
  - type: dot_accuracy
- value: 0.9827586206896551
  name: Dot Accuracy
  - type: dot_accuracy_threshold
- value: 134.29324340820312
  name: Dot Accuracy Threshold
  - type: dot_f1
- value: 0.9913043478260869
  name: Dot F1
  - type: dot_f1_threshold
- value: 134.29324340820312
  name: Dot F1 Threshold
  - type: dot_precision
- value: 1.0
  name: Dot Precision
  - type: dot_recall
- value: 0.9827586206896551
  name: Dot Recall
  - type: dot_ap
- value: 1.0
  name: Dot Ap
  - type: manhattan_accuracy
- value: 0.9827586206896551
  name: Manhattan Accuracy
  - type: manhattan_accuracy_threshold
- value: 644.1650390625
  name: Manhattan Accuracy Threshold
  - type: manhattan_f1
- value: 0.9913043478260869
  name: Manhattan F1
  - type: manhattan_f1_threshold
- value: 644.1650390625
  name: Manhattan F1 Threshold
  - type: manhattan_precision
- value: 1.0
  name: Manhattan Precision
  - type: manhattan_recall
- value: 0.9827586206896551
  name: Manhattan Recall
  - type: manhattan_ap
- value: 1.0
  name: Manhattan Ap
  - type: euclidean_accuracy
- value: 0.9827586206896551
  name: Euclidean Accuracy
  - type: euclidean_accuracy_threshold
- value: 29.542858123779297
  name: Euclidean Accuracy Threshold
  - type: euclidean_f1
- value: 0.9913043478260869
  name: Euclidean F1
  - type: euclidean_f1_threshold
- value: 29.542858123779297
  name: Euclidean F1 Threshold
  - type: euclidean_precision
- value: 1.0
  name: Euclidean Precision
  - type: euclidean_recall
- value: 0.9827586206896551
  name: Euclidean Recall
  - type: euclidean_ap
- value: 1.0
  name: Euclidean Ap
  - type: max_accuracy
- value: 0.9827586206896551
  name: Max Accuracy
  - type: max_accuracy_threshold
- value: 644.1650390625
  name: Max Accuracy Threshold
  - type: max_f1
- value: 0.9913043478260869
  name: Max F1
  - type: max_f1_threshold
- value: 644.1650390625
  name: Max F1 Threshold
  - type: max_precision
- value: 1.0
  name: Max Precision
  - type: max_recall
- value: 0.9827586206896551
  name: Max Recall
  - type: max_ap
- value: 1.0
  name: Max Ap
  ---

  # SentenceTransformer based on colorfulscoop/sbert-base-ja

- This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [colorfulscoop/sbert-base-ja](https://huggingface.co/colorfulscoop/sbert-base-ja). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

@@ -202,7 +200,8 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [c
  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
- <!-- - **Training Dataset:** Unknown -->
  <!-- - **Language:** Unknown -->
  <!-- - **License:** Unknown -->
 
@@ -239,9 +238,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("LeoChiuu/sbert-base-ja-arc")
  # Run inference
  sentences = [
- 'キャンドル頂戴',
- 'やっぱり、キャンドルがいい',
- 'テーブルを調べよう',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
@@ -282,46 +281,46 @@ You can finetune this model on your own dataset.
  ### Metrics

  #### Binary Classification
- * Dataset: `custom-arc-semantics-data`
  * Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

- | Metric | Value |
- |:-----------------------------|:---------|
- | cosine_accuracy | 0.9828 |
- | cosine_accuracy_threshold | 0.2342 |
- | cosine_f1 | 0.9913 |
- | cosine_f1_threshold | 0.2342 |
- | cosine_precision | 1.0 |
- | cosine_recall | 0.9828 |
- | cosine_ap | 1.0 |
- | dot_accuracy | 0.9828 |
- | dot_accuracy_threshold | 134.2932 |
- | dot_f1 | 0.9913 |
- | dot_f1_threshold | 134.2932 |
- | dot_precision | 1.0 |
- | dot_recall | 0.9828 |
- | dot_ap | 1.0 |
- | manhattan_accuracy | 0.9828 |
- | manhattan_accuracy_threshold | 644.165 |
- | manhattan_f1 | 0.9913 |
- | manhattan_f1_threshold | 644.165 |
- | manhattan_precision | 1.0 |
- | manhattan_recall | 0.9828 |
- | manhattan_ap | 1.0 |
- | euclidean_accuracy | 0.9828 |
- | euclidean_accuracy_threshold | 29.5429 |
- | euclidean_f1 | 0.9913 |
- | euclidean_f1_threshold | 29.5429 |
- | euclidean_precision | 1.0 |
- | euclidean_recall | 0.9828 |
- | euclidean_ap | 1.0 |
- | max_accuracy | 0.9828 |
- | max_accuracy_threshold | 644.165 |
- | max_f1 | 0.9913 |
- | max_f1_threshold | 644.165 |
- | max_precision | 1.0 |
- | max_recall | 0.9828 |
- | **max_ap** | **1.0** |

  <!--
  ## Bias, Risks and Limitations
@@ -339,53 +338,53 @@ You can finetune this model on your own dataset.

  ### Training Dataset

- #### Unnamed Dataset

-
- * Size: 228 training samples
  * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
- * Approximate statistics based on the first 1000 samples:
- | | text1 | text2 | label |
- |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:-----------------------------|
- | type | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 8.28 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.63 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
  * Samples:
- | text1 | text2 | label |
- |:----------------------------|:------------------------|:---------------|
- | <code>キャンドルを用意して</code> | <code>ロウソク</code> | <code>1</code> |
- | <code>なんで話せるの?</code> | <code>なんでしゃべれるの?</code> | <code>1</code> |
- | <code>それは物の見た目を変える魔法</code> | <code>物の見た目を変える</code> | <code>1</code> |
- * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
  "scale": 20.0,
- "similarity_fct": "cos_sim"
  }
  ```
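
This loss treats each `text1`/`text2` pair as an (anchor, positive) example and uses the other in-batch sentences as negatives, which is why every label in this set is 1. A minimal construction sketch, with pairs mirroring the samples above (not the full 228-pair set):

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("colorfulscoop/sbert-base-ja")

# (anchor, positive) pairs shaped like the samples above.
train_dataset = Dataset.from_dict({
    "text1": ["キャンドルを用意して", "なんで話せるの?"],
    "text2": ["ロウソク", "なんでしゃべれるの?"],
})

# scale=20.0 as in the JSON above; cos_sim is this loss's default similarity_fct.
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)
```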

  ### Evaluation Dataset

- #### Unnamed Dataset
-

- * Size: 58 evaluation samples
  * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
- * Approximate statistics based on the first 1000 samples:
- | | text1 | text2 | label |
- |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:-----------------------------|
- | type | string | string | int |
- | details | <ul><li>min: 4 tokens</li><li>mean: 8.33 tokens</li><li>max: 13 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.38 tokens</li><li>max: 13 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
  * Samples:
- | text1 | text2 | label |
- |:----------------------------|:----------------------------|:---------------|
- | <code>雲より高くってどこ?</code> | <code>雲より高くってなに?</code> | <code>1</code> |
- | <code>気にスカーフがひっかかってる</code> | <code>キにスカーフが引っかかってる</code> | <code>1</code> |
- | <code>夕飯が辛かったから</code> | <code>夕飯に辛いスープを飲んだから</code> | <code>1</code> |
- * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
  "scale": 20.0,
- "similarity_fct": "cos_sim"
  }
  ```
 
@@ -517,30 +516,30 @@ You can finetune this model on your own dataset.
  </details>

  ### Training Logs
- | Epoch | Step | Training Loss | loss | custom-arc-semantics-data_max_ap |
- |:-----:|:----:|:-------------:|:------:|:--------------------------------:|
- | None | 0 | - | - | 1.0 |
- | 1.0 | 29 | 0.6181 | 0.3774 | 1.0 |
- | 2.0 | 58 | 0.2538 | 0.3356 | 1.0 |
- | 3.0 | 87 | 0.063 | 0.3885 | 1.0 |
- | 4.0 | 116 | 0.015 | 0.4536 | 1.0 |
- | 5.0 | 145 | 0.0061 | 0.4475 | 1.0 |
- | 6.0 | 174 | 0.002 | 0.4805 | 1.0 |
- | 7.0 | 203 | 0.0015 | 0.4826 | 1.0 |
- | 8.0 | 232 | 0.0012 | 0.4831 | 1.0 |
- | 9.0 | 261 | 0.0008 | 0.4848 | 1.0 |
- | 10.0 | 290 | 0.0006 | 0.4862 | 1.0 |
- | 11.0 | 319 | 0.0006 | 0.4883 | 1.0 |
- | 12.0 | 348 | 0.0007 | 0.4903 | 1.0 |
- | 13.0 | 377 | 0.0006 | 0.4912 | 1.0 |


  ### Framework Versions
  - Python: 3.10.14
- - Sentence Transformers: 3.0.1
  - Transformers: 4.44.2
  - PyTorch: 2.4.1+cu121
- - Accelerate: 0.34.0
  - Datasets: 2.20.0
  - Tokenizers: 0.19.1
 
@@ -561,15 +560,14 @@ You can finetune this model on your own dataset.
  }
  ```

- #### MultipleNegativesRankingLoss
  ```bibtex
- @misc{henderson2017efficient,
- title={Efficient Natural Language Response Suggestion for Smart Reply},
- author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
- year={2017},
- eprint={1705.00652},
- archivePrefix={arXiv},
- primaryClass={cs.CL}
  }
  ```
 
 
  ---
  base_model: colorfulscoop/sbert-base-ja
  library_name: sentence-transformers
  metrics:
  - cosine_accuracy

  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
+ - dataset_size:601
+ - loss:CoSENTLoss
  widget:
+ - source_sentence: だれかが魔法で花をぬいぐるみに変えた
  sentences:
+ - 誰かが魔法の呪文で花をぬいぐるみに変えた
+ - 村長は誰?
+ - どこ?
+ - source_sentence: 暖炉にスカーフを置いた?
  sentences:
+ - 魔法をかけられる人
+ - ロウソク
+ - 晩ご飯のとき
+ - source_sentence: あほ
  sentences:
+ - 調子はどう?
+ - きらい
+ - オッケー
+ - source_sentence: 猫のぬいぐるみ
  sentences:
+ - 赤い染みが皿にあった
+ - 好きじゃないの?
+ - ぬいぐるみ
+ - source_sentence: リリアンはどんな呪文が使えるの?
  sentences:
+ - あなたは魔法使い?
+ - 姿かたちを変える魔法
+ - どのくらいのサイズ?
  model-index:
  - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
  results:
 
  type: binary-classification
  name: Binary Classification
  dataset:
+ name: custom arc semantics data jp
+ type: custom-arc-semantics-data-jp
  metrics:
  - type: cosine_accuracy
+ value: 0.9090909090909091
  name: Cosine Accuracy
  - type: cosine_accuracy_threshold
+ value: 0.4785935878753662
  name: Cosine Accuracy Threshold
  - type: cosine_f1
+ value: 0.9341317365269461
  name: Cosine F1
  - type: cosine_f1_threshold
+ value: 0.4785935878753662
  name: Cosine F1 Threshold
  - type: cosine_precision
+ value: 0.9176470588235294
  name: Cosine Precision
  - type: cosine_recall
+ value: 0.9512195121951219
  name: Cosine Recall
  - type: cosine_ap
+ value: 0.9287829842425579
  name: Cosine Ap
  - type: dot_accuracy
+ value: 0.9008264462809917
  name: Dot Accuracy
  - type: dot_accuracy_threshold
+ value: 234.1079864501953
  name: Dot Accuracy Threshold
  - type: dot_f1
+ value: 0.9302325581395349
  name: Dot F1
  - type: dot_f1_threshold
+ value: 209.4735870361328
  name: Dot F1 Threshold
  - type: dot_precision
+ value: 0.8888888888888888
  name: Dot Precision
  - type: dot_recall
+ value: 0.975609756097561
  name: Dot Recall
  - type: dot_ap
+ value: 0.9635932205663708
  name: Dot Ap
  - type: manhattan_accuracy
+ value: 0.9008264462809917
  name: Manhattan Accuracy
  - type: manhattan_accuracy_threshold
+ value: 558.378173828125
  name: Manhattan Accuracy Threshold
  - type: manhattan_f1
+ value: 0.9302325581395349
  name: Manhattan F1
  - type: manhattan_f1_threshold
+ value: 580.81640625
  name: Manhattan F1 Threshold
  - type: manhattan_precision
+ value: 0.8888888888888888
  name: Manhattan Precision
  - type: manhattan_recall
+ value: 0.975609756097561
  name: Manhattan Recall
  - type: manhattan_ap
+ value: 0.92846470083454
  name: Manhattan Ap
  - type: euclidean_accuracy
+ value: 0.9090909090909091
  name: Euclidean Accuracy
  - type: euclidean_accuracy_threshold
+ value: 24.130870819091797
  name: Euclidean Accuracy Threshold
  - type: euclidean_f1
+ value: 0.9341317365269461
  name: Euclidean F1
  - type: euclidean_f1_threshold
+ value: 24.130870819091797
  name: Euclidean F1 Threshold
  - type: euclidean_precision
+ value: 0.9176470588235294
  name: Euclidean Precision
  - type: euclidean_recall
+ value: 0.9512195121951219
  name: Euclidean Recall
  - type: euclidean_ap
+ value: 0.9287963056027329
  name: Euclidean Ap
  - type: max_accuracy
+ value: 0.9090909090909091
  name: Max Accuracy
  - type: max_accuracy_threshold
+ value: 558.378173828125
  name: Max Accuracy Threshold
  - type: max_f1
+ value: 0.9341317365269461
  name: Max F1
  - type: max_f1_threshold
+ value: 580.81640625
  name: Max F1 Threshold
  - type: max_precision
+ value: 0.9176470588235294
  name: Max Precision
  - type: max_recall
+ value: 0.975609756097561
  name: Max Recall
  - type: max_ap
+ value: 0.9635932205663708
  name: Max Ap
  ---

  # SentenceTransformer based on colorfulscoop/sbert-base-ja

+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [colorfulscoop/sbert-base-ja](https://huggingface.co/colorfulscoop/sbert-base-ja) on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

  ## Model Details

  - **Maximum Sequence Length:** 512 tokens
  - **Output Dimensionality:** 768 dimensions
  - **Similarity Function:** Cosine Similarity
+ - **Training Dataset:**
+ - csv
  <!-- - **Language:** Unknown -->
  <!-- - **License:** Unknown -->
 
 
  model = SentenceTransformer("LeoChiuu/sbert-base-ja-arc")
  # Run inference
  sentences = [
+ 'リリアンはどんな呪文が使えるの?',
+ '姿かたちを変える魔法',
+ 'どのくらいのサイズ?',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
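
The embeddings can then be compared with the model's configured similarity function. A minimal continuation of the snippet above, using `model.similarity` from Sentence Transformers 3.x:

```python
# Continuing from the snippet above: compare all three sentences pairwise.
# model.similarity applies the model's configured similarity function (cosine here).
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # (3, 3) matrix of pairwise scores
```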
 
  ### Metrics

  #### Binary Classification
+ * Dataset: `custom-arc-semantics-data-jp`
  * Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

+ | Metric | Value |
+ |:-----------------------------|:-----------|
+ | cosine_accuracy | 0.9091 |
+ | cosine_accuracy_threshold | 0.4786 |
+ | cosine_f1 | 0.9341 |
+ | cosine_f1_threshold | 0.4786 |
+ | cosine_precision | 0.9176 |
+ | cosine_recall | 0.9512 |
+ | cosine_ap | 0.9288 |
+ | dot_accuracy | 0.9008 |
+ | dot_accuracy_threshold | 234.108 |
+ | dot_f1 | 0.9302 |
+ | dot_f1_threshold | 209.4736 |
+ | dot_precision | 0.8889 |
+ | dot_recall | 0.9756 |
+ | dot_ap | 0.9636 |
+ | manhattan_accuracy | 0.9008 |
+ | manhattan_accuracy_threshold | 558.3782 |
+ | manhattan_f1 | 0.9302 |
+ | manhattan_f1_threshold | 580.8164 |
+ | manhattan_precision | 0.8889 |
+ | manhattan_recall | 0.9756 |
+ | manhattan_ap | 0.9285 |
+ | euclidean_accuracy | 0.9091 |
+ | euclidean_accuracy_threshold | 24.1309 |
+ | euclidean_f1 | 0.9341 |
+ | euclidean_f1_threshold | 24.1309 |
+ | euclidean_precision | 0.9176 |
+ | euclidean_recall | 0.9512 |
+ | euclidean_ap | 0.9288 |
+ | max_accuracy | 0.9091 |
+ | max_accuracy_threshold | 558.3782 |
+ | max_f1 | 0.9341 |
+ | max_f1_threshold | 580.8164 |
+ | max_precision | 0.9176 |
+ | max_recall | 0.9756 |
+ | **max_ap** | **0.9636** |
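
The thresholds in this table are the operating points the evaluator selected on the evaluation pairs. A minimal sketch of applying the reported `cosine_accuracy_threshold` (about 0.4786) to a new pair; the hard-coded constant and the example pair are illustrative only and should be re-derived for other data:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("LeoChiuu/sbert-base-ja-arc")

# cosine_accuracy_threshold from the table above (rounded); not a universal constant.
COSINE_THRESHOLD = 0.4786

emb = model.encode(["ベッドにスカーフはある?", "スカーフはベッドにある?"])
score = float(model.similarity(emb[0:1], emb[1:2])[0][0])  # cosine similarity
print(score, score >= COSINE_THRESHOLD)  # True -> predicted paraphrase (label 1)
```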

  <!--
  ## Bias, Risks and Limitations
 

  ### Training Dataset

+ #### csv

+ * Dataset: csv
+ * Size: 601 training samples
  * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
+ * Approximate statistics based on the first 601 samples:
+ | | text1 | text2 | label |
+ |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 7.99 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.05 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~33.96%</li><li>1: ~66.04%</li></ul> |
  * Samples:
+ | text1 | text2 | label |
+ |:------------------------|:----------------------|:---------------|
+ | <code>どっちがいいと思う?</code> | <code>どっちが欲しい?</code> | <code>1</code> |
+ | <code>かわいいね</code> | <code>ばか</code> | <code>0</code> |
+ | <code>別のは選べないの?</code> | <code>なにが欲しい?</code> | <code>0</code> |
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
  "scale": 20.0,
+ "similarity_fct": "pairwise_cos_sim"
  }
  ```
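
For context, a minimal sketch of constructing a CoSENTLoss with these parameters in sentence-transformers; the two rows are shaped like the samples above, with a 0/1 label as the similarity target:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("colorfulscoop/sbert-base-ja")

# Rows shaped like the samples above: two texts plus a 0/1 similarity label.
train_dataset = Dataset.from_dict({
    "text1": ["どっちがいいと思う?", "かわいいね"],
    "text2": ["どっちが欲しい?", "ばか"],
    "label": [1, 0],
})

# scale=20.0 as in the JSON above; pairwise_cos_sim is CoSENTLoss's default similarity_fct.
loss = losses.CoSENTLoss(model, scale=20.0)
```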

  ### Evaluation Dataset

+ #### csv

+ * Dataset: csv
+ * Size: 601 evaluation samples
  * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
+ * Approximate statistics based on the first 601 samples:
+ | | text1 | text2 | label |
+ |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
+ | type | string | string | int |
+ | details | <ul><li>min: 4 tokens</li><li>mean: 8.26 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 7.94 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~32.23%</li><li>1: ~67.77%</li></ul> |
  * Samples:
+ | text1 | text2 | label |
+ |:-----------------------|:------------------------|:---------------|
+ | <code>誰かが魔法を使った</code> | <code>誰かがが魔法をかけた</code> | <code>1</code> |
+ | <code>これが花</code> | <code>ぬいぐるみが花</code> | <code>1</code> |
+ | <code>夜ご飯を作る前</code> | <code>夜ご飯を食べる前</code> | <code>1</code> |
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
  "scale": 20.0,
+ "similarity_fct": "pairwise_cos_sim"
  }
  ```
 
 
  </details>

  ### Training Logs
+ | Epoch | Step | Training Loss | loss | custom-arc-semantics-data-jp_max_ap |
+ |:-------:|:----:|:-------------:|:------:|:-----------------------------------:|
+ | None | 0 | - | - | 0.8596 |
+ | 1.0167 | 61 | 2.775 | 2.0852 | 0.8927 |
+ | 2.0167 | 122 | 1.213 | 1.7433 | 0.9291 |
+ | 3.0167 | 183 | 0.5703 | 1.5724 | 0.9379 |
+ | 4.0167 | 244 | 0.4603 | 1.6239 | 0.9432 |
+ | 5.0167 | 305 | 0.3672 | 1.6444 | 0.9523 |
+ | 6.0167 | 366 | 0.2947 | 1.6222 | 0.9603 |
+ | 7.0167 | 427 | 0.2255 | 1.7302 | 0.9619 |
+ | 8.0167 | 488 | 0.1678 | 1.7360 | 0.9633 |
+ | 9.0167 | 549 | 0.1163 | 1.8029 | 0.9620 |
+ | 10.0167 | 610 | 0.0706 | 1.8986 | 0.9639 |
+ | 11.0167 | 671 | 0.0389 | 1.9671 | 0.9624 |
+ | 12.0167 | 732 | 0.0333 | 2.0375 | 0.9636 |
+ | 12.8 | 780 | 0.0618 | 1.9938 | 0.9636 |
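
The step counts above are consistent with a standard `SentenceTransformerTrainer` run; 61 steps per epoch over 601 pairs suggests a batch size of 10, though neither the batch size nor the other arguments are recorded in this card. A hypothetical sketch under those assumptions:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("colorfulscoop/sbert-base-ja")
loss = losses.CoSENTLoss(model, scale=20.0)

# Tiny illustrative pairs; the actual 601-row csv dataset is not included in the card.
pairs = {"text1": ["どっちがいいと思う?"], "text2": ["どっちが欲しい?"], "label": [1]}
train_dataset = Dataset.from_dict(pairs)
eval_dataset = Dataset.from_dict(pairs)

# Only the epoch count is implied by the log above; everything else is assumed.
args = SentenceTransformerTrainingArguments(
    output_dir="sbert-base-ja-arc",  # hypothetical output path
    num_train_epochs=13,
    per_device_train_batch_size=10,  # assumed from the 61 steps/epoch shape
    eval_strategy="epoch",           # evaluate once per epoch, as in the log
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```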

  ### Framework Versions
  - Python: 3.10.14
+ - Sentence Transformers: 3.1.0
  - Transformers: 4.44.2
  - PyTorch: 2.4.1+cu121
+ - Accelerate: 0.34.2
  - Datasets: 2.20.0
  - Tokenizers: 0.19.1
 
 
  }
  ```

+ #### CoSENTLoss
  ```bibtex
+ @online{kexuefm-8847,
+ title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
+ author={Su Jianlin},
+ year={2022},
+ month={Jan},
+ url={https://kexue.fm/archives/8847},
  }
  ```
 
config_sentence_transformers.json CHANGED
@@ -1,6 +1,6 @@
  {
  "__version__": {
- "sentence_transformers": "3.0.1",
+ "sentence_transformers": "3.1.0",
  "transformers": "4.44.2",
  "pytorch": "2.4.1+cu121"
  },
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:039721c1fad0c0fa6d3c342ca79d7eb552b0005e9c34c0bcc96c0455e340a82d
+ oid sha256:0900bde2010bc7e2818d70b31e4bbe7106bdb90d812cf99c8f8921b69fb1d8f2
  size 442491744