Add new SentenceTransformer model.


Files changed (3)

README.md +158 -160
config_sentence_transformers.json +1 -1
model.safetensors +1 -1

README.md CHANGED

@@ -1,7 +1,5 @@
 ---
 base_model: colorfulscoop/sbert-base-ja
-datasets: []
-language: []
 library_name: sentence-transformers
 metrics:
 - cosine_accuracy
@@ -45,34 +43,34 @@ tags:
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
-- dataset_size:228
-- loss:MultipleNegativesRankingLoss
 widget:
-- source_sentence: 家の外を探そう
   sentences:
-  - ベットを調べよう
-  - 何を作ったの？
-  - 外を見てみよう
-- source_sentence: 物の姿を変える魔法が使える村人を知っている？
   sentences:
-  - 中を見てみよう
-  - ベッドにある？
-  - 物体の形を変えられる魔法使いを知っている？
-- source_sentence: ぬいぐるみが花
   sentences:
-  - リリアンはどんな呪文が使えるの？
-  - ぬいぐるみ
-  - 花がぬいぐるみに変えられている
-- source_sentence: ベッドにスカーフはある？
   sentences:
-  - 井戸へ行ったことある？
-  - どっちも要らない
-  - スカーフはベッドにある？
-- source_sentence: キャンドル頂戴
   sentences:
-  - 祭壇の些細な違和感ってなに？
-  - やっぱり、キャンドルがいい
-  - テーブルを調べよう
 model-index:
 - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
   results:
@@ -80,119 +78,119 @@ model-index:
       type: binary-classification
       name: Binary Classification
     dataset:
-      name: custom arc semantics data
-      type: custom-arc-semantics-data
     metrics:
     - type: cosine_accuracy
-      value: 0.9827586206896551
       name: Cosine Accuracy
     - type: cosine_accuracy_threshold
-      value: 0.2341834306716919
       name: Cosine Accuracy Threshold
     - type: cosine_f1
-      value: 0.9913043478260869
       name: Cosine F1
     - type: cosine_f1_threshold
-      value: 0.2341834306716919
       name: Cosine F1 Threshold
     - type: cosine_precision
-      value: 1.0
       name: Cosine Precision
     - type: cosine_recall
-      value: 0.9827586206896551
       name: Cosine Recall
     - type: cosine_ap
-      value: 1.0
       name: Cosine Ap
     - type: dot_accuracy
-      value: 0.9827586206896551
       name: Dot Accuracy
     - type: dot_accuracy_threshold
-      value: 134.29324340820312
       name: Dot Accuracy Threshold
     - type: dot_f1
-      value: 0.9913043478260869
       name: Dot F1
     - type: dot_f1_threshold
-      value: 134.29324340820312
       name: Dot F1 Threshold
     - type: dot_precision
-      value: 1.0
       name: Dot Precision
     - type: dot_recall
-      value: 0.9827586206896551
       name: Dot Recall
     - type: dot_ap
-      value: 1.0
       name: Dot Ap
     - type: manhattan_accuracy
-      value: 0.9827586206896551
       name: Manhattan Accuracy
     - type: manhattan_accuracy_threshold
-      value: 644.1650390625
       name: Manhattan Accuracy Threshold
     - type: manhattan_f1
-      value: 0.9913043478260869
       name: Manhattan F1
     - type: manhattan_f1_threshold
-      value: 644.1650390625
       name: Manhattan F1 Threshold
     - type: manhattan_precision
-      value: 1.0
       name: Manhattan Precision
     - type: manhattan_recall
-      value: 0.9827586206896551
       name: Manhattan Recall
     - type: manhattan_ap
-      value: 1.0
       name: Manhattan Ap
     - type: euclidean_accuracy
-      value: 0.9827586206896551
       name: Euclidean Accuracy
     - type: euclidean_accuracy_threshold
-      value: 29.542858123779297
       name: Euclidean Accuracy Threshold
     - type: euclidean_f1
-      value: 0.9913043478260869
       name: Euclidean F1
     - type: euclidean_f1_threshold
-      value: 29.542858123779297
       name: Euclidean F1 Threshold
     - type: euclidean_precision
-      value: 1.0
       name: Euclidean Precision
     - type: euclidean_recall
-      value: 0.9827586206896551
       name: Euclidean Recall
     - type: euclidean_ap
-      value: 1.0
       name: Euclidean Ap
     - type: max_accuracy
-      value: 0.9827586206896551
       name: Max Accuracy
     - type: max_accuracy_threshold
-      value: 644.1650390625
       name: Max Accuracy Threshold
     - type: max_f1
-      value: 0.9913043478260869
       name: Max F1
     - type: max_f1_threshold
-      value: 644.1650390625
       name: Max F1 Threshold
     - type: max_precision
-      value: 1.0
       name: Max Precision
     - type: max_recall
-      value: 0.9827586206896551
       name: Max Recall
     - type: max_ap
-      value: 1.0
       name: Max Ap
 ---
 # SentenceTransformer based on colorfulscoop/sbert-base-ja
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [colorfulscoop/sbert-base-ja](https://huggingface.co/colorfulscoop/sbert-base-ja). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 ## Model Details
@@ -202,7 +200,8 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [c
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 768 tokens
 - **Similarity Function:** Cosine Similarity
-<!-- - **Training Dataset:** Unknown -->
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
@@ -239,9 +238,9 @@ from sentence_transformers import SentenceTransformer
 model = SentenceTransformer("LeoChiuu/sbert-base-ja-arc")
 # Run inference
 sentences = [
-    'キャンドル頂戴',
-    'やっぱり、キャンドルがいい',
-    'テーブルを調べよう',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -282,46 +281,46 @@ You can finetune this model on your own dataset.
 ### Metrics
 #### Binary Classification
-* Dataset: `custom-arc-semantics-data`
 * Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
-| Metric                       | Value    |
-|:-----------------------------|:---------|
-| cosine_accuracy              | 0.9828   |
-| cosine_accuracy_threshold    | 0.2342   |
-| cosine_f1                    | 0.9913   |
-| cosine_f1_threshold          | 0.2342   |
-| cosine_precision             | 1.0      |
-| cosine_recall                | 0.9828   |
-| cosine_ap                    | 1.0      |
-| dot_accuracy                 | 0.9828   |
-| dot_accuracy_threshold       | 134.2932 |
-| dot_f1                       | 0.9913   |
-| dot_f1_threshold             | 134.2932 |
-| dot_precision                | 1.0      |
-| dot_recall                   | 0.9828   |
-| dot_ap                       | 1.0      |
-| manhattan_accuracy           | 0.9828   |
-| manhattan_accuracy_threshold | 644.165  |
-| manhattan_f1                 | 0.9913   |
-| manhattan_f1_threshold       | 644.165  |
-| manhattan_precision          | 1.0      |
-| manhattan_recall             | 0.9828   |
-| manhattan_ap                 | 1.0      |
-| euclidean_accuracy           | 0.9828   |
-| euclidean_accuracy_threshold | 29.5429  |
-| euclidean_f1                 | 0.9913   |
-| euclidean_f1_threshold       | 29.5429  |
-| euclidean_precision          | 1.0      |
-| euclidean_recall             | 0.9828   |
-| euclidean_ap                 | 1.0      |
-| max_accuracy                 | 0.9828   |
-| max_accuracy_threshold       | 644.165  |
-| max_f1                       | 0.9913   |
-| max_f1_threshold             | 644.165  |
-| max_precision                | 1.0      |
-| max_recall                   | 0.9828   |
-| **max_ap**                   | **1.0**  |
 <!--
 ## Bias, Risks and Limitations
@@ -339,53 +338,53 @@ You can finetune this model on your own dataset.
 ### Training Dataset
-#### Unnamed Dataset
-* Size: 228 training samples
 * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
-* Approximate statistics based on the first 1000 samples:
-  |         | text1                                                                            | text2                                                                            | label                        |
-  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:-----------------------------|
-  | type    | string                                                                           | string                                                                           | int                          |
-  | details | <ul><li>min: 4 tokens</li><li>mean: 8.28 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.63 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
 * Samples:
-  | text1                       | text2                   | label          |
-  |:----------------------------|:------------------------|:---------------|
-  | <code>キャンドルを用意して</code>     | <code>ロウソク</code>       | <code>1</code> |
-  | <code>なんで話せるの？</code>       | <code>なんでしゃべれるの？</code> | <code>1</code> |
-  | <code>それは物の見た目を変える魔法</code> | <code>物の見た目を変える</code>  | <code>1</code> |
-* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
   ```json
   {
       "scale": 20.0,
-      "similarity_fct": "cos_sim"
   }
   ```
 ### Evaluation Dataset
-#### Unnamed Dataset
-* Size: 58 evaluation samples
 * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
-* Approximate statistics based on the first 1000 samples:
-  |         | text1                                                                            | text2                                                                            | label                        |
-  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:-----------------------------|
-  | type    | string                                                                           | string                                                                           | int                          |
-  | details | <ul><li>min: 4 tokens</li><li>mean: 8.33 tokens</li><li>max: 13 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.38 tokens</li><li>max: 13 tokens</li></ul> | <ul><li>1: 100.00%</li></ul> |
 * Samples:
-  | text1                       | text2                       | label          |
-  |:----------------------------|:----------------------------|:---------------|
-  | <code>雲より高くってどこ？</code>     | <code>雲より高くってなに？</code>     | <code>1</code> |
-  | <code>気にスカーフがひっかかってる</code> | <code>キにスカーフが引っかかってる</code> | <code>1</code> |
-  | <code>夕飯が辛かったから</code>      | <code>夕飯に辛いスープを飲んだから</code> | <code>1</code> |
-* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
   ```json
   {
       "scale": 20.0,
-      "similarity_fct": "cos_sim"
   }
   ```
@@ -517,30 +516,30 @@ You can finetune this model on your own dataset.
 </details>
 ### Training Logs
-| Epoch | Step | Training Loss | loss   | custom-arc-semantics-data_max_ap |
-|:-----:|:----:|:-------------:|:------:|:--------------------------------:|
-| None  | 0    | -             | -      | 1.0                              |
-| 1.0   | 29   | 0.6181        | 0.3774 | 1.0                              |
-| 2.0   | 58   | 0.2538        | 0.3356 | 1.0                              |
-| 3.0   | 87   | 0.063         | 0.3885 | 1.0                              |
-| 4.0   | 116  | 0.015         | 0.4536 | 1.0                              |
-| 5.0   | 145  | 0.0061        | 0.4475 | 1.0                              |
-| 6.0   | 174  | 0.002         | 0.4805 | 1.0                              |
-| 7.0   | 203  | 0.0015        | 0.4826 | 1.0                              |
-| 8.0   | 232  | 0.0012        | 0.4831 | 1.0                              |
-| 9.0   | 261  | 0.0008        | 0.4848 | 1.0                              |
-| 10.0  | 290  | 0.0006        | 0.4862 | 1.0                              |
-| 11.0  | 319  | 0.0006        | 0.4883 | 1.0                              |
-| 12.0  | 348  | 0.0007        | 0.4903 | 1.0                              |
-| 13.0  | 377  | 0.0006        | 0.4912 | 1.0                              |
 ### Framework Versions
 - Python: 3.10.14
-- Sentence Transformers: 3.0.1
 - Transformers: 4.44.2
 - PyTorch: 2.4.1+cu121
-- Accelerate: 0.34.0
 - Datasets: 2.20.0
 - Tokenizers: 0.19.1
@@ -561,15 +560,14 @@ You can finetune this model on your own dataset.
 }
 ```
-#### MultipleNegativesRankingLoss
 ```bibtex
-@misc{henderson2017efficient,
-    title={Efficient Natural Language Response Suggestion for Smart Reply},
-    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
-    year={2017},
-    eprint={1705.00652},
-    archivePrefix={arXiv},
-    primaryClass={cs.CL}
 }
 ```

 ---
 base_model: colorfulscoop/sbert-base-ja
 library_name: sentence-transformers
 metrics:
 - cosine_accuracy
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
+- dataset_size:601
+- loss:CoSENTLoss
 widget:
+- source_sentence: だれかが魔法で花をぬいぐるみに変えた
   sentences:
+  - 誰かが魔法の呪文で花をぬいぐるみに変えた
+  - 村長は誰？
+  - どこ？
+- source_sentence: 暖炉にスカーフを置いた？
   sentences:
+  - 魔法をかけられる人
+  - ロウソク
+  - 晩ご飯のとき
+- source_sentence: あほ
   sentences:
+  - 調子はどう？
+  - きらい
+  - オッケー
+- source_sentence: 猫のぬいぐるみ
   sentences:
+  - 赤い染みが皿にあった
+  - 好きじゃないの？
+  - ぬいぐるみ
+- source_sentence: リリアンはどんな呪文が使えるの？
   sentences:
+  - あなたは魔法使い？
+  - 姿かたちを変える魔法
+  - どのくらいのサイズ？
 model-index:
 - name: SentenceTransformer based on colorfulscoop/sbert-base-ja
   results:
       type: binary-classification
       name: Binary Classification
     dataset:
+      name: custom arc semantics data jp
+      type: custom-arc-semantics-data-jp
     metrics:
     - type: cosine_accuracy
+      value: 0.9090909090909091
       name: Cosine Accuracy
     - type: cosine_accuracy_threshold
+      value: 0.4785935878753662
       name: Cosine Accuracy Threshold
     - type: cosine_f1
+      value: 0.9341317365269461
       name: Cosine F1
     - type: cosine_f1_threshold
+      value: 0.4785935878753662
       name: Cosine F1 Threshold
     - type: cosine_precision
+      value: 0.9176470588235294
       name: Cosine Precision
     - type: cosine_recall
+      value: 0.9512195121951219
       name: Cosine Recall
     - type: cosine_ap
+      value: 0.9287829842425579
       name: Cosine Ap
     - type: dot_accuracy
+      value: 0.9008264462809917
       name: Dot Accuracy
     - type: dot_accuracy_threshold
+      value: 234.1079864501953
       name: Dot Accuracy Threshold
     - type: dot_f1
+      value: 0.9302325581395349
       name: Dot F1
     - type: dot_f1_threshold
+      value: 209.4735870361328
       name: Dot F1 Threshold
     - type: dot_precision
+      value: 0.8888888888888888
       name: Dot Precision
     - type: dot_recall
+      value: 0.975609756097561
       name: Dot Recall
     - type: dot_ap
+      value: 0.9635932205663708
       name: Dot Ap
     - type: manhattan_accuracy
+      value: 0.9008264462809917
       name: Manhattan Accuracy
     - type: manhattan_accuracy_threshold
+      value: 558.378173828125
       name: Manhattan Accuracy Threshold
     - type: manhattan_f1
+      value: 0.9302325581395349
       name: Manhattan F1
     - type: manhattan_f1_threshold
+      value: 580.81640625
       name: Manhattan F1 Threshold
     - type: manhattan_precision
+      value: 0.8888888888888888
       name: Manhattan Precision
     - type: manhattan_recall
+      value: 0.975609756097561
       name: Manhattan Recall
     - type: manhattan_ap
+      value: 0.92846470083454
       name: Manhattan Ap
     - type: euclidean_accuracy
+      value: 0.9090909090909091
       name: Euclidean Accuracy
     - type: euclidean_accuracy_threshold
+      value: 24.130870819091797
       name: Euclidean Accuracy Threshold
     - type: euclidean_f1
+      value: 0.9341317365269461
       name: Euclidean F1
     - type: euclidean_f1_threshold
+      value: 24.130870819091797
       name: Euclidean F1 Threshold
     - type: euclidean_precision
+      value: 0.9176470588235294
       name: Euclidean Precision
     - type: euclidean_recall
+      value: 0.9512195121951219
       name: Euclidean Recall
     - type: euclidean_ap
+      value: 0.9287963056027329
       name: Euclidean Ap
     - type: max_accuracy
+      value: 0.9090909090909091
       name: Max Accuracy
     - type: max_accuracy_threshold
+      value: 558.378173828125
       name: Max Accuracy Threshold
     - type: max_f1
+      value: 0.9341317365269461
       name: Max F1
     - type: max_f1_threshold
+      value: 580.81640625
       name: Max F1 Threshold
     - type: max_precision
+      value: 0.9176470588235294
       name: Max Precision
     - type: max_recall
+      value: 0.975609756097561
       name: Max Recall
     - type: max_ap
+      value: 0.9635932205663708
       name: Max Ap
 ---
 # SentenceTransformer based on colorfulscoop/sbert-base-ja
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [colorfulscoop/sbert-base-ja](https://huggingface.co/colorfulscoop/sbert-base-ja) on the csv dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 ## Model Details
 - **Maximum Sequence Length:** 512 tokens
 - **Output Dimensionality:** 768 tokens
 - **Similarity Function:** Cosine Similarity
+- **Training Dataset:**
+    - csv
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
 model = SentenceTransformer("LeoChiuu/sbert-base-ja-arc")
 # Run inference
 sentences = [
+    'リリアンはどんな呪文が使えるの？',
+    '姿かたちを変える魔法',
+    'どのくらいのサイズ？',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 ### Metrics
 #### Binary Classification
+* Dataset: `custom-arc-semantics-data-jp`
 * Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+| Metric                       | Value      |
+|:-----------------------------|:-----------|
+| cosine_accuracy              | 0.9091     |
+| cosine_accuracy_threshold    | 0.4786     |
+| cosine_f1                    | 0.9341     |
+| cosine_f1_threshold          | 0.4786     |
+| cosine_precision             | 0.9176     |
+| cosine_recall                | 0.9512     |
+| cosine_ap                    | 0.9288     |
+| dot_accuracy                 | 0.9008     |
+| dot_accuracy_threshold       | 234.108    |
+| dot_f1                       | 0.9302     |
+| dot_f1_threshold             | 209.4736   |
+| dot_precision                | 0.8889     |
+| dot_recall                   | 0.9756     |
+| dot_ap                       | 0.9636     |
+| manhattan_accuracy           | 0.9008     |
+| manhattan_accuracy_threshold | 558.3782   |
+| manhattan_f1                 | 0.9302     |
+| manhattan_f1_threshold       | 580.8164   |
+| manhattan_precision          | 0.8889     |
+| manhattan_recall             | 0.9756     |
+| manhattan_ap                 | 0.9285     |
+| euclidean_accuracy           | 0.9091     |
+| euclidean_accuracy_threshold | 24.1309    |
+| euclidean_f1                 | 0.9341     |
+| euclidean_f1_threshold       | 24.1309    |
+| euclidean_precision          | 0.9176     |
+| euclidean_recall             | 0.9512     |
+| euclidean_ap                 | 0.9288     |
+| max_accuracy                 | 0.9091     |
+| max_accuracy_threshold       | 558.3782   |
+| max_f1                       | 0.9341     |
+| max_f1_threshold             | 580.8164   |
+| max_precision                | 0.9176     |
+| max_recall                   | 0.9756     |
+| **max_ap**                   | **0.9636** |
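As a consistency check on the table above, F1 is the harmonic mean of precision and recall, so the reported `cosine_f1` and `dot_f1` can be recomputed from the precision and recall values in the model-index metadata. This is a minimal sketch in plain Python; the helper name `f1_score` is illustrative and is not part of sentence-transformers.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Full-precision values copied from the model-index metadata in this commit.
cosine_f1 = f1_score(0.9176470588235294, 0.9512195121951219)
dot_f1 = f1_score(0.8888888888888888, 0.975609756097561)

print(round(cosine_f1, 4), round(dot_f1, 4))
```

Both results agree with the table to four decimal places, which suggests the precision, recall, and F1 rows for each similarity function were taken at the same decision threshold.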
 <!--
 ## Bias, Risks and Limitations
 ### Training Dataset
+#### csv
+* Dataset: csv
+* Size: 601 training samples
 * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
+* Approximate statistics based on the first 601 samples:
+  |         | text1                                                                            | text2                                                                            | label                                           |
+  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
+  | type    | string                                                                           | string                                                                           | int                                             |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 7.99 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 8.05 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~33.96%</li><li>1: ~66.04%</li></ul> |
 * Samples:
+  | text1                   | text2                 | label          |
+  |:------------------------|:----------------------|:---------------|
+  | <code>どっちがいいと思う？</code> | <code>どっちが欲しい？</code> | <code>1</code> |
+  | <code>かわいいね</code>      | <code>ばか</code>       | <code>0</code> |
+  | <code>別のは選べないの？</code>  | <code>なにが欲しい？</code>  | <code>0</code> |
+* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
   ```json
   {
       "scale": 20.0,
+      "similarity_fct": "pairwise_cos_sim"
   }
   ```
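To make the role of `scale` concrete, here is a rough sketch of the CoSENT objective as it is usually stated: for every two sentence pairs where the first has a higher label than the second, the first should also have the higher cosine similarity, and violations are penalised through a log-sum-exp. This is written from the published formula, not copied from the sentence-transformers source; the function name `cosent_loss` and the toy inputs are assumptions, and the cosine similarities are taken as precomputed numbers rather than derived from embeddings.

```python
import math


def cosent_loss(cos_sims, labels, scale=20.0):
    """CoSENT-style ranking loss over a batch of sentence pairs.

    cos_sims[i] is the cosine similarity of sentence pair i and labels[i]
    its gold label. Returns log(1 + sum(exp(scale * (cos_j - cos_i))))
    over all (i, j) with labels[i] > labels[j].
    """
    terms = [
        scale * (cos_j - cos_i)
        for cos_i, label_i in zip(cos_sims, labels)
        for cos_j, label_j in zip(cos_sims, labels)
        if label_i > label_j
    ]
    # log(1 + sum(exp(t))) is a log-sum-exp over [0] + terms;
    # subtracting the maximum keeps exp() from overflowing.
    terms = [0.0] + terms
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Toy batch (hypothetical values): two positive pairs and one negative pair.
well_ranked = cosent_loss([0.9, 0.8, 0.1], [1, 1, 0])
mis_ranked = cosent_loss([0.1, 0.8, 0.9], [1, 1, 0])
print(well_ranked, mis_ranked)
```

With `scale` at 20.0, a batch whose positive pairs already score well above the negative pair contributes almost nothing to the loss, while a batch ranked the wrong way round is penalised roughly in proportion to the similarity gap.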
 ### Evaluation Dataset
+#### csv
+* Dataset: csv
+* Size: 601 evaluation samples
 * Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
+* Approximate statistics based on the first 601 samples:
+  |         | text1                                                                            | text2                                                                            | label                                           |
+  |:--------|:---------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------------------------------------|
+  | type    | string                                                                           | string                                                                           | int                                             |
+  | details | <ul><li>min: 4 tokens</li><li>mean: 8.26 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 7.94 tokens</li><li>max: 14 tokens</li></ul> | <ul><li>0: ~32.23%</li><li>1: ~67.77%</li></ul> |
 * Samples:
+  | text1                  | text2                   | label          |
+  |:-----------------------|:------------------------|:---------------|
+  | <code>誰かが魔法を使った</code> | <code>誰かがが魔法をかけた</code> | <code>1</code> |
+  | <code>これが花</code>      | <code>ぬいぐるみが花</code>    | <code>1</code> |
+  | <code>夜ご飯を作る前</code>   | <code>夜ご飯を食べる前</code>   | <code>1</code> |
+* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
   ```json
   {
       "scale": 20.0,
+      "similarity_fct": "pairwise_cos_sim"
   }
   ```
 </details>
 ### Training Logs
+| Epoch   | Step | Training Loss | loss   | custom-arc-semantics-data-jp_max_ap |
+|:-------:|:----:|:-------------:|:------:|:-----------------------------------:|
+| None    | 0    | -             | -      | 0.8596                              |
+| 1.0167  | 61   | 2.775         | 2.0852 | 0.8927                              |
+| 2.0167  | 122  | 1.213         | 1.7433 | 0.9291                              |
+| 3.0167  | 183  | 0.5703        | 1.5724 | 0.9379                              |
+| 4.0167  | 244  | 0.4603        | 1.6239 | 0.9432                              |
+| 5.0167  | 305  | 0.3672        | 1.6444 | 0.9523                              |
+| 6.0167  | 366  | 0.2947        | 1.6222 | 0.9603                              |
+| 7.0167  | 427  | 0.2255        | 1.7302 | 0.9619                              |
+| 8.0167  | 488  | 0.1678        | 1.7360 | 0.9633                              |
+| 9.0167  | 549  | 0.1163        | 1.8029 | 0.9620                              |
+| 10.0167 | 610  | 0.0706        | 1.8986 | 0.9639                              |
+| 11.0167 | 671  | 0.0389        | 1.9671 | 0.9624                              |
+| 12.0167 | 732  | 0.0333        | 2.0375 | 0.9636                              |
+| 12.8    | 780  | 0.0618        | 1.9938 | 0.9636                              |
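In the log above, the `loss` column (presumably the evaluation loss) reaches its minimum early while `max_ap` keeps improving for several more epochs, so the "best" step depends on which criterion is used. The short sketch below only re-reads the numbers from the table; the variable names are illustrative, and it makes no claim about which checkpoint was actually uploaded.

```python
# (step, eval loss, max_ap) rows copied from the training log in this commit.
log = [
    (61, 2.0852, 0.8927),
    (122, 1.7433, 0.9291),
    (183, 1.5724, 0.9379),
    (244, 1.6239, 0.9432),
    (305, 1.6444, 0.9523),
    (366, 1.6222, 0.9603),
    (427, 1.7302, 0.9619),
    (488, 1.7360, 0.9633),
    (549, 1.8029, 0.9620),
    (610, 1.8986, 0.9639),
    (671, 1.9671, 0.9624),
    (732, 2.0375, 0.9636),
    (780, 1.9938, 0.9636),
]

best_by_loss = min(log, key=lambda row: row[1])  # lowest evaluation loss
best_by_ap = max(log, key=lambda row: row[2])    # highest max_ap

print("lowest loss at step", best_by_loss[0])
print("highest max_ap at step", best_by_ap[0])
```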
 ### Framework Versions
 - Python: 3.10.14
+- Sentence Transformers: 3.1.0
 - Transformers: 4.44.2
 - PyTorch: 2.4.1+cu121
+- Accelerate: 0.34.2
 - Datasets: 2.20.0
 - Tokenizers: 0.19.1
 }
 ```
+#### CoSENTLoss
 ```bibtex
+@online{kexuefm-8847,
+    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
+    author={Su Jianlin},
+    year={2022},
+    month={Jan},
+    url={https://kexue.fm/archives/8847},
 }
 ```

config_sentence_transformers.json CHANGED

@@ -1,6 +1,6 @@
 {
   "__version__": {
-    "sentence_transformers": "3.0.1",
     "transformers": "4.44.2",
     "pytorch": "2.4.1+cu121"
   },

 {
   "__version__": {
+    "sentence_transformers": "3.1.0",
     "transformers": "4.44.2",
     "pytorch": "2.4.1+cu121"
   },

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:039721c1fad0c0fa6d3c342ca79d7eb552b0005e9c34c0bcc96c0455e340a82d
 size 442491744

 version https://git-lfs.github.com/spec/v1
+oid sha256:0900bde2010bc7e2818d70b31e4bbe7106bdb90d812cf99c8f8921b69fb1d8f2
 size 442491744