Files changed (1)
  1. README.md +6 -27
README.md CHANGED
@@ -8,31 +8,18 @@ tags:
  - transformers
  license: apache-2.0
  widget:
- - source_sentence: "那个人很开心"
- sentences:
- - "那个人非常开心"
- - "那只猫很开心"
- - "那个人在吃东西"
+ source_sentence: "那个人很开心"
+ sentences:
+ - 那个人非常开心
+ - 那只猫很开心
+ - 那个人在吃东西
  ---

  # Chinese Sentence BERT

  ## Model description

- This is the sentence embedding model pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658). Besides, the model could also be pre-trained by [TencentPretrain](https://github.com/Tencent/TencentPretrain) introduced in [this paper](https://arxiv.org/abs/2212.06385), which inherits UER-py to support models with parameters above one billion, and extends it to a multimodal pre-training framework.
-
- ## How to use
-
- You can use this model to extract sentence embeddings for sentence similarity task. We use cosine distance to calculate the embedding similarity here:
-
- ```python
- >>> from sentence_transformers import SentenceTransformer
- >>> model = SentenceTransformer('uer/sbert-base-chinese-nli')
- >>> sentences = ['那个人很开心', '那个人非常开心']
- >>> sentence_embeddings = model.encode(sentences)
- >>> from sklearn.metrics.pairwise import paired_cosine_distances
- >>> cosine_score = 1 - paired_cosine_distances([sentence_embeddings[0]],[sentence_embeddings[1]])
- ```
+ This is the sentence embedding model pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658).

  ## Training data

@@ -68,7 +55,6 @@ python3 scripts/convert_sbert_from_uer_to_huggingface.py --input_model_path mode
  journal={arXiv preprint arXiv:1908.10084},
  year={2019}
  }
-
  @article{zhao2019uer,
  title={UER: An Open-Source Toolkit for Pre-training Models},
  author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
@@ -76,11 +62,4 @@ python3 scripts/convert_sbert_from_uer_to_huggingface.py --input_model_path mode
  pages={241},
  year={2019}
  }
-
- @article{zhao2023tencentpretrain,
- title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
- author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
- journal={ACL 2023},
- pages={217},
- year={2023}
  ```
 
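For reference, the usage snippet that this change removes should still run against the published checkpoint. Below is a minimal, runnable sketch of the same cosine-similarity computation, assuming `sentence-transformers` and `scikit-learn` are installed:

```python
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import paired_cosine_distances

# Load the Chinese Sentence BERT checkpoint from the Hugging Face Hub.
model = SentenceTransformer('uer/sbert-base-chinese-nli')

# "That person is very happy" / "That person is extremely happy"
sentences = ['那个人很开心', '那个人非常开心']
embeddings = model.encode(sentences)

# Cosine similarity is 1 minus the paired cosine distance between the two embeddings.
cosine_score = 1 - paired_cosine_distances([embeddings[0]], [embeddings[1]])
print(cosine_score)
```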