Jinkin committed on
Commit
ac89eb8
1 Parent(s): cf5524e

update usage scripts

Files changed (1)
  1. README.md +26 -0
README.md CHANGED
@@ -1083,3 +1083,29 @@ we provide scripts in "eval" folder for results reproducing.
  | [bge-large-zh-no-instruct]| 1.3 | 1024 | 512 | 63.4 | 68.58 | 50.01 | 76.77 | 64.9 | 70.54 | 53 |
  | [bge-base-zh]| 0.41 | 768 | 512 | 62.8 | 67.07 | 47.64 | 77.5 | 64.91 | 69.53 | 54.12 |
 
+ ## Usage
+ The piccolo model can be called easily through the sentence-transformers package:
+ ```python
+ # For s2s (short-to-short) datasets, piccolo can be used as below.
+ # This is the general usage pattern.
+ from sentence_transformers import SentenceTransformer
+ sentences = ["数据1", "数据2"]
+ model = SentenceTransformer('sensenova/piccolo-base-zh')
+ embeddings_1 = model.encode(sentences, normalize_embeddings=True)
+ embeddings_2 = model.encode(sentences, normalize_embeddings=True)
+ similarity = embeddings_1 @ embeddings_2.T
+ print(similarity)
+
+ # For s2p (short-to-long) datasets, we recommend adding an instruction prefix
+ # to help the model retrieve passages more accurately.
+ from sentence_transformers import SentenceTransformer
+ queries = ['query_1', 'query_2']
+ passages = ["doc_1", "doc_2"]
+
+ model = SentenceTransformer('sensenova/piccolo-base-zh')
+ q_embeddings = model.encode(["查询:" + q for q in queries], normalize_embeddings=True)  # prefix means "query:"
+ p_embeddings = model.encode(["结果:" + p for p in passages], normalize_embeddings=True)  # prefix means "result:"
+ scores = q_embeddings @ p_embeddings.T
+
+
+ ```
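
Reviewer note: the added snippet stops at the score matrix. Because the embeddings are L2-normalized, each dot product is a cosine similarity, so ranking passages per query is just a sort over a row of scores. The following is a minimal, dependency-free sketch of that last step; it is not part of the commit. The helper names (`l2_normalize`, `dot`, `rank_passages`) and the toy 2-D vectors are hypothetical stand-ins for real `model.encode(...)` output.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length, as normalize_embeddings=True does for model output."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_passages(q_embeddings, p_embeddings, top_k=1):
    """For each query, return the indices of the top_k passages, best first."""
    rankings = []
    for q in q_embeddings:
        scores = [dot(q, p) for p in p_embeddings]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        rankings.append(order[:top_k])
    return rankings

# Toy vectors standing in for real embeddings (hypothetical values).
q_embeddings = [l2_normalize(v) for v in ([1.0, 0.0], [0.0, 2.0])]
p_embeddings = [l2_normalize(v) for v in ([0.1, 3.0], [5.0, 0.5])]

rankings = rank_passages(q_embeddings, p_embeddings, top_k=1)
print(rankings)
```

With real embeddings, the same ranking could be taken directly from each row of `scores` in the README snippet; the pure-Python version is used here only so the sketch runs without downloading the model.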