aspire committed on
Commit
159763b
1 Parent(s): 8314dbd

Upload README.md

Files changed (1)
  1. README.md +11 -2
README.md CHANGED
@@ -1062,6 +1062,12 @@ model-index:
 
  ## acge model
 
  acge is a general-purpose text encoding model that produces variable-length embeddings. It uses [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147), as illustrated below:
 
  ![matryoshka-small](./img/matryoshka-small.gif)
@@ -1179,7 +1185,7 @@ print(similarity)
  Usage with the sentence-transformers library, selecting different dimensions:
 
  ```python
- import torch
  from sentence_transformers import SentenceTransformer
 
  sentences = ["数据1", "数据2"]
@@ -1187,8 +1193,11 @@ model = SentenceTransformer('acge_text_embedding')
  embeddings = model.encode(sentences, normalize_embeddings=False)
  matryoshka_dim = 1024
  embeddings = embeddings[..., :matryoshka_dim] # Shrink the embedding dimensions
- embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)
  print(embeddings.shape)
  # => (2, 1024)
 
  ```
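 
  ## acge model
 
+ ![logo](./img/logo.png)
+
+ The acge model comes from the technology team at [INTSIG (合合信息)](https://www.intsig.com/), whose public technology trial platform is [TextIn](https://www.textin.com/). INTSIG is an industry-leading artificial intelligence and big data technology company. It provides innovative digital and intelligent services to enterprises and individual users worldwide through its core technologies in intelligent text recognition and commercial big data, its consumer-facing and business-facing products, and its industry solutions.
+
+ For technical discussion, contact [yanhui](yanhui_he@intsig.net); for business cooperation, contact [simon](simon_liu@intsig.net). You can also [click the image](https://huggingface.co/aspire/acge_text_embedding/img/wx.jpg) and scan the QR code to join our WeChat community.
+
  acge is a general-purpose text encoding model that produces variable-length embeddings. It uses [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147), as illustrated below:
 
  ![matryoshka-small](./img/matryoshka-small.gif)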
 
  Usage with the sentence-transformers library, selecting different dimensions:
 
  ```python
+ from sklearn.preprocessing import normalize
  from sentence_transformers import SentenceTransformer
 
  sentences = ["数据1", "数据2"]
 
  embeddings = model.encode(sentences, normalize_embeddings=False)
  matryoshka_dim = 1024
  embeddings = embeddings[..., :matryoshka_dim] # Shrink the embedding dimensions
+ embeddings = normalize(embeddings, norm="l2", axis=1)
  print(embeddings.shape)
  # => (2, 1024)
 
  ```
+
+
+
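
For reference, the truncate-then-normalize step that this change touches can be sketched without any dependencies. The snippet is only an illustration under stated assumptions: `truncate_and_normalize` is a hypothetical helper, the input values are made up rather than real model output, and it works on plain Python lists instead of the NumPy array that `model.encode` returns by default.

```python
import math


def truncate_and_normalize(embeddings, dim):
    """Keep the first `dim` components of each row, then rescale the row to unit L2 norm.

    Hypothetical helper for illustration only; not part of the README or any library.
    """
    result = []
    for row in embeddings:
        head = list(row[:dim])  # Matryoshka-style truncation: keep the leading dimensions
        norm = math.sqrt(sum(x * x for x in head))
        # Leave all-zero rows unchanged to avoid dividing by zero
        result.append([x / norm for x in head] if norm > 0 else head)
    return result


# Toy 4-dimensional "embeddings" (made-up values, not real model output)
toy = [[3.0, 4.0, 12.0, 0.0], [1.0, 0.0, 0.0, 5.0]]
shrunk = truncate_and_normalize(toy, dim=2)
print(shrunk)
# => [[0.6, 0.8], [1.0, 0.0]]
```

Because each truncated row is rescaled to unit length, a dot product between two rows is again their cosine similarity at the reduced dimension.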