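News
[2024-04-06] Open-sourced the puff model series, built specifically for retrieval and semantic matching, with more attention to generalization and to results on private general-purpose test sets. Variable vector dimensions; bilingual (Chinese and English).
[2024-02-27] Open-sourced stella-mrl-large-zh-v3.5-1792d, which supports variable vector dimensions.
[2024-02-17] Open-sourced the stella v3 series, the dialogue encoding model, and the related training data.
[2023-10-19] Open-sourced stella-base-en-v2. It is simple to use and needs no prefix text.
[2023-10-12] Open-sourced stella-base-zh-v2 and stella-large-zh-v2, which perform better and are simple to use: no prefix text is needed.
[2023-09-11] Open-sourced stella-base-zh and stella-large-zh.
You are welcome to visit my profile page for the latest models, and your feedback is appreciated!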
1 Open-source model
This release open-sources stella-mrl-large-zh-v3.5-1792d, which was trained from stella-large-zh-v3-1792d using the MRL method. Its main feature is a variable vector dimension.
2 Usage
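For readers unfamiliar with MRL (Matryoshka Representation Learning), the general idea is to apply the training loss to several nested prefixes of the same embedding, so that the first n dimensions are useful on their own. The following is only a rough sketch of that idea, not the training code of this model: the in-batch contrastive loss, the temperature, the prefix sizes, and the function names `info_nce` and `matryoshka_loss` are all assumptions made for illustration.

```python
import numpy as np

def info_nce(queries: np.ndarray, docs: np.ndarray, temperature: float = 0.05) -> float:
    """In-batch contrastive loss: row i of `queries` should match row i of `docs`."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    logits = q @ d.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(queries, docs, dims=(128, 256, 512, 1024, 1792)) -> float:
    """Average the same loss over nested prefixes of the embedding (hypothetical prefix sizes)."""
    return sum(info_nce(queries[:, :d], docs[:, :d]) for d in dims) / len(dims)

# Toy data standing in for encoder outputs: each doc is a noisy copy of its query.
rng = np.random.default_rng(0)
queries = rng.normal(size=(8, 1792))
docs = queries + 0.1 * rng.normal(size=(8, 1792))

matched = matryoshka_loss(queries, docs)
mismatched = matryoshka_loss(queries, np.roll(docs, 1, axis=0))  # pairs deliberately misaligned
```

Because every prefix contributes to the loss, a model trained this way is pushed to place the most useful information in the leading dimensions, which is what makes truncation at inference time workable.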
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize
model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")
# Do not normalize yet! Take the first n dimensions first, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape) # shape is [2,1792]
# A larger n_dims gives better results but costs more time and memory. Multiples of 128 are recommended, because the model was trained that way.
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
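The order of operations in the snippet above matters: truncating a vector that is already unit length leaves it shorter than unit length, so dot products no longer equal cosine similarities. The sketch below illustrates this with NumPy only. It uses random vectors as a stand-in for the output of `model.encode`, and the helper name `truncate_and_normalize` is hypothetical, not part of any library.

```python
import numpy as np

def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims columns, then scale each row to unit L2 norm."""
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    return cut / np.clip(norms, 1e-12, None)  # guard against division by zero

# Stand-in for model.encode(..., normalize_embeddings=False): random [2, 1792] vectors.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 1792)).astype(np.float32)

cut_vecs = truncate_and_normalize(vectors, 768)
# Rows are unit length, so the dot product is the cosine similarity.
similarity = float(cut_vecs[0] @ cut_vecs[1])

# Normalizing first and truncating afterwards leaves rows shorter than unit length.
wrong_order = truncate_and_normalize(vectors, 1792)[:, :768]
```

With unit-length rows, `cut_vecs @ cut_vecs.T` can be used directly as a cosine similarity matrix.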
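3 CMTEB scores at different vector dimensions
stella-mrl-large-zh-v3.5-1792d_1024 means that the first 1024 dimensions are used. The overall trend is that larger dimensions score better, although the gains are small beyond about 768 dimensions (68.47 at 768 vs. 68.56 at 1792).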
Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | CMTEB-Score |
---|---|---|---|---|---|---|---|
stella-mrl-large-zh-v3.5-1792d_128 | 70.01 | 62.17 | 87.99 | 70.67 | 66.77 | 53.55 | 67.16 |
stella-mrl-large-zh-v3.5-1792d_256 | 72.19 | 62.41 | 88.09 | 71.22 | 68.32 | 53.38 | 68.02 |
stella-mrl-large-zh-v3.5-1792d_384 | 72.77 | 62.43 | 88.26 | 71.34 | 68.31 | 53.87 | 68.25 |
stella-mrl-large-zh-v3.5-1792d_512 | 73.11 | 62.45 | 88.16 | 71.46 | 68.32 | 53.28 | 68.29 |
stella-mrl-large-zh-v3.5-1792d_640 | 73.27 | 62.49 | 88.21 | 71.46 | 68.69 | 53.63 | 68.42 |
stella-mrl-large-zh-v3.5-1792d_768 | 73.38 | 62.5 | 88.19 | 71.49 | 68.64 | 53.77 | 68.47 |
stella-mrl-large-zh-v3.5-1792d_896 | 73.37 | 62.5 | 88.14 | 71.51 | 68.44 | 54.13 | 68.49 |
stella-mrl-large-zh-v3.5-1792d_1024 | 73.43 | 62.51 | 88.16 | 71.52 | 68.59 | 53.43 | 68.44 |
stella-mrl-large-zh-v3.5-1792d_1152 | 73.46 | 62.49 | 88.16 | 71.57 | 68.55 | 53.67 | 68.49 |
stella-mrl-large-zh-v3.5-1792d_1280 | 73.48 | 62.51 | 88.12 | 71.55 | 68.44 | 53.74 | 68.48 |
stella-mrl-large-zh-v3.5-1792d_1408 | 73.48 | 62.51 | 88.14 | 71.58 | 68.46 | 53.69 | 68.48 |
stella-mrl-large-zh-v3.5-1792d_1536 | 73.49 | 62.5 | 88.11 | 71.55 | 68.5 | 54.06 | 68.52 |
stella-mrl-large-zh-v3.5-1792d_1664 | 73.56 | 62.49 | 88.06 | 71.56 | 68.47 | 54.28 | 68.56 |
stella-mrl-large-zh-v3.5-1792d_1792 | 73.51 | 62.48 | 88.09 | 71.56 | 68.45 | 54.39 | 68.56 |
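In the table above, stella-mrl-large-zh-v3.5-1792d_1792 scores 68.56, whereas the leaderboard lists 68.55. The discrepancy is related to the weight type; such a small difference can be ignored.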
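To make the trade-off in the table concrete, the following sketch sets raw storage cost against the CMTEB-Score for a few dimensions. The scores are copied from the table; the float32 storage format and the corpus size of one million vectors are assumptions chosen for illustration, and the figures ignore any index overhead.

```python
# CMTEB-Score values copied from the table above, keyed by vector dimension.
cmteb_score = {128: 67.16, 256: 68.02, 512: 68.29, 768: 68.47, 1024: 68.44, 1792: 68.56}

BYTES_PER_FLOAT32 = 4    # assumption: vectors are stored as float32
NUM_VECTORS = 1_000_000  # assumption: illustrative corpus size

def index_size_gb(n_dims: int, num_vectors: int = NUM_VECTORS) -> float:
    """Raw storage for num_vectors float32 embeddings of n_dims values, in GB (10^9 bytes)."""
    return n_dims * BYTES_PER_FLOAT32 * num_vectors / 1e9

full = cmteb_score[1792]
for n_dims, score in cmteb_score.items():
    size = index_size_gb(n_dims)
    print(f"{n_dims:>5} dims: {size:.3f} GB, score {score:.2f} ({score - full:+.2f} vs 1792)")
```

Under these assumptions, 768 dimensions need well under half the storage of the full 1792 dimensions while giving up 0.09 points of CMTEB-Score.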
Evaluation results
- cos_sim_pearson on MTEB AFQMC validation set (self-reported): 54.338
- cos_sim_spearman on MTEB AFQMC validation set (self-reported): 58.855
- euclidean_pearson on MTEB AFQMC validation set (self-reported): 57.570
- euclidean_spearman on MTEB AFQMC validation set (self-reported): 58.855
- manhattan_pearson on MTEB AFQMC validation set (self-reported): 57.559
- manhattan_spearman on MTEB AFQMC validation set (self-reported): 58.845
- cos_sim_pearson on MTEB ATEC test set (self-reported): 54.220
- cos_sim_spearman on MTEB ATEC test set (self-reported): 58.080
- euclidean_pearson on MTEB ATEC test set (self-reported): 61.646
- euclidean_spearman on MTEB ATEC test set (self-reported): 58.080