News
- [2024-04-06] Open-sourced the puff series of models, designed for retrieval and semantic-matching tasks with an emphasis on generalization and performance on private general-purpose test sets; variable vector dimensions, bilingual (Chinese and English).
- [2024-02-27] Open-sourced the stella-mrl-large-zh-v3.5-1792d model, which supports variable vector dimensions.
- [2024-02-17] Open-sourced the stella v3 series, a dialogue encoding model, and the related training data.
- [2023-10-19] Open-sourced stella-base-en-v2: simple to use, no prefix text required.
- [2023-10-12] Open-sourced stella-base-zh-v2 and stella-large-zh-v2: better performance and simple to use, no prefix text required.
- [2023-09-11] Open-sourced stella-base-zh and stella-large-zh.

You are welcome to visit my profile page for the latest models; your feedback is much appreciated!
1 Open-Source Model
This release open-sources the stella-mrl-large-zh-v3.5-1792d model, which was trained on top of stella-large-zh-v3-1792d using the MRL (Matryoshka Representation Learning) method. Its main feature is a variable vector dimension.
2 Usage

```python
from sentence_transformers import SentenceTransformer
from sklearn.preprocessing import normalize

model = SentenceTransformer("infgrad/stella-mrl-large-zh-v3.5-1792d")

# Note: do NOT normalize here! Truncate to the first n dimensions first, then normalize.
vectors = model.encode(["text1", "text2"], normalize_embeddings=False)
print(vectors.shape)  # (2, 1792)

# Larger n_dims gives better quality but costs more time and space.
# Prefer multiples of 128, since that is how the model was trained.
n_dims = 768
cut_vecs = normalize(vectors[:, :n_dims])
```
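The truncate-then-normalize ordering can be illustrated without loading the model. A minimal sketch using NumPy, with random vectors standing in for the encoder's output (the shapes and `n_dims` follow the snippet above; everything else is illustrative):

```python
import numpy as np

def truncate_and_normalize(vectors: np.ndarray, n_dims: int) -> np.ndarray:
    """Keep the first n_dims components of each row, then L2-normalize."""
    cut = vectors[:, :n_dims]
    norms = np.linalg.norm(cut, axis=1, keepdims=True)
    return cut / norms

rng = np.random.default_rng(0)
vectors = rng.normal(size=(2, 1792))  # stand-in for model.encode(...) output

cut_vecs = truncate_and_normalize(vectors, n_dims=768)
print(cut_vecs.shape)                    # (2, 768)
print(np.linalg.norm(cut_vecs, axis=1))  # each row now has unit norm

# Normalizing BEFORE truncation would leave non-unit rows after the cut,
# which is why the model card says to normalize last:
wrong = (vectors / np.linalg.norm(vectors, axis=1, keepdims=True))[:, :768]
print(np.linalg.norm(wrong, axis=1))     # strictly below 1.0
```

With unit-norm truncated vectors, cosine similarity reduces to a plain dot product, so the same downstream retrieval code works at any chosen dimension.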
3 CMTEB Scores at Different Vector Dimensions
stella-mrl-large-zh-v3.5-1792d_1024 denotes taking the first 1024 dimensions. The overall trend is that larger dimensions yield better scores.
Model | Retrieval | STS | PairClassification | Classification | Reranking | Clustering | CMTEB-Score |
---|---|---|---|---|---|---|---|
stella-mrl-large-zh-v3.5-1792d_128 | 70.01 | 62.17 | 87.99 | 70.67 | 66.77 | 53.55 | 67.16 |
stella-mrl-large-zh-v3.5-1792d_256 | 72.19 | 62.41 | 88.09 | 71.22 | 68.32 | 53.38 | 68.02 |
stella-mrl-large-zh-v3.5-1792d_384 | 72.77 | 62.43 | 88.26 | 71.34 | 68.31 | 53.87 | 68.25 |
stella-mrl-large-zh-v3.5-1792d_512 | 73.11 | 62.45 | 88.16 | 71.46 | 68.32 | 53.28 | 68.29 |
stella-mrl-large-zh-v3.5-1792d_640 | 73.27 | 62.49 | 88.21 | 71.46 | 68.69 | 53.63 | 68.42 |
stella-mrl-large-zh-v3.5-1792d_768 | 73.38 | 62.5 | 88.19 | 71.49 | 68.64 | 53.77 | 68.47 |
stella-mrl-large-zh-v3.5-1792d_896 | 73.37 | 62.5 | 88.14 | 71.51 | 68.44 | 54.13 | 68.49 |
stella-mrl-large-zh-v3.5-1792d_1024 | 73.43 | 62.51 | 88.16 | 71.52 | 68.59 | 53.43 | 68.44 |
stella-mrl-large-zh-v3.5-1792d_1152 | 73.46 | 62.49 | 88.16 | 71.57 | 68.55 | 53.67 | 68.49 |
stella-mrl-large-zh-v3.5-1792d_1280 | 73.48 | 62.51 | 88.12 | 71.55 | 68.44 | 53.74 | 68.48 |
stella-mrl-large-zh-v3.5-1792d_1408 | 73.48 | 62.51 | 88.14 | 71.58 | 68.46 | 53.69 | 68.48 |
stella-mrl-large-zh-v3.5-1792d_1536 | 73.49 | 62.5 | 88.11 | 71.55 | 68.5 | 54.06 | 68.52 |
stella-mrl-large-zh-v3.5-1792d_1664 | 73.56 | 62.49 | 88.06 | 71.56 | 68.47 | 54.28 | 68.56 |
stella-mrl-large-zh-v3.5-1792d_1792 | 73.51 | 62.48 | 88.09 | 71.56 | 68.45 | 54.39 | 68.56 |
In the table above, the score of 68.56 for stella-mrl-large-zh-v3.5-1792d_1792 differs from the leaderboard score of 68.55; the discrepancy is due to the weight data type, so please disregard this small difference.
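To weigh the score differences in the table against resource cost, here is a rough storage sketch (float32 embeddings assumed; the 10M-document corpus size is purely illustrative):

```python
# Storage cost per embedding at a few MRL dimensions (float32 = 4 bytes).
BYTES_PER_FLOAT32 = 4
N_DOCS = 10_000_000  # hypothetical corpus size

for n_dims in (256, 768, 1024, 1792):
    per_vec = n_dims * BYTES_PER_FLOAT32
    corpus_gib = per_vec * N_DOCS / 1024**3
    print(f"{n_dims:>5} dims: {per_vec:>5} B/vector, ~{corpus_gib:.1f} GiB total")
```

At 768 dimensions, for example, the CMTEB score (68.47) is already within 0.1 of the full 1792-dimension score (68.56) while using less than half the storage.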
Evaluation results (self-reported, MTEB)

Metric | Dataset | Split | Score |
---|---|---|---|
cos_sim_pearson | AFQMC | validation | 54.338 |
cos_sim_spearman | AFQMC | validation | 58.855 |
euclidean_pearson | AFQMC | validation | 57.570 |
euclidean_spearman | AFQMC | validation | 58.855 |
manhattan_pearson | AFQMC | validation | 57.559 |
manhattan_spearman | AFQMC | validation | 58.845 |
cos_sim_pearson | ATEC | test | 54.220 |
cos_sim_spearman | ATEC | test | 58.080 |
euclidean_pearson | ATEC | test | 61.646 |
euclidean_spearman | ATEC | test | 58.080 |
manhattan_pearson | ATEC | test | 61.645 |
manhattan_spearman | ATEC | test | 58.081 |
accuracy | AmazonReviewsClassification (zh) | test | 46.594 |
f1 | AmazonReviewsClassification (zh) | test | 44.732 |
cos_sim_pearson | BQ | test | 69.168 |
cos_sim_spearman | BQ | test | 71.048 |
euclidean_pearson | BQ | test | 69.951 |
euclidean_spearman | BQ | test | 71.048 |
manhattan_pearson | BQ | test | 69.925 |
manhattan_spearman | BQ | test | 71.026 |
v_measure | CLSClusteringP2P | test | 43.032 |
v_measure | CLSClusteringS2S | test | 40.416 |
map | CMedQAv1 | test | 89.335 |
mrr | CMedQAv1 | test | 91.346 |
map | CMedQAv2 | test | 89.178 |
mrr | CMedQAv2 | test | 91.096 |
map_at_1 | CmedqaRetrieval | - | 26.809 |
map_at_10 | CmedqaRetrieval | - | 39.906 |
map_at_100 | CmedqaRetrieval | - | 41.858 |
map_at_1000 | CmedqaRetrieval | - | 41.954 |
map_at_3 | CmedqaRetrieval | - | 35.435 |
map_at_5 | CmedqaRetrieval | - | 37.978 |
mrr_at_1 | CmedqaRetrieval | - | 40.660 |
mrr_at_10 | CmedqaRetrieval | - | 48.787 |
mrr_at_100 | CmedqaRetrieval | - | 49.796 |
mrr_at_1000 | CmedqaRetrieval | - | 49.832 |
mrr_at_3 | CmedqaRetrieval | - | 46.166 |
mrr_at_5 | CmedqaRetrieval | - | 47.675 |
ndcg_at_1 | CmedqaRetrieval | - | 40.660 |
ndcg_at_10 | CmedqaRetrieval | - | 46.614 |
ndcg_at_100 | CmedqaRetrieval | - | 54.037 |
ndcg_at_1000 | CmedqaRetrieval | - | 55.654 |
ndcg_at_3 | CmedqaRetrieval | - | 41.032 |
ndcg_at_5 | CmedqaRetrieval | - | 43.465 |
precision_at_1 | CmedqaRetrieval | - | 40.660 |
precision_at_10 | CmedqaRetrieval | - | 10.350 |
precision_at_100 | CmedqaRetrieval | - | 1.634 |
precision_at_1000 | CmedqaRetrieval | - | 0.184 |
precision_at_3 | CmedqaRetrieval | - | 23.122 |
precision_at_5 | CmedqaRetrieval | - | 16.944 |
recall_at_1 | CmedqaRetrieval | - | 26.809 |
recall_at_10 | CmedqaRetrieval | - | 57.474 |
recall_at_100 | CmedqaRetrieval | - | 87.976 |
recall_at_1000 | CmedqaRetrieval | - | 98.742 |
recall_at_3 | CmedqaRetrieval | - | 40.819 |
recall_at_5 | CmedqaRetrieval | - | 48.175 |
cos_sim_accuracy | Cmnli | validation | 83.500 |
cos_sim_ap | Cmnli | validation | 90.662 |
cos_sim_f1 | Cmnli | validation | 84.391 |
cos_sim_precision | Cmnli | validation | 79.537 |
cos_sim_recall | Cmnli | validation | 89.876 |
dot_accuracy | Cmnli | validation | 83.500 |
dot_ap | Cmnli | validation | 90.647 |
dot_f1 | Cmnli | validation | 84.391 |
dot_precision | Cmnli | validation | 79.537 |
dot_recall | Cmnli | validation | 89.876 |
euclidean_accuracy | Cmnli | validation | 83.500 |
euclidean_ap | Cmnli | validation | 90.662 |
euclidean_f1 | Cmnli | validation | 84.391 |
euclidean_precision | Cmnli | validation | 79.537 |
euclidean_recall | Cmnli | validation | 89.876 |
manhattan_accuracy | Cmnli | validation | 83.355 |
manhattan_ap | Cmnli | validation | 90.645 |
manhattan_f1 | Cmnli | validation | 84.375 |
manhattan_precision | Cmnli | validation | 80.561 |
manhattan_recall | Cmnli | validation | 88.567 |
max_accuracy | Cmnli | validation | 83.500 |
max_ap | Cmnli | validation | 90.662 |
max_f1 | Cmnli | validation | 84.391 |