News
[2024-04-22]
piccolo-large-zh-v2 currently ranks first on the C-MTEB leaderboard, leading the previously top-ranked BERT-based model by about 1.9 points.
piccolo-large-zh-v2
piccolo-large-zh-v2 is a general-purpose Chinese embedding model trained by the general model group at SenseTime Research. This Piccolo upgrade focuses on general-purpose downstream fine-tuning approaches. We will release an updated technical report in the near future, and further technical details will be disclosed at the SenseTime Tech Day on April 23rd: https://www.sensetime.com/cn
Usage
For now, the model can only be accessed through the API: https://platform.sensenova.cn/doc?path=/chat/Embeddings/Embeddings.md
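As a rough illustration, calling an embeddings endpoint of this kind usually means POSTing a list of texts and reading one vector per input back from the JSON response. The endpoint URL, field names (`input`, `embedding`, `data`), and bearer-token auth header below are hypothetical placeholders, not the documented SenseNova API; consult the linked documentation for the actual request schema.

```python
import json
import urllib.request

# Hypothetical endpoint; the real URL and schema are in the SenseNova docs.
API_URL = "https://api.sensenova.cn/v1/embeddings"

def build_request(texts, api_key, model="piccolo-large-zh-v2"):
    """Build an HTTP request for a hypothetical embeddings endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
        },
    )

def embed(texts, api_key):
    """POST the texts and return one embedding vector per input.

    The `data` / `embedding` response fields are assumptions, not the
    documented response format.
    """
    with urllib.request.urlopen(build_request(texts, api_key)) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]
```

Until the request schema is confirmed against the documentation, treat this purely as a sketch of the overall shape of the call.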
Evaluation results
All scores are self-reported C-MTEB results.
- cos_sim_pearson on MTEB AFQMC (validation set): 56.761
- cos_sim_spearman on MTEB AFQMC (validation set): 61.493
- euclidean_pearson on MTEB AFQMC (validation set): 59.145
- euclidean_spearman on MTEB AFQMC (validation set): 60.636
- manhattan_pearson on MTEB AFQMC (validation set): 59.147
- manhattan_spearman on MTEB AFQMC (validation set): 60.635
- cos_sim_pearson on MTEB ATEC (test set): 56.217
- cos_sim_spearman on MTEB ATEC (test set): 59.198
- euclidean_pearson on MTEB ATEC (test set): 62.378
- euclidean_spearman on MTEB ATEC (test set): 58.794
- manhattan_pearson on MTEB ATEC (test set): 62.370
- manhattan_spearman on MTEB ATEC (test set): 58.792
- accuracy on MTEB AmazonReviewsClassification (zh) (test set): 49.440
- f1 on MTEB AmazonReviewsClassification (zh) (test set): 46.674
- cos_sim_pearson on MTEB BQ (test set): 70.990
- cos_sim_spearman on MTEB BQ (test set): 72.876
- euclidean_pearson on MTEB BQ (test set): 71.177
- euclidean_spearman on MTEB BQ (test set): 72.504
- manhattan_pearson on MTEB BQ (test set): 71.173
- manhattan_spearman on MTEB BQ (test set): 72.498
- v_measure on MTEB CLSClusteringP2P (test set): 57.927
- v_measure on MTEB CLSClusteringS2S (test set): 48.097
- map on MTEB CMedQAv1 (test set): 89.310
- mrr on MTEB CMedQAv1 (test set): 91.381
- map on MTEB CMedQAv2 (test set): 90.138
- mrr on MTEB CMedQAv2 (test set): 92.143
- map_at_1 on MTEB CmedqaRetrieval: 26.931
- map_at_10 on MTEB CmedqaRetrieval: 40.647
- map_at_100 on MTEB CmedqaRetrieval: 42.519
- map_at_1000 on MTEB CmedqaRetrieval: 42.616
- map_at_3 on MTEB CmedqaRetrieval: 36.145
- map_at_5 on MTEB CmedqaRetrieval: 38.717
- mrr_at_1 on MTEB CmedqaRetrieval: 40.935
- mrr_at_10 on MTEB CmedqaRetrieval: 49.684
- mrr_at_100 on MTEB CmedqaRetrieval: 50.598
- mrr_at_1000 on MTEB CmedqaRetrieval: 50.633
- mrr_at_3 on MTEB CmedqaRetrieval: 47.070
- mrr_at_5 on MTEB CmedqaRetrieval: 48.490
- ndcg_at_1 on MTEB CmedqaRetrieval: 40.935
- ndcg_at_10 on MTEB CmedqaRetrieval: 47.584
- ndcg_at_100 on MTEB CmedqaRetrieval: 54.692
- ndcg_at_1000 on MTEB CmedqaRetrieval: 56.314
- ndcg_at_3 on MTEB CmedqaRetrieval: 41.973
- ndcg_at_5 on MTEB CmedqaRetrieval: 44.334
- precision_at_1 on MTEB CmedqaRetrieval: 40.935
- precision_at_10 on MTEB CmedqaRetrieval: 10.585
- precision_at_100 on MTEB CmedqaRetrieval: 1.637
- precision_at_1000 on MTEB CmedqaRetrieval: 0.184
- precision_at_3 on MTEB CmedqaRetrieval: 23.881
- precision_at_5 on MTEB CmedqaRetrieval: 17.399
- recall_at_1 on MTEB CmedqaRetrieval: 26.931
- recall_at_10 on MTEB CmedqaRetrieval: 59.006
- recall_at_100 on MTEB CmedqaRetrieval: 88.247
- recall_at_1000 on MTEB CmedqaRetrieval: 99.045
- recall_at_3 on MTEB CmedqaRetrieval: 42.064
- recall_at_5 on MTEB CmedqaRetrieval: 49.266
- cos_sim_accuracy on MTEB Cmnli (validation set): 86.085
- cos_sim_ap on MTEB Cmnli (validation set): 92.644
- cos_sim_f1 on MTEB Cmnli (validation set): 86.900
- cos_sim_precision on MTEB Cmnli (validation set): 84.114
- cos_sim_recall on MTEB Cmnli (validation set): 89.876
- dot_accuracy on MTEB Cmnli (validation set): 72.664
- dot_ap on MTEB Cmnli (validation set): 81.053
- dot_f1 on MTEB Cmnli (validation set): 75.199
- dot_precision on MTEB Cmnli (validation set): 67.491
- dot_recall on MTEB Cmnli (validation set): 84.896
- euclidean_accuracy on MTEB Cmnli (validation set): 85.520
- euclidean_ap on MTEB Cmnli (validation set): 91.906
- euclidean_f1 on MTEB Cmnli (validation set): 86.264
- euclidean_precision on MTEB Cmnli (validation set): 82.207
- euclidean_recall on MTEB Cmnli (validation set): 90.741
- manhattan_accuracy on MTEB Cmnli (validation set): 85.484
- manhattan_ap on MTEB Cmnli (validation set): 91.897
- manhattan_f1 on MTEB Cmnli (validation set): 86.204
- manhattan_precision on MTEB Cmnli (validation set): 84.325
- manhattan_recall on MTEB Cmnli (validation set): 88.169
- max_accuracy on MTEB Cmnli (validation set): 86.085
- max_ap on MTEB Cmnli (validation set): 92.644
- max_f1 on MTEB Cmnli (validation set): 86.900
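For reference, the cos_sim_pearson / cos_sim_spearman numbers above are standard STS-style metrics: cosine similarity is computed between the embeddings of each sentence pair, and those scores are then correlated with the gold similarity labels. A minimal pure-Python sketch of that computation (tie handling in the rank step is omitted for brevity; real evaluations use a library implementation):

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pearson(x, y):
    """Pearson correlation between predicted scores and gold labels."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation: Pearson over ranks (ties not handled here)."""
    rank = lambda s: {v: i for i, v in enumerate(sorted(s))}
    rx, ry = rank(x), rank(y)
    return pearson([rx[v] for v in x], [ry[v] for v in y])
```

With `x` the per-pair cosine similarities and `y` the gold labels, `pearson(x, y)` and `spearman(x, y)` correspond to the cos_sim_pearson and cos_sim_spearman scores reported above.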