--- pipeline_tag: sentence-similarity tags: - finetuner - mteb - sentence-transformers - feature-extraction - sentence-similarity datasets: - jinaai/negation-dataset language: en license: apache-2.0 model-index: - name: jina-embedding-s-en-v1 results: - task: type: Classification dataset: type: mteb/amazon_counterfactual name: MTEB AmazonCounterfactualClassification (en) config: en split: test revision: e8379541af4e31359cca9fbcf4b00f2671dba205 metrics: - type: accuracy value: 64.82089552238806 - type: ap value: 27.100981946230778 - type: f1 value: 58.3354886367184 - task: type: Classification dataset: type: mteb/amazon_polarity name: MTEB AmazonPolarityClassification config: default split: test revision: e2d317d38cd51312af73b3d32a06d1a08b442046 metrics: - type: accuracy value: 64.282775 - type: ap value: 60.350688924943796 - type: f1 value: 62.06346948494396 - task: type: Classification dataset: type: mteb/amazon_reviews_multi name: MTEB AmazonReviewsClassification (en) config: en split: test revision: 1399c76144fd37290681b995c656ef9b2e06e26d metrics: - type: accuracy value: 30.623999999999995 - type: f1 value: 29.427789186742153 - task: type: Retrieval dataset: type: arguana name: MTEB ArguAna config: default split: test revision: None metrics: - type: map_at_1 value: 22.119 - type: map_at_10 value: 35.609 - type: map_at_100 value: 36.935 - type: map_at_1000 value: 36.957 - type: map_at_3 value: 31.046000000000003 - type: map_at_5 value: 33.574 - type: mrr_at_1 value: 22.404 - type: mrr_at_10 value: 35.695 - type: mrr_at_100 value: 37.021 - type: mrr_at_1000 value: 37.043 - type: mrr_at_3 value: 31.093 - type: mrr_at_5 value: 33.635999999999996 - type: ndcg_at_1 value: 22.119 - type: ndcg_at_10 value: 43.566 - type: ndcg_at_100 value: 49.370000000000005 - type: ndcg_at_1000 value: 49.901 - type: ndcg_at_3 value: 34.06 - type: ndcg_at_5 value: 38.653999999999996 - type: precision_at_1 value: 22.119 - type: precision_at_10 value: 6.92 - type: 
precision_at_100 value: 0.95 - type: precision_at_1000 value: 0.099 - type: precision_at_3 value: 14.272000000000002 - type: precision_at_5 value: 10.811 - type: recall_at_1 value: 22.119 - type: recall_at_10 value: 69.203 - type: recall_at_100 value: 95.021 - type: recall_at_1000 value: 99.075 - type: recall_at_3 value: 42.817 - type: recall_at_5 value: 54.054 - task: type: Clustering dataset: type: mteb/arxiv-clustering-p2p name: MTEB ArxivClusteringP2P config: default split: test revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d metrics: - type: v_measure value: 34.1740289109719 - task: type: Clustering dataset: type: mteb/arxiv-clustering-s2s name: MTEB ArxivClusteringS2S config: default split: test revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53 metrics: - type: v_measure value: 23.985251383455463 - task: type: Reranking dataset: type: mteb/askubuntudupquestions-reranking name: MTEB AskUbuntuDupQuestions config: default split: test revision: 2000358ca161889fa9c082cb41daa8dcfb161a54 metrics: - type: map value: 60.24873612289029 - type: mrr value: 74.65692740623489 - task: type: STS dataset: type: mteb/biosses-sts name: MTEB BIOSSES config: default split: test revision: d3fb88f8f02e40887cd149695127462bbcf29b4a metrics: - type: cos_sim_pearson value: 86.22415390332444 - type: cos_sim_spearman value: 82.9591191954711 - type: euclidean_pearson value: 44.096317524324945 - type: euclidean_spearman value: 42.95218351391625 - type: manhattan_pearson value: 44.07766490545065 - type: manhattan_spearman value: 42.78350497166606 - task: type: Classification dataset: type: mteb/banking77 name: MTEB Banking77Classification config: default split: test revision: 0fd18e25b25c072e09e0d92ab615fda904d66300 metrics: - type: accuracy value: 74.64285714285714 - type: f1 value: 73.53680835577447 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-p2p name: MTEB BiorxivClusteringP2P config: default split: test revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40 metrics: 
- type: v_measure value: 28.512813238490164 - task: type: Clustering dataset: type: mteb/biorxiv-clustering-s2s name: MTEB BiorxivClusteringS2S config: default split: test revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908 metrics: - type: v_measure value: 20.942214972649488 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackAndroidRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 28.255999999999997 - type: map_at_10 value: 37.091 - type: map_at_100 value: 38.428000000000004 - type: map_at_1000 value: 38.559 - type: map_at_3 value: 34.073 - type: map_at_5 value: 35.739 - type: mrr_at_1 value: 34.907 - type: mrr_at_10 value: 42.769 - type: mrr_at_100 value: 43.607 - type: mrr_at_1000 value: 43.656 - type: mrr_at_3 value: 39.986 - type: mrr_at_5 value: 41.581 - type: ndcg_at_1 value: 34.907 - type: ndcg_at_10 value: 42.681000000000004 - type: ndcg_at_100 value: 48.213 - type: ndcg_at_1000 value: 50.464 - type: ndcg_at_3 value: 37.813 - type: ndcg_at_5 value: 39.936 - type: precision_at_1 value: 34.907 - type: precision_at_10 value: 7.911 - type: precision_at_100 value: 1.349 - type: precision_at_1000 value: 0.184 - type: precision_at_3 value: 17.93 - type: precision_at_5 value: 12.732 - type: recall_at_1 value: 28.255999999999997 - type: recall_at_10 value: 53.49699999999999 - type: recall_at_100 value: 77.288 - type: recall_at_1000 value: 91.776 - type: recall_at_3 value: 39.18 - type: recall_at_5 value: 45.365 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackEnglishRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 25.563999999999997 - type: map_at_10 value: 33.913 - type: map_at_100 value: 34.966 - type: map_at_1000 value: 35.104 - type: map_at_3 value: 31.413000000000004 - type: map_at_5 value: 32.854 - type: mrr_at_1 value: 31.72 - type: mrr_at_10 value: 39.391 - type: mrr_at_100 value: 40.02 - type: mrr_at_1000 value: 40.076 - type: 
mrr_at_3 value: 37.314 - type: mrr_at_5 value: 38.507999999999996 - type: ndcg_at_1 value: 31.72 - type: ndcg_at_10 value: 38.933 - type: ndcg_at_100 value: 43.024 - type: ndcg_at_1000 value: 45.556999999999995 - type: ndcg_at_3 value: 35.225 - type: ndcg_at_5 value: 36.984 - type: precision_at_1 value: 31.72 - type: precision_at_10 value: 7.248 - type: precision_at_100 value: 1.192 - type: precision_at_1000 value: 0.16999999999999998 - type: precision_at_3 value: 16.943 - type: precision_at_5 value: 11.975 - type: recall_at_1 value: 25.563999999999997 - type: recall_at_10 value: 47.808 - type: recall_at_100 value: 65.182 - type: recall_at_1000 value: 81.831 - type: recall_at_3 value: 36.889 - type: recall_at_5 value: 41.829 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackGamingRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 33.662 - type: map_at_10 value: 44.096999999999994 - type: map_at_100 value: 45.153999999999996 - type: map_at_1000 value: 45.223 - type: map_at_3 value: 41.377 - type: map_at_5 value: 42.935 - type: mrr_at_1 value: 38.997 - type: mrr_at_10 value: 47.675 - type: mrr_at_100 value: 48.476 - type: mrr_at_1000 value: 48.519 - type: mrr_at_3 value: 45.549 - type: mrr_at_5 value: 46.884 - type: ndcg_at_1 value: 38.997 - type: ndcg_at_10 value: 49.196 - type: ndcg_at_100 value: 53.788000000000004 - type: ndcg_at_1000 value: 55.393 - type: ndcg_at_3 value: 44.67 - type: ndcg_at_5 value: 46.991 - type: precision_at_1 value: 38.997 - type: precision_at_10 value: 7.875 - type: precision_at_100 value: 1.102 - type: precision_at_1000 value: 0.13 - type: precision_at_3 value: 19.854 - type: precision_at_5 value: 13.605 - type: recall_at_1 value: 33.662 - type: recall_at_10 value: 60.75899999999999 - type: recall_at_100 value: 81.11699999999999 - type: recall_at_1000 value: 92.805 - type: recall_at_3 value: 48.577999999999996 - type: recall_at_5 value: 54.384 - task: type: Retrieval dataset: 
type: BeIR/cqadupstack name: MTEB CQADupstackGisRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 21.313 - type: map_at_10 value: 29.036 - type: map_at_100 value: 29.975 - type: map_at_1000 value: 30.063000000000002 - type: map_at_3 value: 26.878999999999998 - type: map_at_5 value: 28.005999999999997 - type: mrr_at_1 value: 23.39 - type: mrr_at_10 value: 31.072 - type: mrr_at_100 value: 31.922 - type: mrr_at_1000 value: 31.995 - type: mrr_at_3 value: 28.908 - type: mrr_at_5 value: 30.104999999999997 - type: ndcg_at_1 value: 23.39 - type: ndcg_at_10 value: 33.448 - type: ndcg_at_100 value: 38.255 - type: ndcg_at_1000 value: 40.542 - type: ndcg_at_3 value: 29.060000000000002 - type: ndcg_at_5 value: 31.023 - type: precision_at_1 value: 23.39 - type: precision_at_10 value: 5.175 - type: precision_at_100 value: 0.8049999999999999 - type: precision_at_1000 value: 0.10300000000000001 - type: precision_at_3 value: 12.504999999999999 - type: precision_at_5 value: 8.61 - type: recall_at_1 value: 21.313 - type: recall_at_10 value: 45.345 - type: recall_at_100 value: 67.752 - type: recall_at_1000 value: 84.937 - type: recall_at_3 value: 33.033 - type: recall_at_5 value: 37.929 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackMathematicaRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 14.255999999999998 - type: map_at_10 value: 20.339 - type: map_at_100 value: 21.491 - type: map_at_1000 value: 21.616 - type: map_at_3 value: 18.481 - type: map_at_5 value: 19.594 - type: mrr_at_1 value: 17.413 - type: mrr_at_10 value: 24.146 - type: mrr_at_100 value: 25.188 - type: mrr_at_1000 value: 25.273 - type: mrr_at_3 value: 22.264 - type: mrr_at_5 value: 23.302 - type: ndcg_at_1 value: 17.413 - type: ndcg_at_10 value: 24.272 - type: ndcg_at_100 value: 29.82 - type: ndcg_at_1000 value: 33.072 - type: ndcg_at_3 value: 20.826 - type: ndcg_at_5 value: 22.535 - type: precision_at_1 value: 
17.413 - type: precision_at_10 value: 4.366 - type: precision_at_100 value: 0.818 - type: precision_at_1000 value: 0.124 - type: precision_at_3 value: 9.866999999999999 - type: precision_at_5 value: 7.164 - type: recall_at_1 value: 14.255999999999998 - type: recall_at_10 value: 32.497 - type: recall_at_100 value: 56.592 - type: recall_at_1000 value: 80.17699999999999 - type: recall_at_3 value: 23.195 - type: recall_at_5 value: 27.392 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackPhysicsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 22.709 - type: map_at_10 value: 31.377 - type: map_at_100 value: 32.536 - type: map_at_1000 value: 32.669 - type: map_at_3 value: 28.572999999999997 - type: map_at_5 value: 30.205 - type: mrr_at_1 value: 27.815 - type: mrr_at_10 value: 36.452 - type: mrr_at_100 value: 37.302 - type: mrr_at_1000 value: 37.364000000000004 - type: mrr_at_3 value: 33.75 - type: mrr_at_5 value: 35.43 - type: ndcg_at_1 value: 27.815 - type: ndcg_at_10 value: 36.84 - type: ndcg_at_100 value: 42.092 - type: ndcg_at_1000 value: 44.727 - type: ndcg_at_3 value: 31.964 - type: ndcg_at_5 value: 34.428 - type: precision_at_1 value: 27.815 - type: precision_at_10 value: 6.67 - type: precision_at_100 value: 1.093 - type: precision_at_1000 value: 0.151 - type: precision_at_3 value: 14.982000000000001 - type: precision_at_5 value: 10.857 - type: recall_at_1 value: 22.709 - type: recall_at_10 value: 48.308 - type: recall_at_100 value: 70.866 - type: recall_at_1000 value: 88.236 - type: recall_at_3 value: 34.709 - type: recall_at_5 value: 40.996 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackProgrammersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 22.348000000000003 - type: map_at_10 value: 29.427999999999997 - type: map_at_100 value: 30.499 - type: map_at_1000 value: 30.631999999999998 - type: map_at_3 value: 27.035999999999998 - 
type: map_at_5 value: 28.351 - type: mrr_at_1 value: 27.74 - type: mrr_at_10 value: 34.424 - type: mrr_at_100 value: 35.341 - type: mrr_at_1000 value: 35.419 - type: mrr_at_3 value: 32.401 - type: mrr_at_5 value: 33.497 - type: ndcg_at_1 value: 27.74 - type: ndcg_at_10 value: 34.136 - type: ndcg_at_100 value: 39.269 - type: ndcg_at_1000 value: 42.263 - type: ndcg_at_3 value: 30.171999999999997 - type: ndcg_at_5 value: 31.956 - type: precision_at_1 value: 27.74 - type: precision_at_10 value: 6.062 - type: precision_at_100 value: 1.014 - type: precision_at_1000 value: 0.146 - type: precision_at_3 value: 14.079 - type: precision_at_5 value: 9.977 - type: recall_at_1 value: 22.348000000000003 - type: recall_at_10 value: 43.477 - type: recall_at_100 value: 65.945 - type: recall_at_1000 value: 86.587 - type: recall_at_3 value: 32.107 - type: recall_at_5 value: 36.974000000000004 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 21.688499999999998 - type: map_at_10 value: 29.164666666666665 - type: map_at_100 value: 30.22575 - type: map_at_1000 value: 30.350833333333334 - type: map_at_3 value: 26.82025 - type: map_at_5 value: 28.14966666666667 - type: mrr_at_1 value: 25.779249999999998 - type: mrr_at_10 value: 32.969 - type: mrr_at_100 value: 33.81725 - type: mrr_at_1000 value: 33.88825 - type: mrr_at_3 value: 30.831250000000004 - type: mrr_at_5 value: 32.065000000000005 - type: ndcg_at_1 value: 25.779249999999998 - type: ndcg_at_10 value: 33.73675 - type: ndcg_at_100 value: 38.635666666666665 - type: ndcg_at_1000 value: 41.353500000000004 - type: ndcg_at_3 value: 29.66283333333333 - type: ndcg_at_5 value: 31.607249999999997 - type: precision_at_1 value: 25.779249999999998 - type: precision_at_10 value: 5.861416666666667 - type: precision_at_100 value: 0.9852500000000002 - type: precision_at_1000 value: 0.14108333333333334 - type: precision_at_3 value: 
13.563583333333332 - type: precision_at_5 value: 9.630333333333335 - type: recall_at_1 value: 21.688499999999998 - type: recall_at_10 value: 43.605 - type: recall_at_100 value: 65.52366666666667 - type: recall_at_1000 value: 84.69683333333332 - type: recall_at_3 value: 32.195499999999996 - type: recall_at_5 value: 37.25325 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackStatsRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 17.279 - type: map_at_10 value: 23.238 - type: map_at_100 value: 24.026 - type: map_at_1000 value: 24.13 - type: map_at_3 value: 20.730999999999998 - type: map_at_5 value: 22.278000000000002 - type: mrr_at_1 value: 19.017999999999997 - type: mrr_at_10 value: 25.188 - type: mrr_at_100 value: 25.918999999999997 - type: mrr_at_1000 value: 25.996999999999996 - type: mrr_at_3 value: 22.776 - type: mrr_at_5 value: 24.256 - type: ndcg_at_1 value: 19.017999999999997 - type: ndcg_at_10 value: 27.171 - type: ndcg_at_100 value: 31.274 - type: ndcg_at_1000 value: 34.016000000000005 - type: ndcg_at_3 value: 22.442 - type: ndcg_at_5 value: 24.955 - type: precision_at_1 value: 19.017999999999997 - type: precision_at_10 value: 4.494 - type: precision_at_100 value: 0.712 - type: precision_at_1000 value: 0.10300000000000001 - type: precision_at_3 value: 9.611 - type: precision_at_5 value: 7.331 - type: recall_at_1 value: 17.279 - type: recall_at_10 value: 37.464999999999996 - type: recall_at_100 value: 56.458 - type: recall_at_1000 value: 76.759 - type: recall_at_3 value: 24.659 - type: recall_at_5 value: 30.672 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackTexRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 14.901 - type: map_at_10 value: 20.268 - type: map_at_100 value: 21.143 - type: map_at_1000 value: 21.264 - type: map_at_3 value: 18.557000000000002 - type: map_at_5 value: 19.483 - type: mrr_at_1 value: 17.997 - type: 
mrr_at_10 value: 23.591 - type: mrr_at_100 value: 24.387 - type: mrr_at_1000 value: 24.471 - type: mrr_at_3 value: 21.874 - type: mrr_at_5 value: 22.797 - type: ndcg_at_1 value: 17.997 - type: ndcg_at_10 value: 23.87 - type: ndcg_at_100 value: 28.459 - type: ndcg_at_1000 value: 31.66 - type: ndcg_at_3 value: 20.779 - type: ndcg_at_5 value: 22.137 - type: precision_at_1 value: 17.997 - type: precision_at_10 value: 4.25 - type: precision_at_100 value: 0.761 - type: precision_at_1000 value: 0.121 - type: precision_at_3 value: 9.716 - type: precision_at_5 value: 6.909999999999999 - type: recall_at_1 value: 14.901 - type: recall_at_10 value: 31.44 - type: recall_at_100 value: 52.717000000000006 - type: recall_at_1000 value: 76.102 - type: recall_at_3 value: 22.675 - type: recall_at_5 value: 26.336 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackUnixRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 21.52 - type: map_at_10 value: 28.397 - type: map_at_100 value: 29.443 - type: map_at_1000 value: 29.56 - type: map_at_3 value: 26.501 - type: map_at_5 value: 27.375 - type: mrr_at_1 value: 25.28 - type: mrr_at_10 value: 32.102000000000004 - type: mrr_at_100 value: 33.005 - type: mrr_at_1000 value: 33.084 - type: mrr_at_3 value: 30.208000000000002 - type: mrr_at_5 value: 31.146 - type: ndcg_at_1 value: 25.28 - type: ndcg_at_10 value: 32.635 - type: ndcg_at_100 value: 37.672 - type: ndcg_at_1000 value: 40.602 - type: ndcg_at_3 value: 28.951999999999998 - type: ndcg_at_5 value: 30.336999999999996 - type: precision_at_1 value: 25.28 - type: precision_at_10 value: 5.3260000000000005 - type: precision_at_100 value: 0.8840000000000001 - type: precision_at_1000 value: 0.126 - type: precision_at_3 value: 12.687000000000001 - type: precision_at_5 value: 8.638 - type: recall_at_1 value: 21.52 - type: recall_at_10 value: 41.955 - type: recall_at_100 value: 64.21 - type: recall_at_1000 value: 85.28099999999999 - type: 
recall_at_3 value: 31.979999999999997 - type: recall_at_5 value: 35.406 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWebmastersRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 20.296 - type: map_at_10 value: 28.449999999999996 - type: map_at_100 value: 29.847 - type: map_at_1000 value: 30.073 - type: map_at_3 value: 25.995 - type: map_at_5 value: 27.603 - type: mrr_at_1 value: 25.296000000000003 - type: mrr_at_10 value: 32.751999999999995 - type: mrr_at_100 value: 33.705 - type: mrr_at_1000 value: 33.783 - type: mrr_at_3 value: 30.731 - type: mrr_at_5 value: 32.006 - type: ndcg_at_1 value: 25.296000000000003 - type: ndcg_at_10 value: 33.555 - type: ndcg_at_100 value: 38.891999999999996 - type: ndcg_at_1000 value: 42.088 - type: ndcg_at_3 value: 29.944 - type: ndcg_at_5 value: 31.997999999999998 - type: precision_at_1 value: 25.296000000000003 - type: precision_at_10 value: 6.542000000000001 - type: precision_at_100 value: 1.354 - type: precision_at_1000 value: 0.22599999999999998 - type: precision_at_3 value: 14.360999999999999 - type: precision_at_5 value: 10.593 - type: recall_at_1 value: 20.296 - type: recall_at_10 value: 42.742000000000004 - type: recall_at_100 value: 67.351 - type: recall_at_1000 value: 88.774 - type: recall_at_3 value: 32.117000000000004 - type: recall_at_5 value: 37.788 - task: type: Retrieval dataset: type: BeIR/cqadupstack name: MTEB CQADupstackWordpressRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 18.157999999999998 - type: map_at_10 value: 24.342 - type: map_at_100 value: 25.201 - type: map_at_1000 value: 25.317 - type: map_at_3 value: 22.227 - type: map_at_5 value: 23.372999999999998 - type: mrr_at_1 value: 19.778000000000002 - type: mrr_at_10 value: 26.066 - type: mrr_at_100 value: 26.935 - type: mrr_at_1000 value: 27.022000000000002 - type: mrr_at_3 value: 24.214 - type: mrr_at_5 value: 25.268 - type: ndcg_at_1 value: 
19.778000000000002 - type: ndcg_at_10 value: 28.104000000000003 - type: ndcg_at_100 value: 32.87 - type: ndcg_at_1000 value: 35.858000000000004 - type: ndcg_at_3 value: 24.107 - type: ndcg_at_5 value: 26.007 - type: precision_at_1 value: 19.778000000000002 - type: precision_at_10 value: 4.417999999999999 - type: precision_at_100 value: 0.739 - type: precision_at_1000 value: 0.109 - type: precision_at_3 value: 10.228 - type: precision_at_5 value: 7.172000000000001 - type: recall_at_1 value: 18.157999999999998 - type: recall_at_10 value: 37.967 - type: recall_at_100 value: 60.806000000000004 - type: recall_at_1000 value: 83.097 - type: recall_at_3 value: 27.223999999999997 - type: recall_at_5 value: 31.968000000000004 - task: type: Retrieval dataset: type: climate-fever name: MTEB ClimateFEVER config: default split: test revision: None metrics: - type: map_at_1 value: 7.055 - type: map_at_10 value: 11.609 - type: map_at_100 value: 12.83 - type: map_at_1000 value: 12.995000000000001 - type: map_at_3 value: 9.673 - type: map_at_5 value: 10.761999999999999 - type: mrr_at_1 value: 15.309000000000001 - type: mrr_at_10 value: 23.655 - type: mrr_at_100 value: 24.785 - type: mrr_at_1000 value: 24.856 - type: mrr_at_3 value: 20.499000000000002 - type: mrr_at_5 value: 22.425 - type: ndcg_at_1 value: 15.309000000000001 - type: ndcg_at_10 value: 17.252000000000002 - type: ndcg_at_100 value: 22.976 - type: ndcg_at_1000 value: 26.480999999999998 - type: ndcg_at_3 value: 13.418 - type: ndcg_at_5 value: 15.084 - type: precision_at_1 value: 15.309000000000001 - type: precision_at_10 value: 5.309 - type: precision_at_100 value: 1.1320000000000001 - type: precision_at_1000 value: 0.17600000000000002 - type: precision_at_3 value: 9.62 - type: precision_at_5 value: 7.883 - type: recall_at_1 value: 7.055 - type: recall_at_10 value: 21.891 - type: recall_at_100 value: 41.979 - type: recall_at_1000 value: 62.239999999999995 - type: recall_at_3 value: 12.722 - type: recall_at_5 value: 16.81 
- task: type: Retrieval dataset: type: dbpedia-entity name: MTEB DBPedia config: default split: test revision: None metrics: - type: map_at_1 value: 6.909 - type: map_at_10 value: 12.844 - type: map_at_100 value: 16.435 - type: map_at_1000 value: 17.262 - type: map_at_3 value: 10.131 - type: map_at_5 value: 11.269 - type: mrr_at_1 value: 54.50000000000001 - type: mrr_at_10 value: 62.202 - type: mrr_at_100 value: 62.81 - type: mrr_at_1000 value: 62.824000000000005 - type: mrr_at_3 value: 60.5 - type: mrr_at_5 value: 61.324999999999996 - type: ndcg_at_1 value: 42.125 - type: ndcg_at_10 value: 28.284 - type: ndcg_at_100 value: 30.444 - type: ndcg_at_1000 value: 36.397 - type: ndcg_at_3 value: 33.439 - type: ndcg_at_5 value: 30.473 - type: precision_at_1 value: 54.50000000000001 - type: precision_at_10 value: 21.4 - type: precision_at_100 value: 6.192 - type: precision_at_1000 value: 1.398 - type: precision_at_3 value: 36.583 - type: precision_at_5 value: 28.799999999999997 - type: recall_at_1 value: 6.909 - type: recall_at_10 value: 17.296 - type: recall_at_100 value: 33.925 - type: recall_at_1000 value: 53.786 - type: recall_at_3 value: 11.333 - type: recall_at_5 value: 13.529 - task: type: Classification dataset: type: mteb/emotion name: MTEB EmotionClassification config: default split: test revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37 metrics: - type: accuracy value: 36.08 - type: f1 value: 33.016420191943766 - task: type: Retrieval dataset: type: fever name: MTEB FEVER config: default split: test revision: None metrics: - type: map_at_1 value: 52.605000000000004 - type: map_at_10 value: 63.31400000000001 - type: map_at_100 value: 63.678000000000004 - type: map_at_1000 value: 63.699 - type: map_at_3 value: 61.141 - type: map_at_5 value: 62.517999999999994 - type: mrr_at_1 value: 56.871 - type: mrr_at_10 value: 67.915 - type: mrr_at_100 value: 68.24900000000001 - type: mrr_at_1000 value: 68.262 - type: mrr_at_3 value: 65.809 - type: mrr_at_5 value: 67.171 - 
type: ndcg_at_1 value: 56.871 - type: ndcg_at_10 value: 69.122 - type: ndcg_at_100 value: 70.855 - type: ndcg_at_1000 value: 71.368 - type: ndcg_at_3 value: 64.974 - type: ndcg_at_5 value: 67.318 - type: precision_at_1 value: 56.871 - type: precision_at_10 value: 9.029 - type: precision_at_100 value: 0.996 - type: precision_at_1000 value: 0.105 - type: precision_at_3 value: 25.893 - type: precision_at_5 value: 16.838 - type: recall_at_1 value: 52.605000000000004 - type: recall_at_10 value: 82.679 - type: recall_at_100 value: 90.586 - type: recall_at_1000 value: 94.38 - type: recall_at_3 value: 71.447 - type: recall_at_5 value: 77.218 - task: type: Retrieval dataset: type: fiqa name: MTEB FiQA2018 config: default split: test revision: None metrics: - type: map_at_1 value: 10.759 - type: map_at_10 value: 18.877 - type: map_at_100 value: 20.498 - type: map_at_1000 value: 20.682000000000002 - type: map_at_3 value: 16.159000000000002 - type: map_at_5 value: 17.575 - type: mrr_at_1 value: 22.531000000000002 - type: mrr_at_10 value: 31.155 - type: mrr_at_100 value: 32.188 - type: mrr_at_1000 value: 32.245000000000005 - type: mrr_at_3 value: 28.781000000000002 - type: mrr_at_5 value: 30.054 - type: ndcg_at_1 value: 22.531000000000002 - type: ndcg_at_10 value: 25.189 - type: ndcg_at_100 value: 31.958 - type: ndcg_at_1000 value: 35.693999999999996 - type: ndcg_at_3 value: 22.235 - type: ndcg_at_5 value: 23.044999999999998 - type: precision_at_1 value: 22.531000000000002 - type: precision_at_10 value: 7.438000000000001 - type: precision_at_100 value: 1.418 - type: precision_at_1000 value: 0.208 - type: precision_at_3 value: 15.329 - type: precision_at_5 value: 11.451 - type: recall_at_1 value: 10.759 - type: recall_at_10 value: 31.416 - type: recall_at_100 value: 56.989000000000004 - type: recall_at_1000 value: 80.33200000000001 - type: recall_at_3 value: 20.61 - type: recall_at_5 value: 24.903 - task: type: Retrieval dataset: type: hotpotqa name: MTEB HotpotQA config: 
default split: test revision: None metrics: - type: map_at_1 value: 29.21 - type: map_at_10 value: 38.765 - type: map_at_100 value: 39.498 - type: map_at_1000 value: 39.568 - type: map_at_3 value: 36.699 - type: map_at_5 value: 37.925 - type: mrr_at_1 value: 58.42 - type: mrr_at_10 value: 65.137 - type: mrr_at_100 value: 65.542 - type: mrr_at_1000 value: 65.568 - type: mrr_at_3 value: 63.698 - type: mrr_at_5 value: 64.575 - type: ndcg_at_1 value: 58.42 - type: ndcg_at_10 value: 47.476 - type: ndcg_at_100 value: 50.466 - type: ndcg_at_1000 value: 52.064 - type: ndcg_at_3 value: 43.986 - type: ndcg_at_5 value: 45.824 - type: precision_at_1 value: 58.42 - type: precision_at_10 value: 9.649000000000001 - type: precision_at_100 value: 1.201 - type: precision_at_1000 value: 0.14100000000000001 - type: precision_at_3 value: 26.977 - type: precision_at_5 value: 17.642 - type: recall_at_1 value: 29.21 - type: recall_at_10 value: 48.244 - type: recall_at_100 value: 60.041 - type: recall_at_1000 value: 70.743 - type: recall_at_3 value: 40.466 - type: recall_at_5 value: 44.105 - task: type: Classification dataset: type: mteb/imdb name: MTEB ImdbClassification config: default split: test revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 metrics: - type: accuracy value: 58.7064 - type: ap value: 55.36326227125519 - type: f1 value: 57.46763115215848 - task: type: Retrieval dataset: type: msmarco name: MTEB MSMARCO config: default split: dev revision: None metrics: - type: map_at_1 value: 15.889000000000001 - type: map_at_10 value: 25.979000000000003 - type: map_at_100 value: 27.21 - type: map_at_1000 value: 27.284000000000002 - type: map_at_3 value: 22.665 - type: map_at_5 value: 24.578 - type: mrr_at_1 value: 16.39 - type: mrr_at_10 value: 26.504 - type: mrr_at_100 value: 27.689999999999998 - type: mrr_at_1000 value: 27.758 - type: mrr_at_3 value: 23.24 - type: mrr_at_5 value: 25.108000000000004 - type: ndcg_at_1 value: 16.39 - type: ndcg_at_10 value: 31.799 - type: ndcg_at_100 
value: 38.034 - type: ndcg_at_1000 value: 39.979 - type: ndcg_at_3 value: 25.054 - type: ndcg_at_5 value: 28.463 - type: precision_at_1 value: 16.39 - type: precision_at_10 value: 5.189 - type: precision_at_100 value: 0.835 - type: precision_at_1000 value: 0.1 - type: precision_at_3 value: 10.84 - type: precision_at_5 value: 8.238 - type: recall_at_1 value: 15.889000000000001 - type: recall_at_10 value: 49.739 - type: recall_at_100 value: 79.251 - type: recall_at_1000 value: 94.298 - type: recall_at_3 value: 31.427 - type: recall_at_5 value: 39.623000000000005 - task: type: Classification dataset: type: mteb/mtop_domain name: MTEB MTOPDomainClassification (en) config: en split: test revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf metrics: - type: accuracy value: 88.81668946648426 - type: f1 value: 88.55200075528438 - task: type: Classification dataset: type: mteb/mtop_intent name: MTEB MTOPIntentClassification (en) config: en split: test revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba metrics: - type: accuracy value: 58.611491108071135 - type: f1 value: 42.12391403999353 - task: type: Classification dataset: type: mteb/amazon_massive_intent name: MTEB MassiveIntentClassification (en) config: en split: test revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 metrics: - type: accuracy value: 64.67047747141896 - type: f1 value: 62.88410885922258 - task: type: Classification dataset: type: mteb/amazon_massive_scenario name: MTEB MassiveScenarioClassification (en) config: en split: test revision: 7d571f92784cd94a019292a1f45445077d0ef634 metrics: - type: accuracy value: 71.78547410894419 - type: f1 value: 71.69467869218154 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-p2p name: MTEB MedrxivClusteringP2P config: default split: test revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 metrics: - type: v_measure value: 27.23799937752035 - task: type: Clustering dataset: type: mteb/medrxiv-clustering-s2s name: MTEB MedrxivClusteringS2S config: default 
split: test revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 metrics: - type: v_measure value: 23.26502601343789 - task: type: Reranking dataset: type: mteb/mind_small name: MTEB MindSmallReranking config: default split: test revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 metrics: - type: map value: 30.680711484149832 - type: mrr value: 31.705059795117307 - task: type: Retrieval dataset: type: nfcorpus name: MTEB NFCorpus config: default split: test revision: None metrics: - type: map_at_1 value: 4.077 - type: map_at_10 value: 8.657 - type: map_at_100 value: 10.753 - type: map_at_1000 value: 11.885 - type: map_at_3 value: 6.5089999999999995 - type: map_at_5 value: 7.405 - type: mrr_at_1 value: 38.7 - type: mrr_at_10 value: 46.065 - type: mrr_at_100 value: 46.772000000000006 - type: mrr_at_1000 value: 46.83 - type: mrr_at_3 value: 44.118 - type: mrr_at_5 value: 45.015 - type: ndcg_at_1 value: 36.997 - type: ndcg_at_10 value: 25.96 - type: ndcg_at_100 value: 23.607 - type: ndcg_at_1000 value: 32.317 - type: ndcg_at_3 value: 31.06 - type: ndcg_at_5 value: 28.921000000000003 - type: precision_at_1 value: 38.7 - type: precision_at_10 value: 19.195 - type: precision_at_100 value: 6.164 - type: precision_at_1000 value: 1.839 - type: precision_at_3 value: 28.999000000000002 - type: precision_at_5 value: 25.014999999999997 - type: recall_at_1 value: 4.077 - type: recall_at_10 value: 11.802 - type: recall_at_100 value: 24.365000000000002 - type: recall_at_1000 value: 55.277 - type: recall_at_3 value: 7.435 - type: recall_at_5 value: 8.713999999999999 - task: type: Retrieval dataset: type: nq name: MTEB NQ config: default split: test revision: None metrics: - type: map_at_1 value: 19.588 - type: map_at_10 value: 32.08 - type: map_at_100 value: 33.32 - type: map_at_1000 value: 33.377 - type: map_at_3 value: 28.166000000000004 - type: map_at_5 value: 30.383 - type: mrr_at_1 value: 22.161 - type: mrr_at_10 value: 34.121 - type: mrr_at_100 value: 35.171 - type: mrr_at_1000 
value: 35.214 - type: mrr_at_3 value: 30.692000000000004 - type: mrr_at_5 value: 32.706 - type: ndcg_at_1 value: 22.131999999999998 - type: ndcg_at_10 value: 38.887 - type: ndcg_at_100 value: 44.433 - type: ndcg_at_1000 value: 45.823 - type: ndcg_at_3 value: 31.35 - type: ndcg_at_5 value: 35.144 - type: precision_at_1 value: 22.131999999999998 - type: precision_at_10 value: 6.8629999999999995 - type: precision_at_100 value: 0.993 - type: precision_at_1000 value: 0.11199999999999999 - type: precision_at_3 value: 14.706 - type: precision_at_5 value: 10.972999999999999 - type: recall_at_1 value: 19.588 - type: recall_at_10 value: 57.703 - type: recall_at_100 value: 82.194 - type: recall_at_1000 value: 92.623 - type: recall_at_3 value: 38.012 - type: recall_at_5 value: 46.847 - task: type: Retrieval dataset: type: quora name: MTEB QuoraRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 68.038 - type: map_at_10 value: 81.572 - type: map_at_100 value: 82.25200000000001 - type: map_at_1000 value: 82.27600000000001 - type: map_at_3 value: 78.618 - type: map_at_5 value: 80.449 - type: mrr_at_1 value: 78.31 - type: mrr_at_10 value: 84.98 - type: mrr_at_100 value: 85.122 - type: mrr_at_1000 value: 85.124 - type: mrr_at_3 value: 83.852 - type: mrr_at_5 value: 84.6 - type: ndcg_at_1 value: 78.31 - type: ndcg_at_10 value: 85.693 - type: ndcg_at_100 value: 87.191 - type: ndcg_at_1000 value: 87.386 - type: ndcg_at_3 value: 82.585 - type: ndcg_at_5 value: 84.255 - type: precision_at_1 value: 78.31 - type: precision_at_10 value: 12.986 - type: precision_at_100 value: 1.505 - type: precision_at_1000 value: 0.156 - type: precision_at_3 value: 36.007 - type: precision_at_5 value: 23.735999999999997 - type: recall_at_1 value: 68.038 - type: recall_at_10 value: 93.598 - type: recall_at_100 value: 98.869 - type: recall_at_1000 value: 99.86500000000001 - type: recall_at_3 value: 84.628 - type: recall_at_5 value: 89.316 - task: type: Clustering dataset: 
type: mteb/reddit-clustering name: MTEB RedditClustering config: default split: test revision: 24640382cdbf8abc73003fb0fa6d111a705499eb metrics: - type: v_measure value: 37.948231664922865 - task: type: Clustering dataset: type: mteb/reddit-clustering-p2p name: MTEB RedditClusteringP2P config: default split: test revision: 282350215ef01743dc01b456c7f5241fa8937f16 metrics: - type: v_measure value: 49.90597913763894 - task: type: Retrieval dataset: type: scidocs name: MTEB SCIDOCS config: default split: test revision: None metrics: - type: map_at_1 value: 3.753 - type: map_at_10 value: 8.915 - type: map_at_100 value: 10.374 - type: map_at_1000 value: 10.612 - type: map_at_3 value: 6.577 - type: map_at_5 value: 7.8 - type: mrr_at_1 value: 18.4 - type: mrr_at_10 value: 27.325 - type: mrr_at_100 value: 28.419 - type: mrr_at_1000 value: 28.494000000000003 - type: mrr_at_3 value: 24.349999999999998 - type: mrr_at_5 value: 26.205000000000002 - type: ndcg_at_1 value: 18.4 - type: ndcg_at_10 value: 15.293000000000001 - type: ndcg_at_100 value: 21.592 - type: ndcg_at_1000 value: 26.473000000000003 - type: ndcg_at_3 value: 14.748 - type: ndcg_at_5 value: 12.98 - type: precision_at_1 value: 18.4 - type: precision_at_10 value: 7.779999999999999 - type: precision_at_100 value: 1.693 - type: precision_at_1000 value: 0.28800000000000003 - type: precision_at_3 value: 13.700000000000001 - type: precision_at_5 value: 11.379999999999999 - type: recall_at_1 value: 3.753 - type: recall_at_10 value: 15.806999999999999 - type: recall_at_100 value: 34.37 - type: recall_at_1000 value: 58.463 - type: recall_at_3 value: 8.338 - type: recall_at_5 value: 11.538 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 82.58843987639705 - type: cos_sim_spearman value: 76.33071660715956 - type: euclidean_pearson value: 72.8029921002978 - type: euclidean_spearman value: 
69.34534284782808 - type: manhattan_pearson value: 72.49781034973653 - type: manhattan_spearman value: 69.24754112621694 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 83.31673079903189 - type: cos_sim_spearman value: 74.27699263517789 - type: euclidean_pearson value: 69.4008910999579 - type: euclidean_spearman value: 59.0716984643048 - type: manhattan_pearson value: 68.87342686919199 - type: manhattan_spearman value: 58.904612865335025 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 77.59122302327788 - type: cos_sim_spearman value: 78.55383586979005 - type: euclidean_pearson value: 68.18338642204289 - type: euclidean_spearman value: 68.95092864180276 - type: manhattan_pearson value: 68.08807059822706 - type: manhattan_spearman value: 68.86135938270193 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 78.51766841424501 - type: cos_sim_spearman value: 73.84318001499558 - type: euclidean_pearson value: 67.2007138855177 - type: euclidean_spearman value: 63.98672842723766 - type: manhattan_pearson value: 67.17773810895949 - type: manhattan_spearman value: 64.07359154832962 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 82.73438541570299 - type: cos_sim_spearman value: 83.71357922283677 - type: euclidean_pearson value: 57.50131347498546 - type: euclidean_spearman value: 57.73623619252132 - type: manhattan_pearson value: 58.082992079000725 - type: manhattan_spearman value: 58.42728201167522 - task: type: STS dataset: type: mteb/sts16-sts 
name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 78.14794654172421 - type: cos_sim_spearman value: 80.025736165043 - type: euclidean_pearson value: 65.87773913985473 - type: euclidean_spearman value: 66.69337751784794 - type: manhattan_pearson value: 66.01039761004415 - type: manhattan_spearman value: 66.89215027952318 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (en-en) config: en-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 87.10554507136152 - type: cos_sim_spearman value: 87.4898082140765 - type: euclidean_pearson value: 72.19391114541367 - type: euclidean_spearman value: 70.36647944993783 - type: manhattan_pearson value: 72.18680758133698 - type: manhattan_spearman value: 70.3871215447305 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 metrics: - type: cos_sim_pearson value: 64.54868111501618 - type: cos_sim_spearman value: 64.25173617448473 - type: euclidean_pearson value: 39.116088900637116 - type: euclidean_spearman value: 53.300772929884 - type: manhattan_pearson value: 38.3844195287959 - type: manhattan_spearman value: 52.846675312001246 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 80.04396610550214 - type: cos_sim_spearman value: 79.19504854997832 - type: euclidean_pearson value: 66.3284657637072 - type: euclidean_spearman value: 63.69531796729492 - type: manhattan_pearson value: 66.82324081038026 - type: manhattan_spearman value: 64.18254512904923 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - 
type: map value: 74.16264051781705 - type: mrr value: 91.80864796060874 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test revision: None metrics: - type: map_at_1 value: 38.983000000000004 - type: map_at_10 value: 47.858000000000004 - type: map_at_100 value: 48.695 - type: map_at_1000 value: 48.752 - type: map_at_3 value: 45.444 - type: map_at_5 value: 46.906 - type: mrr_at_1 value: 41.333 - type: mrr_at_10 value: 49.935 - type: mrr_at_100 value: 50.51 - type: mrr_at_1000 value: 50.55500000000001 - type: mrr_at_3 value: 47.833 - type: mrr_at_5 value: 49.117 - type: ndcg_at_1 value: 41.333 - type: ndcg_at_10 value: 52.398999999999994 - type: ndcg_at_100 value: 56.196 - type: ndcg_at_1000 value: 57.838 - type: ndcg_at_3 value: 47.987 - type: ndcg_at_5 value: 50.356 - type: precision_at_1 value: 41.333 - type: precision_at_10 value: 7.167 - type: precision_at_100 value: 0.9299999999999999 - type: precision_at_1000 value: 0.108 - type: precision_at_3 value: 19.0 - type: precision_at_5 value: 12.8 - type: recall_at_1 value: 38.983000000000004 - type: recall_at_10 value: 64.183 - type: recall_at_100 value: 82.02199999999999 - type: recall_at_1000 value: 95.167 - type: recall_at_3 value: 52.383 - type: recall_at_5 value: 58.411 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.8019801980198 - type: cos_sim_ap value: 94.9287554635848 - type: cos_sim_f1 value: 89.83739837398375 - type: cos_sim_precision value: 91.32231404958677 - type: cos_sim_recall value: 88.4 - type: dot_accuracy value: 99.23762376237623 - type: dot_ap value: 55.22534191245801 - type: dot_f1 value: 54.054054054054056 - type: dot_precision value: 55.15088449531738 - type: dot_recall value: 53.0 - type: euclidean_accuracy value: 99.6108910891089 - type: 
euclidean_ap value: 82.5195111329438 - type: euclidean_f1 value: 78.2847718526663 - type: euclidean_precision value: 86.93528693528694 - type: euclidean_recall value: 71.2 - type: manhattan_accuracy value: 99.5970297029703 - type: manhattan_ap value: 81.96876777875492 - type: manhattan_f1 value: 77.33773377337734 - type: manhattan_precision value: 85.94132029339853 - type: manhattan_recall value: 70.3 - type: max_accuracy value: 99.8019801980198 - type: max_ap value: 94.9287554635848 - type: max_f1 value: 89.83739837398375 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 46.34997003954114 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 31.462336020554893 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 47.1757817459526 - type: mrr value: 47.941057104660054 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 30.56106249068471 - type: cos_sim_spearman value: 31.24613190558528 - type: dot_pearson value: 20.486610035794257 - type: dot_spearman value: 23.115667545894546 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.182 - type: map_at_10 value: 1.155 - type: map_at_100 value: 5.118 - type: map_at_1000 value: 11.827 - type: map_at_3 value: 0.482 - type: map_at_5 value: 0.712 - type: mrr_at_1 value: 70.0 - type: 
mrr_at_10 value: 79.483 - type: mrr_at_100 value: 79.637 - type: mrr_at_1000 value: 79.637 - type: mrr_at_3 value: 77.667 - type: mrr_at_5 value: 78.567 - type: ndcg_at_1 value: 63.0 - type: ndcg_at_10 value: 52.303 - type: ndcg_at_100 value: 37.361 - type: ndcg_at_1000 value: 32.84 - type: ndcg_at_3 value: 58.274 - type: ndcg_at_5 value: 55.601 - type: precision_at_1 value: 70.0 - type: precision_at_10 value: 55.60000000000001 - type: precision_at_100 value: 37.96 - type: precision_at_1000 value: 14.738000000000001 - type: precision_at_3 value: 62.666999999999994 - type: precision_at_5 value: 60.0 - type: recall_at_1 value: 0.182 - type: recall_at_10 value: 1.4120000000000001 - type: recall_at_100 value: 8.533 - type: recall_at_1000 value: 30.572 - type: recall_at_3 value: 0.5309999999999999 - type: recall_at_5 value: 0.814 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 1.385 - type: map_at_10 value: 7.185999999999999 - type: map_at_100 value: 11.642 - type: map_at_1000 value: 12.953000000000001 - type: map_at_3 value: 3.496 - type: map_at_5 value: 4.82 - type: mrr_at_1 value: 16.326999999999998 - type: mrr_at_10 value: 29.461 - type: mrr_at_100 value: 31.436999999999998 - type: mrr_at_1000 value: 31.436999999999998 - type: mrr_at_3 value: 24.490000000000002 - type: mrr_at_5 value: 27.857 - type: ndcg_at_1 value: 14.285999999999998 - type: ndcg_at_10 value: 16.672 - type: ndcg_at_100 value: 28.691 - type: ndcg_at_1000 value: 39.817 - type: ndcg_at_3 value: 15.277 - type: ndcg_at_5 value: 15.823 - type: precision_at_1 value: 16.326999999999998 - type: precision_at_10 value: 15.509999999999998 - type: precision_at_100 value: 6.49 - type: precision_at_1000 value: 1.4080000000000001 - type: precision_at_3 value: 16.326999999999998 - type: precision_at_5 value: 16.735 - type: recall_at_1 value: 1.385 - type: recall_at_10 value: 12.586 - type: recall_at_100 value: 
40.765 - type: recall_at_1000 value: 75.198 - type: recall_at_3 value: 4.326 - type: recall_at_5 value: 7.074999999999999 - task: type: Classification dataset: type: mteb/toxic_conversations_50k name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 59.4402 - type: ap value: 10.16922814263879 - type: f1 value: 45.374485104940476 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 54.25863044708545 - type: f1 value: 54.20154252609619 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 34.3883169293051 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 81.76670441676104 - type: cos_sim_ap value: 59.29878710961347 - type: cos_sim_f1 value: 57.33284971587474 - type: cos_sim_precision value: 52.9122963624191 - type: cos_sim_recall value: 62.559366754617415 - type: dot_accuracy value: 77.52279907015557 - type: dot_ap value: 34.17588904643467 - type: dot_f1 value: 41.063567529494634 - type: dot_precision value: 30.813953488372093 - type: dot_recall value: 61.53034300791557 - type: euclidean_accuracy value: 80.61631996185254 - type: euclidean_ap value: 54.00362361479352 - type: euclidean_f1 value: 53.99111751290361 - type: euclidean_precision value: 49.52653600528518 - type: euclidean_recall value: 59.340369393139845 - type: manhattan_accuracy value: 80.65208320915539 - type: manhattan_ap value: 54.18329507159467 - type: manhattan_f1 
value: 53.85550960836779 - type: manhattan_precision value: 49.954873646209386 - type: manhattan_recall value: 58.41688654353562 - type: max_accuracy value: 81.76670441676104 - type: max_ap value: 59.29878710961347 - type: max_f1 value: 57.33284971587474 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 87.99433383785463 - type: cos_sim_ap value: 83.43513915159009 - type: cos_sim_f1 value: 76.3906784964842 - type: cos_sim_precision value: 73.19223985890653 - type: cos_sim_recall value: 79.88142901139513 - type: dot_accuracy value: 81.96142352621571 - type: dot_ap value: 67.78764755689359 - type: dot_f1 value: 64.42823356983445 - type: dot_precision value: 56.77801913931779 - type: dot_recall value: 74.46104096088698 - type: euclidean_accuracy value: 81.9478402607987 - type: euclidean_ap value: 67.13958457373279 - type: euclidean_f1 value: 60.45118343195266 - type: euclidean_precision value: 58.1625391403359 - type: euclidean_recall value: 62.92731752386819 - type: manhattan_accuracy value: 82.01769705437188 - type: manhattan_ap value: 67.24709477497046 - type: manhattan_f1 value: 60.4103846436714 - type: manhattan_precision value: 57.82063916654935 - type: manhattan_recall value: 63.24299353249153 - type: max_accuracy value: 87.99433383785463 - type: max_ap value: 83.43513915159009 - type: max_f1 value: 76.3906784964842 ---

Finetuner helps you create experiments to improve embeddings on search tasks. It accompanies you through the last mile of performance tuning for neural search applications.

The text embedding suite trained by the Finetuner team at Jina AI.

## Intended Usage & Model Info

`jina-embedding-s-en-v1` is a language model trained on Jina AI's Linnaeus-Clean dataset. This dataset consists of 380 million sentence pairs, including query-document pairs. The pairs were collected from various domains and selected through a thorough cleaning process. The Linnaeus-Full dataset, from which Linnaeus-Clean is derived, originally contained 1.6 billion sentence pairs.

The model has a range of use cases, including information retrieval, semantic textual similarity, and text reranking.

With a compact size of just 35 million parameters, the model enables fast inference while remaining competitive with larger models on the benchmarks listed under Metrics. Additionally, we provide the following options:

- `jina-embedding-s-en-v1`: 35 million parameters **(you are here)**.
- `jina-embedding-b-en-v1`: 110 million parameters.
- `jina-embedding-l-en-v1`: 330 million parameters.
- `jina-embedding-1b-en-v1`: 1.2 billion parameters, 10x bert-base size (soon).
- `jina-embedding-6b-en-v1`: 6 billion parameters, 30x bert-base size (soon).

## Data & Parameters

More information will be released together with the technical report.

## Metrics

We compared the model against `all-minilm-l6-v2` and `all-mpnet-base-v2` from sbert, and `text-embedding-ada-002` from OpenAI:

|Name|Parameters|Context length|
|------------------------------|-----|------|
|all-minilm-l6-v2|33M|128|
|all-mpnet-base-v2|110M|128|
|text-embedding-ada-002|Unknown (OpenAI API)|8192|
|jina-embedding-s-en-v1|35M|512|
|jina-embedding-b-en-v1|110M|512|
|jina-embedding-l-en-v1|330M|512|

|Name|STS12|STS13|STS14|STS15|STS16|STS17|TRECCOVID|Quora|SciFact|
|------------------------------|-----|-----|-----|-----|-----|-----|--------|-----|-----|
|all-minilm-l6-v2|0.724|0.806|0.756|0.854|0.79|0.876|0.473|0.876|0.645|
|all-mpnet-base-v2|0.726|0.835|**0.78**|0.857|0.8|**0.906**|0.513|0.875|0.656|
|text-embedding-ada-002|0.698|0.833|0.761|0.861|**0.86**|0.903|**0.685**|0.876|**0.726**|
|jina-embedding-s-en-v1|0.742|0.786|0.738|0.837|0.80|0.875|0.543|0.857|0.608|
|jina-embedding-b-en-v1|**0.751**|0.809|0.761|0.856|0.812|0.89|0.601|0.876|0.645|
|jina-embedding-l-en-v1|0.739|**0.844**|0.778|**0.863**|0.829|0.896|0.526|**0.882**|0.652|

*Update: we have updated the checkpoints for the small and base models. Re-evaluation of the large model and the BEIR evaluation are in progress.*

## Usage

Use with Jina AI Finetuner:

```python
# Install the package first: pip install finetuner
import finetuner

# Load the model and embed two sentences.
model = finetuner.build_model('jinaai/jina-embedding-s-en-v1')
embeddings = finetuner.encode(
    model=model,
    data=['how is the weather today', 'What is the current weather like today?']
)

# Cosine similarity between the two embeddings.
print(finetuner.cos_sim(embeddings[0], embeddings[1]))
```

Use directly with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

sentences = ['how is the weather today', 'What is the current weather like today?']

# Load the model and embed both sentences in one call.
model = SentenceTransformer('jinaai/jina-embedding-s-en-v1')
embeddings = model.encode(sentences)

# Cosine similarity between the two embeddings.
print(cos_sim(embeddings[0], embeddings[1]))
```

## Fine-tuning

To fine-tune the model on your own data, please consider [Finetuner](https://github.com/jina-ai/finetuner).

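## Example: Ranking Documents by Cosine Similarity

The snippets in the Usage section score a single sentence pair. For the retrieval and reranking use cases, the same cosine score can be used to order several documents against one query. The following is a minimal sketch in pure Python, not part of Finetuner or sentence-transformers: the helper names `cosine_similarity` and `rank_documents` are hypothetical, and the 3-dimensional vectors are made-up stand-ins for real embeddings returned by `model.encode`, which have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (not real model output).
query = [1.0, 0.0, 1.0]
docs = [
    [2.0, 0.0, 2.0],  # same direction as the query, larger norm
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 1.0, 0.0],  # partially aligned with the query
]

ranking = rank_documents(query, docs)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

Document 0 ranks first even though its vector is twice as long as the query, because cosine similarity depends only on direction and not on vector length. In practice you would replace the toy vectors with the output of `model.encode` for the query and the candidate documents.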
## Plans

1. Development of `jina-embedding-s-en-v2` is underway with two main objectives: improving performance and increasing the maximum sequence length.
2. We are working on bilingual embedding models that combine English with a second language X. The upcoming models will be called `jina-embedding-s/b/l-de-v1`.

## Contact

Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.