---
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- mteb
language:
- es
- en
inference: false
license: apache-2.0
model-index:
- name: jina-embeddings-v2-base-es
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
    metrics:
    - type: accuracy
      value: 74.25373134328358
    - type: ap
      value: 37.05201236793268
    - type: f1
      value: 68.16770391201077
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_polarity
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 78.30885
    - type: ap
      value: 73.01622441156408
    - type: f1
      value: 78.20769284466313
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (en)
      config: en
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 38.324
    - type: f1
      value: 37.89543008761673
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (es)
      config: es
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 38.678000000000004
    - type: f1
      value: 38.122639506976
  - task:
      type: Retrieval
    dataset:
      type: arguana
      name: MTEB ArguAna
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.968999999999998
    - type: map_at_10
      value: 40.691
    - type: map_at_100
      value: 41.713
    - type: map_at_1000
      value: 41.719
    - type: map_at_3
      value: 35.42
    - type: map_at_5
      value: 38.442
    - type: mrr_at_1
      value: 24.395
    - type: mrr_at_10
      value: 40.853
    - type: mrr_at_100
      value: 41.869
    - type: mrr_at_1000
      value: 41.874
    - type: mrr_at_3
      value: 35.68
    - type: mrr_at_5
      value: 38.572
    - type: ndcg_at_1
      value: 23.968999999999998
    - type: ndcg_at_10
      value: 50.129999999999995
    - type: ndcg_at_100
      value: 54.364000000000004
    - type: ndcg_at_1000
      value: 54.494
    - type: ndcg_at_3
      value: 39.231
    - type: ndcg_at_5
      value: 44.694
    - type: precision_at_1
      value: 23.968999999999998
    - type: precision_at_10
      value: 8.036999999999999
    - type: precision_at_100
      value: 0.9860000000000001
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_3
      value: 16.761
    - type: precision_at_5
      value: 12.717
    - type: recall_at_1
      value: 23.968999999999998
    - type: recall_at_10
      value: 80.36999999999999
    - type: recall_at_100
      value: 98.578
    - type: recall_at_1000
      value: 99.57300000000001
    - type: recall_at_3
      value: 50.28399999999999
    - type: recall_at_5
      value: 63.585
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-p2p
      name: MTEB ArxivClusteringP2P
      config: default
      split: test
      revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
    metrics:
    - type: v_measure
      value: 41.54886683150053
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-s2s
      name: MTEB ArxivClusteringS2S
      config: default
      split: test
      revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
    metrics:
    - type: v_measure
      value: 32.186028697637234
  - task:
      type: Reranking
    dataset:
      type: mteb/askubuntudupquestions-reranking
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 61.19432643698725
    - type: mrr
      value: 75.28646176845622
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 86.3828259381228
    - type: cos_sim_spearman
      value: 83.04647058342209
    - type: euclidean_pearson
      value: 84.02895346096244
    - type: euclidean_spearman
      value: 82.34524978635342
    - type: manhattan_pearson
      value: 84.35030723233426
    - type: manhattan_spearman
      value: 83.17177464337936
  - task:
      type: Classification
    dataset:
      type: mteb/banking77
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 85.25649350649351
    - type: f1
      value: 85.22320474023192
  - task:
      type: Clustering
    dataset:
      type: jinaai/big-patent-clustering
      name: MTEB BigPatentClustering
      config: default
      split: test
      revision: 62d5330920bca426ce9d3c76ea914f15fc83e891
    metrics:
    - type: v_measure
      value: 20.42929408254094
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-p2p
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 35.165318177498136
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-s2s
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 28.89030154229562
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 30.119
    - type: map_at_10
      value: 42.092
    - type: map_at_100
      value: 43.506
    - type: map_at_1000
      value: 43.631
    - type: map_at_3
      value: 38.373000000000005
    - type: map_at_5
      value: 40.501
    - type: mrr_at_1
      value: 38.196999999999996
    - type: mrr_at_10
      value: 48.237
    - type: mrr_at_100
      value: 48.914
    - type: mrr_at_1000
      value: 48.959
    - type: mrr_at_3
      value: 45.279
    - type: mrr_at_5
      value: 47.11
    - type: ndcg_at_1
      value: 38.196999999999996
    - type: ndcg_at_10
      value: 48.849
    - type: ndcg_at_100
      value: 53.713
    - type: ndcg_at_1000
      value: 55.678000000000004
    - type: ndcg_at_3
      value: 43.546
    - type: ndcg_at_5
      value: 46.009
    - type: precision_at_1
      value: 38.196999999999996
    - type: precision_at_10
      value: 9.642000000000001
    - type: precision_at_100
      value: 1.5190000000000001
    - type: precision_at_1000
      value: 0.199
    - type: precision_at_3
      value: 21.65
    - type: precision_at_5
      value: 15.708
    - type: recall_at_1
      value: 30.119
    - type: recall_at_10
      value: 61.788
    - type: recall_at_100
      value: 82.14399999999999
    - type: recall_at_1000
      value: 95.003
    - type: recall_at_3
      value: 45.772
    - type: recall_at_5
      value: 53.04600000000001
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 28.979
    - type: map_at_10
      value: 37.785000000000004
    - type: map_at_100
      value: 38.945
    - type: map_at_1000
      value: 39.071
    - type: map_at_3
      value: 35.083999999999996
    - type: map_at_5
      value: 36.571999999999996
    - type: mrr_at_1
      value: 36.242000000000004
    - type: mrr_at_10
      value: 43.552
    - type: mrr_at_100
      value: 44.228
    - type: mrr_at_1000
      value: 44.275999999999996
    - type: mrr_at_3
      value: 41.359
    - type: mrr_at_5
      value: 42.598
    - type: ndcg_at_1
      value: 36.242000000000004
    - type: ndcg_at_10
      value: 42.94
    - type: ndcg_at_100
      value: 47.343
    - type: ndcg_at_1000
      value: 49.538
    - type: ndcg_at_3
      value: 39.086999999999996
    - type: ndcg_at_5
      value: 40.781
    - type: precision_at_1
      value: 36.242000000000004
    - type: precision_at_10
      value: 7.954999999999999
    - type: precision_at_100
      value: 1.303
    - type: precision_at_1000
      value: 0.178
    - type: precision_at_3
      value: 18.556
    - type: precision_at_5
      value: 13.145999999999999
    - type: recall_at_1
      value: 28.979
    - type: recall_at_10
      value: 51.835
    - type: recall_at_100
      value: 70.47
    - type: recall_at_1000
      value: 84.68299999999999
    - type: recall_at_3
      value: 40.410000000000004
    - type: recall_at_5
      value: 45.189
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 37.878
    - type: map_at_10
      value: 49.903
    - type: map_at_100
      value: 50.797000000000004
    - type: map_at_1000
      value: 50.858000000000004
    - type: map_at_3
      value: 46.526
    - type: map_at_5
      value: 48.615
    - type: mrr_at_1
      value: 43.135
    - type: mrr_at_10
      value: 53.067
    - type: mrr_at_100
      value: 53.668000000000006
    - type: mrr_at_1000
      value: 53.698
    - type: mrr_at_3
      value: 50.449
    - type: mrr_at_5
      value: 52.117000000000004
    - type: ndcg_at_1
      value: 43.135
    - type: ndcg_at_10
      value: 55.641
    - type: ndcg_at_100
      value: 59.427
    - type: ndcg_at_1000
      value: 60.655
    - type: ndcg_at_3
      value: 49.969
    - type: ndcg_at_5
      value: 53.075
    - type: precision_at_1
      value: 43.135
    - type: precision_at_10
      value: 8.997
    - type: precision_at_100
      value: 1.1809999999999998
    - type: precision_at_1000
      value: 0.133
    - type: precision_at_3
      value: 22.215
    - type: precision_at_5
      value: 15.586
    - type: recall_at_1
      value: 37.878
    - type: recall_at_10
      value: 69.405
    - type: recall_at_100
      value: 86.262
    - type: recall_at_1000
      value: 95.012
    - type: recall_at_3
      value: 54.458
    - type: recall_at_5
      value: 61.965
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 24.853
    - type: map_at_10
      value: 32.402
    - type: map_at_100
      value: 33.417
    - type: map_at_1000
      value: 33.498
    - type: map_at_3
      value: 30.024
    - type: map_at_5
      value: 31.407
    - type: mrr_at_1
      value: 26.667
    - type: mrr_at_10
      value: 34.399
    - type: mrr_at_100
      value: 35.284
    - type: mrr_at_1000
      value: 35.345
    - type: mrr_at_3
      value: 32.109
    - type: mrr_at_5
      value: 33.375
    - type: ndcg_at_1
      value: 26.667
    - type: ndcg_at_10
      value: 36.854
    - type: ndcg_at_100
      value: 42.196
    - type: ndcg_at_1000
      value: 44.303
    - type: ndcg_at_3
      value: 32.186
    - type: ndcg_at_5
      value: 34.512
    - type: precision_at_1
      value: 26.667
    - type: precision_at_10
      value: 5.559
    - type: precision_at_100
      value: 0.88
    - type: precision_at_1000
      value: 0.109
    - type: precision_at_3
      value: 13.333
    - type: precision_at_5
      value: 9.379
    - type: recall_at_1
      value: 24.853
    - type: recall_at_10
      value: 48.636
    - type: recall_at_100
      value: 73.926
    - type: recall_at_1000
      value: 89.94
    - type: recall_at_3
      value: 36.266
    - type: recall_at_5
      value: 41.723
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 14.963999999999999
    - type: map_at_10
      value: 22.591
    - type: map_at_100
      value: 23.735999999999997
    - type: map_at_1000
      value: 23.868000000000002
    - type: map_at_3
      value: 20.093
    - type: map_at_5
      value: 21.499
    - type: mrr_at_1
      value: 18.407999999999998
    - type: mrr_at_10
      value: 26.863
    - type: mrr_at_100
      value: 27.87
    - type: mrr_at_1000
      value: 27.947
    - type: mrr_at_3
      value: 24.254
    - type: mrr_at_5
      value: 25.784000000000002
    - type: ndcg_at_1
      value: 18.407999999999998
    - type: ndcg_at_10
      value: 27.549
    - type: ndcg_at_100
      value: 33.188
    - type: ndcg_at_1000
      value: 36.312
    - type: ndcg_at_3
      value: 22.862
    - type: ndcg_at_5
      value: 25.130999999999997
    - type: precision_at_1
      value: 18.407999999999998
    - type: precision_at_10
      value: 5.087
    - type: precision_at_100
      value: 0.923
    - type: precision_at_1000
      value: 0.133
    - type: precision_at_3
      value: 10.987
    - type: precision_at_5
      value: 8.209
    - type: recall_at_1
      value: 14.963999999999999
    - type: recall_at_10
      value: 38.673
    - type: recall_at_100
      value: 63.224999999999994
    - type: recall_at_1000
      value: 85.443
    - type: recall_at_3
      value: 25.840000000000003
    - type: recall_at_5
      value: 31.503999999999998
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.861000000000004
    - type: map_at_10
      value: 37.562
    - type: map_at_100
      value: 38.906
    - type: map_at_1000
      value: 39.021
    - type: map_at_3
      value: 34.743
    - type: map_at_5
      value: 36.168
    - type: mrr_at_1
      value: 34.455999999999996
    - type: mrr_at_10
      value: 43.428
    - type: mrr_at_100
      value: 44.228
    - type: mrr_at_1000
      value: 44.278
    - type: mrr_at_3
      value: 41.001
    - type: mrr_at_5
      value: 42.315000000000005
    - type: ndcg_at_1
      value: 34.455999999999996
    - type: ndcg_at_10
      value: 43.477
    - type: ndcg_at_100
      value: 48.953
    - type: ndcg_at_1000
      value: 51.19200000000001
    - type: ndcg_at_3
      value: 38.799
    - type: ndcg_at_5
      value: 40.743
    - type: precision_at_1
      value: 34.455999999999996
    - type: precision_at_10
      value: 7.902000000000001
    - type: precision_at_100
      value: 1.244
    - type: precision_at_1000
      value: 0.161
    - type: precision_at_3
      value: 18.511
    - type: precision_at_5
      value: 12.859000000000002
    - type: recall_at_1
      value: 27.861000000000004
    - type: recall_at_10
      value: 55.36
    - type: recall_at_100
      value: 78.384
    - type: recall_at_1000
      value: 93.447
    - type: recall_at_3
      value: 41.926
    - type: recall_at_5
      value: 47.257
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.375
    - type: map_at_10
      value: 35.571000000000005
    - type: map_at_100
      value: 36.785000000000004
    - type: map_at_1000
      value: 36.905
    - type: map_at_3
      value: 32.49
    - type: map_at_5
      value: 34.123999999999995
    - type: mrr_at_1
      value: 32.647999999999996
    - type: mrr_at_10
      value: 40.598
    - type: mrr_at_100
      value: 41.484
    - type: mrr_at_1000
      value: 41.546
    - type: mrr_at_3
      value: 37.9
    - type: mrr_at_5
      value: 39.401
    - type: ndcg_at_1
      value: 32.647999999999996
    - type: ndcg_at_10
      value: 41.026
    - type: ndcg_at_100
      value: 46.365
    - type: ndcg_at_1000
      value: 48.876
    - type: ndcg_at_3
      value: 35.843
    - type: ndcg_at_5
      value: 38.118
    - type: precision_at_1
      value: 32.647999999999996
    - type: precision_at_10
      value: 7.443
    - type: precision_at_100
      value: 1.18
    - type: precision_at_1000
      value: 0.158
    - type: precision_at_3
      value: 16.819
    - type: precision_at_5
      value: 11.985999999999999
    - type: recall_at_1
      value: 26.375
    - type: recall_at_10
      value: 52.471000000000004
    - type: recall_at_100
      value: 75.354
    - type: recall_at_1000
      value: 92.35
    - type: recall_at_3
      value: 37.893
    - type: recall_at_5
      value: 43.935
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.012666666666668
    - type: map_at_10
      value: 33.685833333333335
    - type: map_at_100
      value: 34.849250000000005
    - type: map_at_1000
      value: 34.970083333333335
    - type: map_at_3
      value: 31.065083333333334
    - type: map_at_5
      value: 32.494416666666666
    - type: mrr_at_1
      value: 29.772666666666662
    - type: mrr_at_10
      value: 37.824666666666666
    - type: mrr_at_100
      value: 38.66741666666666
    - type: mrr_at_1000
      value: 38.72916666666666
    - type: mrr_at_3
      value: 35.54575
    - type: mrr_at_5
      value: 36.81524999999999
    - type: ndcg_at_1
      value: 29.772666666666662
    - type: ndcg_at_10
      value: 38.78241666666666
    - type: ndcg_at_100
      value: 43.84591666666667
    - type: ndcg_at_1000
      value: 46.275416666666665
    - type: ndcg_at_3
      value: 34.33416666666667
    - type: ndcg_at_5
      value: 36.345166666666664
    - type: precision_at_1
      value: 29.772666666666662
    - type: precision_at_10
      value: 6.794916666666667
    - type: precision_at_100
      value: 1.106416666666667
    - type: precision_at_1000
      value: 0.15033333333333335
    - type: precision_at_3
      value: 15.815083333333336
    - type: precision_at_5
      value: 11.184166666666664
    - type: recall_at_1
      value: 25.012666666666668
    - type: recall_at_10
      value: 49.748500000000014
    - type: recall_at_100
      value: 72.11341666666667
    - type: recall_at_1000
      value: 89.141
    - type: recall_at_3
      value: 37.242999999999995
    - type: recall_at_5
      value: 42.49033333333333
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.177
    - type: map_at_10
      value: 29.310000000000002
    - type: map_at_100
      value: 30.188
    - type: map_at_1000
      value: 30.29
    - type: map_at_3
      value: 27.356
    - type: map_at_5
      value: 28.410999999999998
    - type: mrr_at_1
      value: 26.074
    - type: mrr_at_10
      value: 32.002
    - type: mrr_at_100
      value: 32.838
    - type: mrr_at_1000
      value: 32.909
    - type: mrr_at_3
      value: 30.317
    - type: mrr_at_5
      value: 31.222
    - type: ndcg_at_1
      value: 26.074
    - type: ndcg_at_10
      value: 32.975
    - type: ndcg_at_100
      value: 37.621
    - type: ndcg_at_1000
      value: 40.253
    - type: ndcg_at_3
      value: 29.452
    - type: ndcg_at_5
      value: 31.020999999999997
    - type: precision_at_1
      value: 26.074
    - type: precision_at_10
      value: 5.077
    - type: precision_at_100
      value: 0.8049999999999999
    - type: precision_at_1000
      value: 0.11100000000000002
    - type: precision_at_3
      value: 12.526000000000002
    - type: precision_at_5
      value: 8.588999999999999
    - type: recall_at_1
      value: 23.177
    - type: recall_at_10
      value: 41.613
    - type: recall_at_100
      value: 63.287000000000006
    - type: recall_at_1000
      value: 83.013
    - type: recall_at_3
      value: 31.783
    - type: recall_at_5
      value: 35.769
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 15.856
    - type: map_at_10
      value: 22.651
    - type: map_at_100
      value: 23.649
    - type: map_at_1000
      value: 23.783
    - type: map_at_3
      value: 20.591
    - type: map_at_5
      value: 21.684
    - type: mrr_at_1
      value: 19.408
    - type: mrr_at_10
      value: 26.51
    - type: mrr_at_100
      value: 27.356
    - type: mrr_at_1000
      value: 27.439999999999998
    - type: mrr_at_3
      value: 24.547
    - type: mrr_at_5
      value: 25.562
    - type: ndcg_at_1
      value: 19.408
    - type: ndcg_at_10
      value: 27.072000000000003
    - type: ndcg_at_100
      value: 31.980999999999998
    - type: ndcg_at_1000
      value: 35.167
    - type: ndcg_at_3
      value: 23.338
    - type: ndcg_at_5
      value: 24.94
    - type: precision_at_1
      value: 19.408
    - type: precision_at_10
      value: 4.9590000000000005
    - type: precision_at_100
      value: 0.8710000000000001
    - type: precision_at_1000
      value: 0.132
    - type: precision_at_3
      value: 11.138
    - type: precision_at_5
      value: 7.949000000000001
    - type: recall_at_1
      value: 15.856
    - type: recall_at_10
      value: 36.578
    - type: recall_at_100
      value: 58.89
    - type: recall_at_1000
      value: 81.743
    - type: recall_at_3
      value: 25.94
    - type: recall_at_5
      value: 30.153999999999996
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.892
    - type: map_at_10
      value: 33.899
    - type: map_at_100
      value: 34.955000000000005
    - type: map_at_1000
      value: 35.066
    - type: map_at_3
      value: 31.41
    - type: map_at_5
      value: 32.669
    - type: mrr_at_1
      value: 30.224
    - type: mrr_at_10
      value: 37.936
    - type: mrr_at_100
      value: 38.777
    - type: mrr_at_1000
      value: 38.85
    - type: mrr_at_3
      value: 35.821
    - type: mrr_at_5
      value: 36.894
    - type: ndcg_at_1
      value: 30.224
    - type: ndcg_at_10
      value: 38.766
    - type: ndcg_at_100
      value: 43.806
    - type: ndcg_at_1000
      value: 46.373999999999995
    - type: ndcg_at_3
      value: 34.325
    - type: ndcg_at_5
      value: 36.096000000000004
    - type: precision_at_1
      value: 30.224
    - type: precision_at_10
      value: 6.446000000000001
    - type: precision_at_100
      value: 1.0
    - type: precision_at_1000
      value: 0.133
    - type: precision_at_3
      value: 15.392
    - type: precision_at_5
      value: 10.671999999999999
    - type: recall_at_1
      value: 25.892
    - type: recall_at_10
      value: 49.573
    - type: recall_at_100
      value: 71.885
    - type: recall_at_1000
      value: 89.912
    - type: recall_at_3
      value: 37.226
    - type: recall_at_5
      value: 41.74
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.915
    - type: map_at_10
      value: 33.613
    - type: map_at_100
      value: 35.333999999999996
    - type: map_at_1000
      value: 35.563
    - type: map_at_3
      value: 31.203999999999997
    - type: map_at_5
      value: 32.479
    - type: mrr_at_1
      value: 29.447000000000003
    - type: mrr_at_10
      value: 38.440000000000005
    - type: mrr_at_100
      value: 39.459
    - type: mrr_at_1000
      value: 39.513999999999996
    - type: mrr_at_3
      value: 36.495
    - type: mrr_at_5
      value: 37.592
    - type: ndcg_at_1
      value: 29.447000000000003
    - type: ndcg_at_10
      value: 39.341
    - type: ndcg_at_100
      value: 45.382
    - type: ndcg_at_1000
      value: 47.921
    - type: ndcg_at_3
      value: 35.671
    - type: ndcg_at_5
      value: 37.299
    - type: precision_at_1
      value: 29.447000000000003
    - type: precision_at_10
      value: 7.648000000000001
    - type: precision_at_100
      value: 1.567
    - type: precision_at_1000
      value: 0.241
    - type: precision_at_3
      value: 17.194000000000003
    - type: precision_at_5
      value: 12.253
    - type: recall_at_1
      value: 23.915
    - type: recall_at_10
      value: 49.491
    - type: recall_at_100
      value: 76.483
    - type: recall_at_1000
      value: 92.674
    - type: recall_at_3
      value: 38.878
    - type: recall_at_5
      value: 43.492
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 20.283
    - type: map_at_10
      value: 26.851000000000003
    - type: map_at_100
      value: 27.973
    - type: map_at_1000
      value: 28.087
    - type: map_at_3
      value: 24.887
    - type: map_at_5
      value: 25.804
    - type: mrr_at_1
      value: 22.366
    - type: mrr_at_10
      value: 28.864
    - type: mrr_at_100
      value: 29.903000000000002
    - type: mrr_at_1000
      value: 29.988
    - type: mrr_at_3
      value: 27.017999999999997
    - type: mrr_at_5
      value: 27.813
    - type: ndcg_at_1
      value: 22.366
    - type: ndcg_at_10
      value: 30.898999999999997
    - type: ndcg_at_100
      value: 36.176
    - type: ndcg_at_1000
      value: 39.036
    - type: ndcg_at_3
      value: 26.932000000000002
    - type: ndcg_at_5
      value: 28.416999999999998
    - type: precision_at_1
      value: 22.366
    - type: precision_at_10
      value: 4.824
    - type: precision_at_100
      value: 0.804
    - type: precision_at_1000
      value: 0.116
    - type: precision_at_3
      value: 11.459999999999999
    - type: precision_at_5
      value: 7.8740000000000006
    - type: recall_at_1
      value: 20.283
    - type: recall_at_10
      value: 41.559000000000005
    - type: recall_at_100
      value: 65.051
    - type: recall_at_1000
      value: 86.47200000000001
    - type: recall_at_3
      value: 30.524
    - type: recall_at_5
      value: 34.11
  - task:
      type: Retrieval
    dataset:
      type: climate-fever
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 11.326
    - type: map_at_10
      value: 19.357
    - type: map_at_100
      value: 21.014
    - type: map_at_1000
      value: 21.188000000000002
    - type: map_at_3
      value: 16.305
    - type: map_at_5
      value: 17.886
    - type: mrr_at_1
      value: 24.820999999999998
    - type: mrr_at_10
      value: 36.150999999999996
    - type: mrr_at_100
      value: 37.080999999999996
    - type: mrr_at_1000
      value: 37.123
    - type: mrr_at_3
      value: 32.952999999999996
    - type: mrr_at_5
      value: 34.917
    - type: ndcg_at_1
      value: 24.820999999999998
    - type: ndcg_at_10
      value: 27.131
    - type: ndcg_at_100
      value: 33.841
    - type: ndcg_at_1000
      value: 37.159
    - type: ndcg_at_3
      value: 22.311
    - type: ndcg_at_5
      value: 24.026
    - type: precision_at_1
      value: 24.820999999999998
    - type: precision_at_10
      value: 8.450000000000001
    - type: precision_at_100
      value: 1.557
    - type: precision_at_1000
      value: 0.218
    - type: precision_at_3
      value: 16.612
    - type: precision_at_5
      value: 12.808
    - type: recall_at_1
      value: 11.326
    - type: recall_at_10
      value: 32.548
    - type: recall_at_100
      value: 55.803000000000004
    - type: recall_at_1000
      value: 74.636
    - type: recall_at_3
      value: 20.549
    - type: recall_at_5
      value: 25.514
  - task:
      type: Retrieval
    dataset:
      type: dbpedia-entity
      name: MTEB DBPedia
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 7.481
    - type: map_at_10
      value: 15.043999999999999
    - type: map_at_100
      value: 20.194000000000003
    - type: map_at_1000
      value: 21.423000000000002
    - type: map_at_3
      value: 11.238
    - type: map_at_5
      value: 12.828999999999999
    - type: mrr_at_1
      value: 54.50000000000001
    - type: mrr_at_10
      value: 64.713
    - type: mrr_at_100
      value: 65.216
    - type: mrr_at_1000
      value: 65.23
    - type: mrr_at_3
      value: 62.74999999999999
    - type: mrr_at_5
      value: 63.87500000000001
    - type: ndcg_at_1
      value: 43.375
    - type: ndcg_at_10
      value: 32.631
    - type: ndcg_at_100
      value: 36.338
    - type: ndcg_at_1000
      value: 43.541000000000004
    - type: ndcg_at_3
      value: 36.746
    - type: ndcg_at_5
      value: 34.419
    - type: precision_at_1
      value: 54.50000000000001
    - type: precision_at_10
      value: 24.825
    - type: precision_at_100
      value: 7.698
    - type: precision_at_1000
      value: 1.657
    - type: precision_at_3
      value: 38.917
    - type: precision_at_5
      value: 32.35
    - type: recall_at_1
      value: 7.481
    - type: recall_at_10
      value: 20.341
    - type: recall_at_100
      value: 41.778
    - type: recall_at_1000
      value: 64.82
    - type: recall_at_3
      value: 12.748000000000001
    - type: recall_at_5
      value: 15.507000000000001
  - task:
      type: Classification
    dataset:
      type: mteb/emotion
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 46.580000000000005
    - type: f1
      value: 41.5149462395095
  - task:
      type: Retrieval
    dataset:
      type: fever
      name: MTEB FEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 61.683
    - type: map_at_10
      value: 73.071
    - type: map_at_100
      value: 73.327
    - type: map_at_1000
      value: 73.341
    - type: map_at_3
      value: 71.446
    - type: map_at_5
      value: 72.557
    - type: mrr_at_1
      value: 66.44200000000001
    - type: mrr_at_10
      value: 77.725
    - type: mrr_at_100
      value: 77.89399999999999
    - type: mrr_at_1000
      value: 77.898
    - type: mrr_at_3
      value: 76.283
    - type: mrr_at_5
      value: 77.29700000000001
    - type: ndcg_at_1
      value: 66.44200000000001
    - type: ndcg_at_10
      value: 78.43
    - type: ndcg_at_100
      value: 79.462
    - type: ndcg_at_1000
      value: 79.754
    - type: ndcg_at_3
      value: 75.53800000000001
    - type: ndcg_at_5
      value: 77.332
    - type: precision_at_1
      value: 66.44200000000001
    - type: precision_at_10
      value: 9.878
    - type: precision_at_100
      value: 1.051
    - type: precision_at_1000
      value: 0.109
    - type: precision_at_3
      value: 29.878
    - type: precision_at_5
      value: 18.953
    - type: recall_at_1
      value: 61.683
    - type: recall_at_10
      value: 90.259
    - type: recall_at_100
      value: 94.633
    - type: recall_at_1000
      value: 96.60499999999999
    - type: recall_at_3
      value: 82.502
    - type: recall_at_5
      value: 86.978
  - task:
      type: Retrieval
    dataset:
      type: fiqa
      name: MTEB FiQA2018
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.724
    - type: map_at_10
      value: 29.487999999999996
    - type: map_at_100
      value: 31.243
    - type: map_at_1000
      value: 31.419999999999998
    - type: map_at_3
      value: 25.612000000000002
    - type: map_at_5
      value: 27.859
    - type: mrr_at_1
      value: 35.802
    - type: mrr_at_10
      value: 44.684000000000005
    - type: mrr_at_100
      value: 45.578
    - type: mrr_at_1000
      value: 45.621
    - type: mrr_at_3
      value: 42.361
    - type: mrr_at_5
      value: 43.85
    - type: ndcg_at_1
      value: 35.802
    - type: ndcg_at_10
      value: 37.009
    - type: ndcg_at_100
      value: 43.903
    - type: ndcg_at_1000
      value: 47.019
    - type: ndcg_at_3
      value: 33.634
    - type: ndcg_at_5
      value: 34.965
    - type: precision_at_1
      value: 35.802
    - type: precision_at_10
      value: 10.386
    - type: precision_at_100
      value: 1.7309999999999999
    - type: precision_at_1000
      value: 0.231
    - type: precision_at_3
      value: 22.84
    - type: precision_at_5
      value: 17.037
    - type: recall_at_1
      value: 17.724
    - type: recall_at_10
      value: 43.708000000000006
    - type: recall_at_100
      value: 69.902
    - type: recall_at_1000
      value: 88.51
    - type: recall_at_3
      value: 30.740000000000002
    - type: recall_at_5
      value: 36.742000000000004
  - task:
      type: Clustering
    dataset:
      type: jinaai/flores_clustering
      name: MTEB FloresClusteringS2S
      config: default
      split: test
      revision: 480b580487f53a46f881354a8348335d4edbb2de
    metrics:
    - type: v_measure
      value: 39.79120149869612
  - task:
      type: Retrieval
    dataset:
      type: hotpotqa
      name: MTEB HotpotQA
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 34.801
    - type: map_at_10
      value: 50.42100000000001
    - type: map_at_100
      value: 51.254
    - type: map_at_1000
      value: 51.327999999999996
    - type: map_at_3
      value: 47.56
    - type: map_at_5
      value: 49.379
    - type: mrr_at_1
      value: 69.602
    - type: mrr_at_10
      value: 76.385
    - type: mrr_at_100
      value: 76.668
    - type: mrr_at_1000
      value: 76.683
    - type: mrr_at_3
      value: 75.102
    - type: mrr_at_5
      value: 75.949
    - type: ndcg_at_1
      value: 69.602
    - type: ndcg_at_10
      value: 59.476
    - type: ndcg_at_100
      value: 62.527
    - type: ndcg_at_1000
      value: 64.043
    - type: ndcg_at_3
      value: 55.155
    - type: ndcg_at_5
      value: 57.623000000000005
    - type: precision_at_1
      value: 69.602
    - type: precision_at_10
      value: 12.292
    - type: precision_at_100
      value: 1.467
    - type: precision_at_1000
      value: 0.167
    - type: precision_at_3
      value: 34.634
    - type: precision_at_5
      value: 22.728
    - type: recall_at_1
      value: 34.801
    - type: recall_at_10
      value: 61.458
    - type: recall_at_100
      value: 73.363
    - type: recall_at_1000
      value: 83.43
    - type: recall_at_3
      value: 51.951
    - type: recall_at_5
      value: 56.82000000000001
  - task:
      type: Classification
    dataset:
      type: mteb/imdb
      name: MTEB ImdbClassification
      config: default
      split: test
      revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
    metrics:
    - type: accuracy
      value: 67.46079999999999
    - type: ap
      value: 61.81278199159353
    - type: f1
      value: 67.26505019954826
  - task:
      type: Reranking
    dataset:
      type: jinaai/miracl
      name: MTEB MIRACL
      config: default
      split: test
      revision: d28a029f35c4ff7f616df47b0edf54e6882395e6
    metrics:
    - type: map
      value: 73.90464144118539
    - type: mrr
      value: 82.44674693216022
  - task:
      type: Retrieval
    dataset:
      type: jinaai/miracl
      name: MTEB MIRACLRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 21.299
    - type: map_at_10
      value: 70.547
    - type: map_at_100
      value: 72.394
    - type: map_at_1000
      value: 72.39999999999999
    - type: map_at_3
      value: 41.317
    - type: map_at_5
      value: 53.756
    - type: mrr_at_1
      value: 72.84
    - type: mrr_at_10
      value: 82.466
    - type: mrr_at_100
      value: 82.52199999999999
    - type: mrr_at_1000
      value: 82.52199999999999
    - type: mrr_at_3
      value: 80.607
    - type: mrr_at_5
      value: 82.065
    - type: ndcg_at_1
      value: 72.994
    - type: ndcg_at_10
      value: 80.89
    - type: ndcg_at_100
      value: 83.30199999999999
    - type: ndcg_at_1000
      value: 83.337
    - type: ndcg_at_3
      value: 70.357
    - type: ndcg_at_5
      value: 72.529
    - type: precision_at_1
      value: 72.994
    - type: precision_at_10
      value: 43.056
    - type: precision_at_100
      value: 4.603
    - type: precision_at_1000
      value: 0.461
    - type: precision_at_3
      value: 61.626000000000005
    - type: precision_at_5
      value: 55.525000000000006
    - type: recall_at_1
      value: 21.299
    - type: recall_at_10
      value: 93.903
    - type: recall_at_100
      value: 99.86699999999999
    - type: recall_at_1000
      value: 100.0
    - type: recall_at_3
      value: 46.653
    - type: recall_at_5
      value: 65.72200000000001
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_domain
      name: MTEB MTOPDomainClassification (en)
      config: en
      split: test
      revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
    metrics:
    - type: accuracy
      value: 90.37163702690378
    - type: f1
      value: 90.18615216514222
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_domain
      name: MTEB MTOPDomainClassification (es)
      config: es
      split: test
      revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
    metrics:
    - type: accuracy
      value: 89.88992661774515
    - type: f1
      value: 89.3738963046966
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_intent
      name: MTEB MTOPIntentClassification (en)
      config: en
      split: test
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
    metrics:
    - type: accuracy
      value: 71.97218422252622
    - type: f1
      value: 54.03096570916335
  - task:
      type: Classification
    dataset:
      type: mteb/mtop_intent
      name: MTEB MTOPIntentClassification (es)
      config: es
      split: test
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
    metrics:
    - type: accuracy
      value: 68.75917278185457
    - type: f1
      value: 49.144083814705844
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_intent
      name: MTEB MassiveIntentClassification (en)
      config: en
      split: test
      revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
    metrics:
    - type: accuracy
      value: 70.75991930060525
    - type: f1
      value: 69.37993796176502
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_intent
      name: MTEB MassiveIntentClassification (es)
      config: es
      split: test
      revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
    metrics:
    - type: accuracy
      value: 66.93006052454606
    - type: f1
      value: 66.04029135274683
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_scenario
      name: MTEB MassiveScenarioClassification (en)
      config: en
      split: test
      revision: 7d571f92784cd94a019292a1f45445077d0ef634
    metrics:
    - type: accuracy
      value: 73.81977135171486
    - type: f1
      value: 74.10477122507747
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_massive_scenario
      name: MTEB MassiveScenarioClassification (es)
      config: es
      split: test
      revision: 7d571f92784cd94a019292a1f45445077d0ef634
    metrics:
    - type: accuracy
      value: 71.23402824478816
    - type: f1
      value: 71.75572665880296
  - task:
      type: Clustering
    dataset:
      type: mteb/medrxiv-clustering-p2p
      name: MTEB MedrxivClusteringP2P
      config: default
      split: test
      revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
    metrics:
    - type: v_measure
      value: 32.189750849969215
  - task:
      type: Clustering
    dataset:
      type: mteb/medrxiv-clustering-s2s
      name: MTEB MedrxivClusteringS2S
      config: default
      split: test
      revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
    metrics:
    - type: v_measure
      value: 28.78357393555938
  - task:
      type: Reranking
    dataset:
      type: mteb/mind_small
      name: MTEB MindSmallReranking
      config: default
      split: test
      revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
    metrics:
    - type: map
      value: 30.605612998328358
    - type: mrr
      value: 31.595529205695833
  - task:
      type: Retrieval
    dataset:
      type: jinaai/mintakaqa
      name: MTEB MintakaESRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 16.213
    - type: map_at_10
      value: 24.079
    - type: map_at_100
      value: 25.039
    - type: map_at_1000
      value: 25.142999999999997
    - type: map_at_3
      value: 21.823
    - type: map_at_5
      value: 23.069
    - type: mrr_at_1
      value: 16.213
    - type: mrr_at_10
      value: 24.079
    - type: mrr_at_100
      value: 25.039
    - type: mrr_at_1000
      value: 25.142999999999997
    - type: mrr_at_3
      value: 21.823
    - type: mrr_at_5
      value: 23.069
    - type: ndcg_at_1
      value: 16.213
    - type: ndcg_at_10
      value: 28.315
    - type: ndcg_at_100
      value: 33.475
    - type: ndcg_at_1000
      value: 36.838
    - type: ndcg_at_3
      value: 23.627000000000002
    - type: ndcg_at_5
      value: 25.879
    - type: precision_at_1
      value: 16.213
    - type: precision_at_10
      value: 4.183
    - type: precision_at_100
      value: 0.6709999999999999
    - type: precision_at_1000
      value: 0.095
    - type: precision_at_3
      value: 9.612
    - type: precision_at_5
      value: 6.865
    - type: recall_at_1
      value: 16.213
    - type: recall_at_10
      value: 41.832
    - type: recall_at_100
      value: 67.12
    - type: recall_at_1000
      value: 94.843
    - type: recall_at_3
      value: 28.837000000000003
    - type: recall_at_5
      value: 34.323
  - task:
      type: Retrieval
    dataset:
      type: nfcorpus
      name: MTEB NFCorpus
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 4.692
    - type: map_at_10
      value: 10.783
    - type: map_at_100
      value: 13.447999999999999
    - type: map_at_1000
      value: 14.756
    - type: map_at_3
      value: 7.646
    - type: map_at_5
      value: 9.311
    - type: mrr_at_1
      value: 42.415000000000006
    - type: mrr_at_10
      value: 50.471
    - type: mrr_at_100
      value: 51.251999999999995
    - type: mrr_at_1000
      value: 51.292
    - type: mrr_at_3
      value: 48.4
    - type: mrr_at_5
      value: 49.809
    - type: ndcg_at_1
      value: 40.867
    - type: ndcg_at_10
      value: 30.303
    - type: ndcg_at_100
      value: 27.915
    - type: ndcg_at_1000
      value: 36.734
    - type: ndcg_at_3
      value: 35.74
    - type: ndcg_at_5
      value: 33.938
    - type: precision_at_1
      value: 42.415000000000006
    - type: precision_at_10
      value: 22.105
    - type: precision_at_100
      value: 7.173
    - type: precision_at_1000
      value: 2.007
    - type: precision_at_3
      value: 33.437
    - type: precision_at_5
      value: 29.349999999999998
    - type: recall_at_1
      value: 4.692
    - type: recall_at_10
      value: 14.798
    - type: recall_at_100
      value: 28.948
    - type: recall_at_1000
      value: 59.939
    - type: recall_at_3
      value: 8.562
    - type: recall_at_5
      value: 11.818
  - task:
      type: Retrieval
    dataset:
      type: nq
      name: MTEB NQ
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.572999999999997
    - type: map_at_10
      value: 42.754
    - type: map_at_100
      value: 43.8
    - type: map_at_1000
      value: 43.838
    - type: map_at_3
      value: 38.157000000000004
    - type: map_at_5
      value: 40.9
    - type: mrr_at_1
      value: 31.373
    - type: mrr_at_10
      value: 45.321
    - type: mrr_at_100
      value: 46.109
    - type: mrr_at_1000
      value: 46.135
    - type: mrr_at_3
      value: 41.483
    - type: mrr_at_5
      value: 43.76
    - type: ndcg_at_1
      value: 31.373
    - type: ndcg_at_10
      value: 50.7
    - type: ndcg_at_100
      value: 55.103
    - type: ndcg_at_1000
      value: 55.955999999999996
    - type: ndcg_at_3
      value: 42.069
    - type: ndcg_at_5
      value: 46.595
    - type: precision_at_1
      value: 31.373
    - type: precision_at_10
      value: 8.601
    - type: precision_at_100
      value: 1.11
    - type: precision_at_1000
      value: 0.11900000000000001
    - type: precision_at_3
      value: 19.399
    - type: precision_at_5
      value: 14.224
    - type: recall_at_1
      value: 27.572999999999997
    - type: recall_at_10
      value: 72.465
    - type: recall_at_100
      value: 91.474
    - type: recall_at_1000
      value: 97.78099999999999
    - type: recall_at_3
      value: 50.087
    - type: recall_at_5
      value: 60.516000000000005
  - task:
      type: Retrieval
    dataset:
      type: quora
      name: MTEB QuoraRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 70.525
    - type: map_at_10
      value: 84.417
    - type: map_at_100
      value: 85.07000000000001
    - type: map_at_1000
      value: 85.085
    - type: map_at_3
      value: 81.45
    - type: map_at_5
      value: 83.317
    - type: mrr_at_1
      value: 81.17999999999999
    - type: mrr_at_10
      value: 87.34100000000001
    - type: mrr_at_100
      value: 87.461
    - type: mrr_at_1000
      value: 87.46199999999999
    - type: mrr_at_3
      value: 86.372
    - type: mrr_at_5
      value: 87.046
    - type: ndcg_at_1
      value: 81.17999999999999
    - type: ndcg_at_10
      value: 88.144
    - type: ndcg_at_100
      value: 89.424
    - type: ndcg_at_1000
      value: 89.517
    - type: ndcg_at_3
      value: 85.282
    - type: ndcg_at_5
      value: 86.874
    - type: precision_at_1
      value: 81.17999999999999
    - type: precision_at_10
      value: 13.385
    - type: precision_at_100
      value: 1.533
    - type: precision_at_1000
      value: 0.157
    - type: precision_at_3
      value: 37.29
    - type: precision_at_5
      value: 24.546
    - type: recall_at_1
      value: 70.525
    - type: recall_at_10
      value: 95.22500000000001
    - type: recall_at_100
      value: 99.572
    - type: recall_at_1000
      value: 99.98899999999999
    - type: recall_at_3
      value: 87.035
    - type: recall_at_5
      value: 91.526
  - task:
      type: Clustering
    dataset:
      type: mteb/reddit-clustering
      name: MTEB RedditClustering
      config: default
      split: test
      revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
    metrics:
    - type: v_measure
      value: 48.284384328108736
  - task:
      type: Clustering
    dataset:
      type: mteb/reddit-clustering-p2p
      name: MTEB RedditClusteringP2P
      config: default
      split: test
      revision: 282350215ef01743dc01b456c7f5241fa8937f16
    metrics:
    - type: v_measure
      value: 56.02508021518392
  - task:
      type: Retrieval
    dataset:
      type: scidocs
      name: MTEB SCIDOCS
      config:
default split: test revision: None metrics: - type: map_at_1 value: 4.023000000000001 - type: map_at_10 value: 10.046 - type: map_at_100 value: 11.802999999999999 - type: map_at_1000 value: 12.074 - type: map_at_3 value: 7.071 - type: map_at_5 value: 8.556 - type: mrr_at_1 value: 19.8 - type: mrr_at_10 value: 30.105999999999998 - type: mrr_at_100 value: 31.16 - type: mrr_at_1000 value: 31.224 - type: mrr_at_3 value: 26.633000000000003 - type: mrr_at_5 value: 28.768 - type: ndcg_at_1 value: 19.8 - type: ndcg_at_10 value: 17.358 - type: ndcg_at_100 value: 24.566 - type: ndcg_at_1000 value: 29.653000000000002 - type: ndcg_at_3 value: 16.052 - type: ndcg_at_5 value: 14.325 - type: precision_at_1 value: 19.8 - type: precision_at_10 value: 9.07 - type: precision_at_100 value: 1.955 - type: precision_at_1000 value: 0.318 - type: precision_at_3 value: 14.933 - type: precision_at_5 value: 12.68 - type: recall_at_1 value: 4.023000000000001 - type: recall_at_10 value: 18.398 - type: recall_at_100 value: 39.683 - type: recall_at_1000 value: 64.625 - type: recall_at_3 value: 9.113 - type: recall_at_5 value: 12.873000000000001 - task: type: STS dataset: type: mteb/sickr-sts name: MTEB SICK-R config: default split: test revision: a6ea5a8cab320b040a23452cc28066d9beae2cee metrics: - type: cos_sim_pearson value: 87.90508618312852 - type: cos_sim_spearman value: 83.01323463129205 - type: euclidean_pearson value: 84.35845059002891 - type: euclidean_spearman value: 82.85508559018527 - type: manhattan_pearson value: 84.3682368950498 - type: manhattan_spearman value: 82.8619728517302 - task: type: STS dataset: type: mteb/sts12-sts name: MTEB STS12 config: default split: test revision: a0d554a64d88156834ff5ae9920b964011b16384 metrics: - type: cos_sim_pearson value: 89.28294535873366 - type: cos_sim_spearman value: 81.61879268131732 - type: euclidean_pearson value: 85.99053604863724 - type: euclidean_spearman value: 80.95176684739084 - type: manhattan_pearson value: 85.98054086663903 - 
type: manhattan_spearman value: 80.9911070430335 - task: type: STS dataset: type: mteb/sts13-sts name: MTEB STS13 config: default split: test revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca metrics: - type: cos_sim_pearson value: 86.15898098455258 - type: cos_sim_spearman value: 86.8247985072307 - type: euclidean_pearson value: 86.25342429918649 - type: euclidean_spearman value: 87.13468603023252 - type: manhattan_pearson value: 86.2006134067688 - type: manhattan_spearman value: 87.06135811996896 - task: type: STS dataset: type: mteb/sts14-sts name: MTEB STS14 config: default split: test revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 metrics: - type: cos_sim_pearson value: 85.57403998481877 - type: cos_sim_spearman value: 83.55947075172618 - type: euclidean_pearson value: 84.97097562965358 - type: euclidean_spearman value: 83.6287075601467 - type: manhattan_pearson value: 84.87092197104133 - type: manhattan_spearman value: 83.53783891641335 - task: type: STS dataset: type: mteb/sts15-sts name: MTEB STS15 config: default split: test revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 metrics: - type: cos_sim_pearson value: 88.14632780204231 - type: cos_sim_spearman value: 88.74903634923868 - type: euclidean_pearson value: 88.03922995855112 - type: euclidean_spearman value: 88.72852190525855 - type: manhattan_pearson value: 87.9694791024271 - type: manhattan_spearman value: 88.66461452107418 - task: type: STS dataset: type: mteb/sts16-sts name: MTEB STS16 config: default split: test revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 metrics: - type: cos_sim_pearson value: 84.75989818558652 - type: cos_sim_spearman value: 86.03107893122942 - type: euclidean_pearson value: 85.21908960133018 - type: euclidean_spearman value: 85.93012720153482 - type: manhattan_pearson value: 85.1969170195502 - type: manhattan_spearman value: 85.8975254197784 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (en-en) config: en-en split: test revision: 
af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 89.16803898789955 - type: cos_sim_spearman value: 88.56139047950525 - type: euclidean_pearson value: 88.09685325747859 - type: euclidean_spearman value: 88.0457609458947 - type: manhattan_pearson value: 88.07054413001431 - type: manhattan_spearman value: 88.10784098889314 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (es-en) config: es-en split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 86.7160384474547 - type: cos_sim_spearman value: 86.4899235500562 - type: euclidean_pearson value: 85.90854477703468 - type: euclidean_spearman value: 86.16085009124498 - type: manhattan_pearson value: 85.9249735317884 - type: manhattan_spearman value: 86.25038421339116 - task: type: STS dataset: type: mteb/sts17-crosslingual-sts name: MTEB STS17 (es-es) config: es-es split: test revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d metrics: - type: cos_sim_pearson value: 89.37914622360788 - type: cos_sim_spearman value: 88.24619159322809 - type: euclidean_pearson value: 89.00538382632769 - type: euclidean_spearman value: 88.44675863524736 - type: manhattan_pearson value: 88.97372120683606 - type: manhattan_spearman value: 88.33509324222129 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (en) config: en split: test revision: eea2b4fe26a775864c896887d910b76a8098ad3f metrics: - type: cos_sim_pearson value: 66.22181360203069 - type: cos_sim_spearman value: 65.6218291833768 - type: euclidean_pearson value: 67.14543788822508 - type: euclidean_spearman value: 65.21269939987857 - type: manhattan_pearson value: 67.03304607195636 - type: manhattan_spearman value: 65.18885316423805 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (es) config: es split: test revision: eea2b4fe26a775864c896887d910b76a8098ad3f metrics: - type: cos_sim_pearson value: 65.71694059677084 - type: 
cos_sim_spearman value: 67.96591844540954 - type: euclidean_pearson value: 65.6964079162296 - type: euclidean_spearman value: 67.53027948900173 - type: manhattan_pearson value: 65.93545097673741 - type: manhattan_spearman value: 67.7261811805062 - task: type: STS dataset: type: mteb/sts22-crosslingual-sts name: MTEB STS22 (es-en) config: es-en split: test revision: eea2b4fe26a775864c896887d910b76a8098ad3f metrics: - type: cos_sim_pearson value: 75.43544796375058 - type: cos_sim_spearman value: 78.80462701160789 - type: euclidean_pearson value: 76.19135575163138 - type: euclidean_spearman value: 78.4974732597096 - type: manhattan_pearson value: 76.3254742699264 - type: manhattan_spearman value: 78.51884307690416 - task: type: STS dataset: type: mteb/stsbenchmark-sts name: MTEB STSBenchmark config: default split: test revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 metrics: - type: cos_sim_pearson value: 87.46805293607684 - type: cos_sim_spearman value: 87.83792784689113 - type: euclidean_pearson value: 87.3872143683234 - type: euclidean_spearman value: 87.61611384542778 - type: manhattan_pearson value: 87.38542672601992 - type: manhattan_spearman value: 87.61423971087297 - task: type: STS dataset: type: PlanTL-GOB-ES/sts-es name: MTEB STSES config: default split: test revision: 0912bb6c9393c76d62a7c5ee81c4c817ff47c9f4 metrics: - type: cos_sim_pearson value: 82.55286866116202 - type: cos_sim_spearman value: 80.22150503320272 - type: euclidean_pearson value: 83.27223445187087 - type: euclidean_spearman value: 80.59078590992925 - type: manhattan_pearson value: 83.23095887013197 - type: manhattan_spearman value: 80.87994285189795 - task: type: Reranking dataset: type: mteb/scidocs-reranking name: MTEB SciDocsRR config: default split: test revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab metrics: - type: map value: 79.29717302265792 - type: mrr value: 94.02156304117088 - task: type: Retrieval dataset: type: scifact name: MTEB SciFact config: default split: test 
revision: None metrics: - type: map_at_1 value: 49.9 - type: map_at_10 value: 58.626 - type: map_at_100 value: 59.519999999999996 - type: map_at_1000 value: 59.55200000000001 - type: map_at_3 value: 56.232000000000006 - type: map_at_5 value: 57.833 - type: mrr_at_1 value: 52.333 - type: mrr_at_10 value: 60.039 - type: mrr_at_100 value: 60.732 - type: mrr_at_1000 value: 60.75899999999999 - type: mrr_at_3 value: 58.278 - type: mrr_at_5 value: 59.428000000000004 - type: ndcg_at_1 value: 52.333 - type: ndcg_at_10 value: 62.67 - type: ndcg_at_100 value: 66.465 - type: ndcg_at_1000 value: 67.425 - type: ndcg_at_3 value: 58.711999999999996 - type: ndcg_at_5 value: 60.958999999999996 - type: precision_at_1 value: 52.333 - type: precision_at_10 value: 8.333 - type: precision_at_100 value: 1.027 - type: precision_at_1000 value: 0.11100000000000002 - type: precision_at_3 value: 22.778000000000002 - type: precision_at_5 value: 15.267 - type: recall_at_1 value: 49.9 - type: recall_at_10 value: 73.394 - type: recall_at_100 value: 90.43299999999999 - type: recall_at_1000 value: 98.167 - type: recall_at_3 value: 63.032999999999994 - type: recall_at_5 value: 68.444 - task: type: Clustering dataset: type: jinaai/spanish_news_clustering name: MTEB SpanishNewsClusteringP2P config: default split: test revision: b5edc3d3d7c12c7b9f883e9da50f6732f3624142 metrics: - type: v_measure value: 48.30543557796266 - task: type: Retrieval dataset: type: jinaai/spanish_passage_retrieval name: MTEB SpanishPassageRetrievalS2P config: default split: test revision: None metrics: - type: map_at_1 value: 14.443 - type: map_at_10 value: 28.736 - type: map_at_100 value: 34.514 - type: map_at_1000 value: 35.004000000000005 - type: map_at_3 value: 20.308 - type: map_at_5 value: 25.404 - type: mrr_at_1 value: 50.29900000000001 - type: mrr_at_10 value: 63.757 - type: mrr_at_100 value: 64.238 - type: mrr_at_1000 value: 64.24600000000001 - type: mrr_at_3 value: 59.480999999999995 - type: mrr_at_5 value: 62.924 - 
type: ndcg_at_1 value: 50.29900000000001 - type: ndcg_at_10 value: 42.126999999999995 - type: ndcg_at_100 value: 57.208000000000006 - type: ndcg_at_1000 value: 60.646 - type: ndcg_at_3 value: 38.722 - type: ndcg_at_5 value: 40.007999999999996 - type: precision_at_1 value: 50.29900000000001 - type: precision_at_10 value: 19.82 - type: precision_at_100 value: 4.82 - type: precision_at_1000 value: 0.5910000000000001 - type: precision_at_3 value: 31.537 - type: precision_at_5 value: 28.262999999999998 - type: recall_at_1 value: 14.443 - type: recall_at_10 value: 43.885999999999996 - type: recall_at_100 value: 85.231 - type: recall_at_1000 value: 99.07000000000001 - type: recall_at_3 value: 22.486 - type: recall_at_5 value: 33.035 - task: type: Retrieval dataset: type: jinaai/spanish_passage_retrieval name: MTEB SpanishPassageRetrievalS2S config: default split: test revision: None metrics: - type: map_at_1 value: 15.578 - type: map_at_10 value: 52.214000000000006 - type: map_at_100 value: 64.791 - type: map_at_1000 value: 64.791 - type: map_at_3 value: 33.396 - type: map_at_5 value: 41.728 - type: mrr_at_1 value: 73.653 - type: mrr_at_10 value: 85.116 - type: mrr_at_100 value: 85.205 - type: mrr_at_1000 value: 85.205 - type: mrr_at_3 value: 84.631 - type: mrr_at_5 value: 85.05 - type: ndcg_at_1 value: 76.64699999999999 - type: ndcg_at_10 value: 70.38600000000001 - type: ndcg_at_100 value: 82.27600000000001 - type: ndcg_at_1000 value: 82.27600000000001 - type: ndcg_at_3 value: 70.422 - type: ndcg_at_5 value: 69.545 - type: precision_at_1 value: 76.64699999999999 - type: precision_at_10 value: 43.653 - type: precision_at_100 value: 7.718999999999999 - type: precision_at_1000 value: 0.772 - type: precision_at_3 value: 64.671 - type: precision_at_5 value: 56.766000000000005 - type: recall_at_1 value: 15.578 - type: recall_at_10 value: 67.459 - type: recall_at_100 value: 100.0 - type: recall_at_1000 value: 100.0 - type: recall_at_3 value: 36.922 - type: recall_at_5 value: 
49.424 - task: type: PairClassification dataset: type: mteb/sprintduplicatequestions-pairclassification name: MTEB SprintDuplicateQuestions config: default split: test revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 metrics: - type: cos_sim_accuracy value: 99.81683168316832 - type: cos_sim_ap value: 95.61502659412484 - type: cos_sim_f1 value: 90.6813627254509 - type: cos_sim_precision value: 90.86345381526104 - type: cos_sim_recall value: 90.5 - type: dot_accuracy value: 99.8039603960396 - type: dot_ap value: 95.36783483182609 - type: dot_f1 value: 89.90825688073394 - type: dot_precision value: 91.68399168399168 - type: dot_recall value: 88.2 - type: euclidean_accuracy value: 99.81188118811882 - type: euclidean_ap value: 95.51583052324564 - type: euclidean_f1 value: 90.46214355948868 - type: euclidean_precision value: 88.97485493230174 - type: euclidean_recall value: 92.0 - type: manhattan_accuracy value: 99.8079207920792 - type: manhattan_ap value: 95.44030644653718 - type: manhattan_f1 value: 90.37698412698413 - type: manhattan_precision value: 89.66535433070865 - type: manhattan_recall value: 91.10000000000001 - type: max_accuracy value: 99.81683168316832 - type: max_ap value: 95.61502659412484 - type: max_f1 value: 90.6813627254509 - task: type: Clustering dataset: type: mteb/stackexchange-clustering name: MTEB StackExchangeClustering config: default split: test revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 metrics: - type: v_measure value: 55.39046705023096 - task: type: Clustering dataset: type: mteb/stackexchange-clustering-p2p name: MTEB StackExchangeClusteringP2P config: default split: test revision: 815ca46b2622cec33ccafc3735d572c266efdb44 metrics: - type: v_measure value: 33.57429225651293 - task: type: Reranking dataset: type: mteb/stackoverflowdupquestions-reranking name: MTEB StackOverflowDupQuestions config: default split: test revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 metrics: - type: map value: 50.17622570658746 - type: mrr value: 
50.99844293778118 - task: type: Summarization dataset: type: mteb/summeval name: MTEB SummEval config: default split: test revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c metrics: - type: cos_sim_pearson value: 29.97416289382191 - type: cos_sim_spearman value: 29.871890597161432 - type: dot_pearson value: 28.768845892613644 - type: dot_spearman value: 28.872458999448686 - task: type: Retrieval dataset: type: trec-covid name: MTEB TRECCOVID config: default split: test revision: None metrics: - type: map_at_1 value: 0.22599999999999998 - type: map_at_10 value: 1.646 - type: map_at_100 value: 9.491 - type: map_at_1000 value: 23.75 - type: map_at_3 value: 0.588 - type: map_at_5 value: 0.9129999999999999 - type: mrr_at_1 value: 84.0 - type: mrr_at_10 value: 89.889 - type: mrr_at_100 value: 89.889 - type: mrr_at_1000 value: 89.889 - type: mrr_at_3 value: 89.667 - type: mrr_at_5 value: 89.667 - type: ndcg_at_1 value: 75.0 - type: ndcg_at_10 value: 67.368 - type: ndcg_at_100 value: 52.834 - type: ndcg_at_1000 value: 49.144 - type: ndcg_at_3 value: 72.866 - type: ndcg_at_5 value: 70.16 - type: precision_at_1 value: 84.0 - type: precision_at_10 value: 71.8 - type: precision_at_100 value: 54.04 - type: precision_at_1000 value: 21.709999999999997 - type: precision_at_3 value: 77.333 - type: precision_at_5 value: 74.0 - type: recall_at_1 value: 0.22599999999999998 - type: recall_at_10 value: 1.9029999999999998 - type: recall_at_100 value: 13.012 - type: recall_at_1000 value: 46.105000000000004 - type: recall_at_3 value: 0.63 - type: recall_at_5 value: 1.0030000000000001 - task: type: Retrieval dataset: type: webis-touche2020 name: MTEB Touche2020 config: default split: test revision: None metrics: - type: map_at_1 value: 1.5 - type: map_at_10 value: 8.193999999999999 - type: map_at_100 value: 14.01 - type: map_at_1000 value: 15.570999999999998 - type: map_at_3 value: 4.361000000000001 - type: map_at_5 value: 5.9270000000000005 - type: mrr_at_1 value: 16.326999999999998 - 
type: mrr_at_10 value: 33.326 - type: mrr_at_100 value: 34.592 - type: mrr_at_1000 value: 34.592 - type: mrr_at_3 value: 29.252 - type: mrr_at_5 value: 30.680000000000003 - type: ndcg_at_1 value: 15.306000000000001 - type: ndcg_at_10 value: 19.819 - type: ndcg_at_100 value: 33.428000000000004 - type: ndcg_at_1000 value: 45.024 - type: ndcg_at_3 value: 19.667 - type: ndcg_at_5 value: 19.625 - type: precision_at_1 value: 16.326999999999998 - type: precision_at_10 value: 18.367 - type: precision_at_100 value: 7.367 - type: precision_at_1000 value: 1.496 - type: precision_at_3 value: 23.128999999999998 - type: precision_at_5 value: 21.633 - type: recall_at_1 value: 1.5 - type: recall_at_10 value: 14.362 - type: recall_at_100 value: 45.842 - type: recall_at_1000 value: 80.42 - type: recall_at_3 value: 5.99 - type: recall_at_5 value: 8.701 - task: type: Classification dataset: type: mteb/toxic_conversations_50k name: MTEB ToxicConversationsClassification config: default split: test revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c metrics: - type: accuracy value: 70.04740000000001 - type: ap value: 13.58661943759992 - type: f1 value: 53.727487131754195 - task: type: Classification dataset: type: mteb/tweet_sentiment_extraction name: MTEB TweetSentimentExtractionClassification config: default split: test revision: d604517c81ca91fe16a244d1248fc021f9ecee7a metrics: - type: accuracy value: 61.06395019807584 - type: f1 value: 61.36753664680866 - task: type: Clustering dataset: type: mteb/twentynewsgroups-clustering name: MTEB TwentyNewsgroupsClustering config: default split: test revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 metrics: - type: v_measure value: 40.19881263066229 - task: type: PairClassification dataset: type: mteb/twittersemeval2015-pairclassification name: MTEB TwitterSemEval2015 config: default split: test revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 metrics: - type: cos_sim_accuracy value: 85.19401561661799 - type: cos_sim_ap value: 
71.62462506173092 - type: cos_sim_f1 value: 66.0641327225455 - type: cos_sim_precision value: 62.234662934453 - type: cos_sim_recall value: 70.3957783641161 - type: dot_accuracy value: 84.69333015437802 - type: dot_ap value: 69.83805526490895 - type: dot_f1 value: 64.85446235265817 - type: dot_precision value: 59.59328028293546 - type: dot_recall value: 71.13456464379946 - type: euclidean_accuracy value: 85.38475293556655 - type: euclidean_ap value: 72.05594596250286 - type: euclidean_f1 value: 66.53543307086615 - type: euclidean_precision value: 62.332872291378514 - type: euclidean_recall value: 71.34564643799473 - type: manhattan_accuracy value: 85.3907134767837 - type: manhattan_ap value: 72.04585410650152 - type: manhattan_f1 value: 66.57132642116554 - type: manhattan_precision value: 60.704194740273856 - type: manhattan_recall value: 73.6939313984169 - type: max_accuracy value: 85.3907134767837 - type: max_ap value: 72.05594596250286 - type: max_f1 value: 66.57132642116554 - task: type: PairClassification dataset: type: mteb/twitterurlcorpus-pairclassification name: MTEB TwitterURLCorpus config: default split: test revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf metrics: - type: cos_sim_accuracy value: 89.30414871735165 - type: cos_sim_ap value: 86.4398673359918 - type: cos_sim_f1 value: 78.9243598692186 - type: cos_sim_precision value: 75.47249350101876 - type: cos_sim_recall value: 82.7071142593163 - type: dot_accuracy value: 89.26145845461248 - type: dot_ap value: 86.32172118414802 - type: dot_f1 value: 78.8277467755645 - type: dot_precision value: 75.79418662497335 - type: dot_recall value: 82.11425931629196 - type: euclidean_accuracy value: 89.24205378973105 - type: euclidean_ap value: 86.23988673522649 - type: euclidean_f1 value: 78.67984857951413 - type: euclidean_precision value: 75.2689684269742 - type: euclidean_recall value: 82.41453649522637 - type: manhattan_accuracy value: 89.18189932859859 - type: manhattan_ap value: 86.21003833972824 - type: 
manhattan_f1 value: 78.70972564850115 - type: manhattan_precision value: 76.485544094145 - type: manhattan_recall value: 81.0671388974438 - type: max_accuracy value: 89.30414871735165 - type: max_ap value: 86.4398673359918 - type: max_f1 value: 78.9243598692186 - task: type: Clustering dataset: type: jinaai/cities_wiki_clustering name: MTEB WikiCitiesClustering config: default split: test revision: ddc9ee9242fa65332597f70e967ecc38b9d734fa metrics: - type: v_measure value: 73.254610626148 - task: type: Retrieval dataset: type: jinaai/xmarket_ml name: MTEB XMarketES config: default split: test revision: 705db869e8107dfe6e34b832af90446e77d813e3 metrics: - type: map_at_1 value: 5.506 - type: map_at_10 value: 11.546 - type: map_at_100 value: 14.299999999999999 - type: map_at_1000 value: 15.146999999999998 - type: map_at_3 value: 8.748000000000001 - type: map_at_5 value: 10.036000000000001 - type: mrr_at_1 value: 17.902 - type: mrr_at_10 value: 25.698999999999998 - type: mrr_at_100 value: 26.634 - type: mrr_at_1000 value: 26.704 - type: mrr_at_3 value: 23.244999999999997 - type: mrr_at_5 value: 24.555 - type: ndcg_at_1 value: 17.902 - type: ndcg_at_10 value: 19.714000000000002 - type: ndcg_at_100 value: 25.363000000000003 - type: ndcg_at_1000 value: 30.903999999999996 - type: ndcg_at_3 value: 17.884 - type: ndcg_at_5 value: 18.462 - type: precision_at_1 value: 17.902 - type: precision_at_10 value: 10.467 - type: precision_at_100 value: 3.9699999999999998 - type: precision_at_1000 value: 1.1320000000000001 - type: precision_at_3 value: 14.387 - type: precision_at_5 value: 12.727 - type: recall_at_1 value: 5.506 - type: recall_at_10 value: 19.997999999999998 - type: recall_at_100 value: 42.947 - type: recall_at_1000 value: 67.333 - type: recall_at_3 value: 11.158 - type: recall_at_5 value: 14.577000000000002 - task: type: Retrieval dataset: type: jinaai/xpqa name: MTEB XPQAESRetrieval config: default split: test revision: None metrics: - type: map_at_1 value: 32.53 - type: 
map_at_10 value: 58.68600000000001 - type: map_at_100 value: 60.45399999999999 - type: map_at_1000 value: 60.51499999999999 - type: map_at_3 value: 50.356 - type: map_at_5 value: 55.98 - type: mrr_at_1 value: 61.791 - type: mrr_at_10 value: 68.952 - type: mrr_at_100 value: 69.524 - type: mrr_at_1000 value: 69.538 - type: mrr_at_3 value: 67.087 - type: mrr_at_5 value: 68.052 - type: ndcg_at_1 value: 61.791 - type: ndcg_at_10 value: 65.359 - type: ndcg_at_100 value: 70.95700000000001 - type: ndcg_at_1000 value: 71.881 - type: ndcg_at_3 value: 59.999 - type: ndcg_at_5 value: 61.316 - type: precision_at_1 value: 61.791 - type: precision_at_10 value: 18.184 - type: precision_at_100 value: 2.317 - type: precision_at_1000 value: 0.245 - type: precision_at_3 value: 42.203 - type: precision_at_5 value: 31.374999999999996 - type: recall_at_1 value: 32.53 - type: recall_at_10 value: 73.098 - type: recall_at_100 value: 94.029 - type: recall_at_1000 value: 99.842 - type: recall_at_3 value: 54.525 - type: recall_at_5 value: 63.796 ---


The text embedding set trained by Jina AI.

## Quick Start

The easiest way to start using `jina-embeddings-v2-base-es` is to use Jina AI's [Embedding API](https://jina.ai/embeddings/).

## Intended Usage & Model Info

`jina-embeddings-v2-base-es` is a Spanish/English bilingual text **embedding model** supporting a **sequence length of 8192 tokens**.
It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence lengths.
We have designed it for high performance in monolingual and cross-lingual applications and trained it specifically to support mixed Spanish-English input without bias.

Additionally, we provide the following embedding models:

- [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters.
- [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters.
- [`jina-embeddings-v2-base-zh`](https://huggingface.co/jinaai/jina-embeddings-v2-base-zh): Chinese-English Bilingual embeddings.
- [`jina-embeddings-v2-base-de`](https://huggingface.co/jinaai/jina-embeddings-v2-base-de): German-English Bilingual embeddings.
- [`jina-embeddings-v2-base-es`](): Spanish-English Bilingual embeddings **(you are here)**.

## Data & Parameters

The data and training details are described in this [technical report](https://arxiv.org/abs/2402.17016).

## Usage

**Please apply mean pooling when integrating the model.**

### Why mean pooling?

Mean pooling takes all token embeddings from the model output and averages them at the sentence or paragraph level.
It has been shown to be the most effective way to produce high-quality sentence embeddings.
We offer an `encode` function that handles this for you.

However, if you would like to do it without using the default `encode` function:

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # The first element of the model output holds the per-token embeddings.
    token_embeddings = model_output[0]
    # Expand the mask to the embedding dimension so padded tokens contribute zero.
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    # Sum over the sequence axis, then divide by the number of real tokens.
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

sentences = ['How is the weather today?', 'What is the current weather like today?']

tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-base-es')
model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-es', trust_remote_code=True)

encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    model_output = model(**encoded_input)

embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
# L2-normalize so that dot products equal cosine similarities.
embeddings = F.normalize(embeddings, p=2, dim=1)
```
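To see why the attention mask matters in the pooling step, here is a small sketch on made-up numbers. It uses NumPy instead of PyTorch purely to keep the example light; the function name `masked_mean_pool` and the toy values are assumptions for illustration and are not part of the model's API.

```python
import numpy as np

def masked_mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padded positions."""
    # (batch, seq) -> (batch, seq, 1) so the mask broadcasts over the embedding axis.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)   # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # (batch, 1), avoids division by zero
    return summed / counts

# Toy batch: 2 sentences, 3 token slots, 2-dimensional embeddings (hypothetical values).
token_embeddings = np.array([
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # third slot is padding
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],      # no padding
])
attention_mask = np.array([[1, 1, 0],
                           [1, 1, 1]])

pooled = masked_mean_pool(token_embeddings, attention_mask)
naive = token_embeddings.mean(axis=1)  # unmasked average, distorted by the padding vector

print(pooled)
print(naive)
```

For the first sentence the masked average only uses the two real tokens, while the unmasked average is pulled toward the padding vector. This is the reason the `mean_pooling` function above multiplies by the expanded mask before summing.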

You can use Jina Embedding models directly from the `transformers` package:

```python
# Install the dependency first, e.g. `pip install transformers`
# (or `!pip install transformers` inside a notebook).
from transformers import AutoModel
from numpy.linalg import norm

# Cosine similarity between two 1-D embedding vectors.
cos_sim = lambda a, b: (a @ b.T) / (norm(a) * norm(b))

# trust_remote_code is needed to use the encode method
model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-es', trust_remote_code=True)
embeddings = model.encode(['How is the weather today?', '¿Qué tiempo hace hoy?'])
print(cos_sim(embeddings[0], embeddings[1]))
```

If you only want to handle shorter sequences, such as 2k tokens, pass the `max_length` parameter to the `encode` function:

```python
embeddings = model.encode(
    ['Very long ... document'],
    max_length=2048
)
```

Alternatively, you can use the model with the `sentence-transformers` package:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("jinaai/jina-embeddings-v2-base-es", trust_remote_code=True)
embeddings = model.encode(['How is the weather today?', '¿Qué tiempo hace hoy?'])
print(util.cos_sim(embeddings[0], embeddings[1]))
```

If you only want to handle shorter sequences, such as 2k tokens, set `model.max_seq_length`:

```python
model.max_seq_length = 2048
```

## Alternatives to Transformers and Sentence Transformers

1. _Managed SaaS_: Get started with a free key on Jina AI's [Embedding API](https://jina.ai/embeddings/).
2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploy them on [AWS SageMaker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy).

## Use Jina Embeddings for RAG

According to the latest blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83),

> In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.

## Plans

1. Bilingual embedding models supporting more European and Asian languages, including French, Italian and Japanese.
2. Multimodal embedding models to enable multimodal RAG applications.
3. High-performance rerankers.

## Contact

Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.

## Citation

If you find Jina Embeddings useful in your research, please cite the following paper:

```
@article{mohr2024multi,
  title={Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings},
  author={Mohr, Isabelle and Krimmel, Markus and Sturua, Saba and Akram, Mohammad Kalim and Koukounas, Andreas and G{\"u}nther, Michael and Mastrapas, Georgios and Ravishankar, Vinit and Mart{\'\i}nez, Joan Fontanals and Wang, Feng and others},
  journal={arXiv preprint arXiv:2402.17016},
  year={2024}
}
```