---
language:
- en
library_name: sentence-transformers
license: mit
pipeline_tag: sentence-similarity
tags:
- feature-extraction
- mteb
- sentence-similarity
- sentence-transformers

model-index:
- name: GIST-all-MiniLM-L6-v2
  results:
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_counterfactual
      name: MTEB AmazonCounterfactualClassification (en)
      config: en
      split: test
      revision: e8379541af4e31359cca9fbcf4b00f2671dba205
    metrics:
    - type: accuracy
      value: 69.68656716417911
    - type: ap
      value: 31.84640905923114
    - type: f1
      value: 63.4379647836158
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_polarity
      name: MTEB AmazonPolarityClassification
      config: default
      split: test
      revision: e2d317d38cd51312af73b3d32a06d1a08b442046
    metrics:
    - type: accuracy
      value: 82.078025
    - type: ap
      value: 77.3451894150185
    - type: f1
      value: 81.97258648080654
  - task:
      type: Classification
    dataset:
      type: mteb/amazon_reviews_multi
      name: MTEB AmazonReviewsClassification (en)
      config: en
      split: test
      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
    metrics:
    - type: accuracy
      value: 38.254
    - type: f1
      value: 37.940387801030376
  - task:
      type: Retrieval
    dataset:
      type: arguana
      name: MTEB ArguAna
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 28.876
    - type: map_at_10
      value: 44.741
    - type: map_at_100
      value: 45.688
    - type: map_at_1000
      value: 45.695
    - type: map_at_3
      value: 39.829
    - type: map_at_5
      value: 42.646
    - type: mrr_at_1
      value: 30.156
    - type: mrr_at_10
      value: 45.196
    - type: mrr_at_100
      value: 46.149
    - type: mrr_at_1000
      value: 46.156000000000006
    - type: mrr_at_3
      value: 40.339000000000006
    - type: mrr_at_5
      value: 43.120000000000005
    - type: ndcg_at_1
      value: 28.876
    - type: ndcg_at_10
      value: 53.581
    - type: ndcg_at_100
      value: 57.428000000000004
    - type: ndcg_at_1000
      value: 57.599000000000004
    - type: ndcg_at_3
      value: 43.46
    - type: ndcg_at_5
      value: 48.501
    - type: precision_at_1
      value: 28.876
    - type: precision_at_10
      value: 8.186
    - type: precision_at_100
      value: 0.9820000000000001
    - type: precision_at_1000
      value: 0.1
    - type: precision_at_3
      value: 17.994
    - type: precision_at_5
      value: 13.229
    - type: recall_at_1
      value: 28.876
    - type: recall_at_10
      value: 81.863
    - type: recall_at_100
      value: 98.222
    - type: recall_at_1000
      value: 99.502
    - type: recall_at_3
      value: 53.983000000000004
    - type: recall_at_5
      value: 66.145
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-p2p
      name: MTEB ArxivClusteringP2P
      config: default
      split: test
      revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
    metrics:
    - type: v_measure
      value: 44.81109445338116
  - task:
      type: Clustering
    dataset:
      type: mteb/arxiv-clustering-s2s
      name: MTEB ArxivClusteringS2S
      config: default
      split: test
      revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
    metrics:
    - type: v_measure
      value: 35.705350248894476
  - task:
      type: Reranking
    dataset:
      type: mteb/askubuntudupquestions-reranking
      name: MTEB AskUbuntuDupQuestions
      config: default
      split: test
      revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
    metrics:
    - type: map
      value: 63.13335364248881
    - type: mrr
      value: 76.80605021325243
  - task:
      type: STS
    dataset:
      type: mteb/biosses-sts
      name: MTEB BIOSSES
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cos_sim_pearson
      value: 83.33741812376516
    - type: cos_sim_spearman
      value: 80.51267790947811
    - type: euclidean_pearson
      value: 67.49002803470997
    - type: euclidean_spearman
      value: 65.39064659674824
    - type: manhattan_pearson
      value: 67.3390206944745
    - type: manhattan_spearman
      value: 65.35329634810715
  - task:
      type: Classification
    dataset:
      type: mteb/banking77
      name: MTEB Banking77Classification
      config: default
      split: test
      revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
    metrics:
    - type: accuracy
      value: 83.13636363636364
    - type: f1
      value: 83.10810612376775
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-p2p
      name: MTEB BiorxivClusteringP2P
      config: default
      split: test
      revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
    metrics:
    - type: v_measure
      value: 38.47849860204599
  - task:
      type: Clustering
    dataset:
      type: mteb/biorxiv-clustering-s2s
      name: MTEB BiorxivClusteringS2S
      config: default
      split: test
      revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
    metrics:
    - type: v_measure
      value: 31.159196233892057
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackAndroidRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 34.096
    - type: map_at_10
      value: 46.61
    - type: map_at_100
      value: 48.163
    - type: map_at_1000
      value: 48.272
    - type: map_at_3
      value: 43.03
    - type: map_at_5
      value: 45.036
    - type: mrr_at_1
      value: 42.489
    - type: mrr_at_10
      value: 52.83
    - type: mrr_at_100
      value: 53.525
    - type: mrr_at_1000
      value: 53.561
    - type: mrr_at_3
      value: 50.453
    - type: mrr_at_5
      value: 51.991
    - type: ndcg_at_1
      value: 42.489
    - type: ndcg_at_10
      value: 53.21900000000001
    - type: ndcg_at_100
      value: 58.277
    - type: ndcg_at_1000
      value: 59.836999999999996
    - type: ndcg_at_3
      value: 48.64
    - type: ndcg_at_5
      value: 50.800999999999995
    - type: precision_at_1
      value: 42.489
    - type: precision_at_10
      value: 10.343
    - type: precision_at_100
      value: 1.624
    - type: precision_at_1000
      value: 0.20400000000000001
    - type: precision_at_3
      value: 23.605
    - type: precision_at_5
      value: 16.881
    - type: recall_at_1
      value: 34.096
    - type: recall_at_10
      value: 65.003
    - type: recall_at_100
      value: 86.211
    - type: recall_at_1000
      value: 96.017
    - type: recall_at_3
      value: 51.307
    - type: recall_at_5
      value: 57.873
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackEnglishRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 29.482000000000003
    - type: map_at_10
      value: 39.793
    - type: map_at_100
      value: 41.028
    - type: map_at_1000
      value: 41.163
    - type: map_at_3
      value: 36.674
    - type: map_at_5
      value: 38.379999999999995
    - type: mrr_at_1
      value: 37.197
    - type: mrr_at_10
      value: 45.991
    - type: mrr_at_100
      value: 46.599000000000004
    - type: mrr_at_1000
      value: 46.649
    - type: mrr_at_3
      value: 43.662
    - type: mrr_at_5
      value: 45.054
    - type: ndcg_at_1
      value: 37.197
    - type: ndcg_at_10
      value: 45.73
    - type: ndcg_at_100
      value: 50.074
    - type: ndcg_at_1000
      value: 52.312000000000005
    - type: ndcg_at_3
      value: 41.308
    - type: ndcg_at_5
      value: 43.323
    - type: precision_at_1
      value: 37.197
    - type: precision_at_10
      value: 8.854
    - type: precision_at_100
      value: 1.411
    - type: precision_at_1000
      value: 0.191
    - type: precision_at_3
      value: 20.085
    - type: precision_at_5
      value: 14.42
    - type: recall_at_1
      value: 29.482000000000003
    - type: recall_at_10
      value: 56.077999999999996
    - type: recall_at_100
      value: 74.83800000000001
    - type: recall_at_1000
      value: 89.128
    - type: recall_at_3
      value: 42.971
    - type: recall_at_5
      value: 48.577
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGamingRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 38.679
    - type: map_at_10
      value: 50.854
    - type: map_at_100
      value: 51.849000000000004
    - type: map_at_1000
      value: 51.909000000000006
    - type: map_at_3
      value: 47.82
    - type: map_at_5
      value: 49.479
    - type: mrr_at_1
      value: 44.263000000000005
    - type: mrr_at_10
      value: 54.161
    - type: mrr_at_100
      value: 54.833
    - type: mrr_at_1000
      value: 54.86600000000001
    - type: mrr_at_3
      value: 51.912000000000006
    - type: mrr_at_5
      value: 53.201
    - type: ndcg_at_1
      value: 44.263000000000005
    - type: ndcg_at_10
      value: 56.486000000000004
    - type: ndcg_at_100
      value: 60.553999999999995
    - type: ndcg_at_1000
      value: 61.77
    - type: ndcg_at_3
      value: 51.456999999999994
    - type: ndcg_at_5
      value: 53.83
    - type: precision_at_1
      value: 44.263000000000005
    - type: precision_at_10
      value: 9.041
    - type: precision_at_100
      value: 1.204
    - type: precision_at_1000
      value: 0.135
    - type: precision_at_3
      value: 22.989
    - type: precision_at_5
      value: 15.598999999999998
    - type: recall_at_1
      value: 38.679
    - type: recall_at_10
      value: 69.77799999999999
    - type: recall_at_100
      value: 87.59
    - type: recall_at_1000
      value: 96.202
    - type: recall_at_3
      value: 56.351
    - type: recall_at_5
      value: 62.16199999999999
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackGisRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 27.245
    - type: map_at_10
      value: 36.104
    - type: map_at_100
      value: 37.207
    - type: map_at_1000
      value: 37.288
    - type: map_at_3
      value: 33.427
    - type: map_at_5
      value: 34.866
    - type: mrr_at_1
      value: 29.604999999999997
    - type: mrr_at_10
      value: 38.346999999999994
    - type: mrr_at_100
      value: 39.274
    - type: mrr_at_1000
      value: 39.336
    - type: mrr_at_3
      value: 35.876000000000005
    - type: mrr_at_5
      value: 37.164
    - type: ndcg_at_1
      value: 29.604999999999997
    - type: ndcg_at_10
      value: 41.253
    - type: ndcg_at_100
      value: 46.511
    - type: ndcg_at_1000
      value: 48.503
    - type: ndcg_at_3
      value: 35.975
    - type: ndcg_at_5
      value: 38.35
    - type: precision_at_1
      value: 29.604999999999997
    - type: precision_at_10
      value: 6.305
    - type: precision_at_100
      value: 0.9440000000000001
    - type: precision_at_1000
      value: 0.11499999999999999
    - type: precision_at_3
      value: 15.179
    - type: precision_at_5
      value: 10.508000000000001
    - type: recall_at_1
      value: 27.245
    - type: recall_at_10
      value: 55.07300000000001
    - type: recall_at_100
      value: 79.036
    - type: recall_at_1000
      value: 93.809
    - type: recall_at_3
      value: 40.593
    - type: recall_at_5
      value: 46.318
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackMathematicaRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 15.440000000000001
    - type: map_at_10
      value: 23.758000000000003
    - type: map_at_100
      value: 25.1
    - type: map_at_1000
      value: 25.230000000000004
    - type: map_at_3
      value: 21.093
    - type: map_at_5
      value: 22.431
    - type: mrr_at_1
      value: 19.279
    - type: mrr_at_10
      value: 28.077
    - type: mrr_at_100
      value: 29.164
    - type: mrr_at_1000
      value: 29.237000000000002
    - type: mrr_at_3
      value: 25.497999999999998
    - type: mrr_at_5
      value: 26.76
    - type: ndcg_at_1
      value: 19.279
    - type: ndcg_at_10
      value: 29.025000000000002
    - type: ndcg_at_100
      value: 35.244
    - type: ndcg_at_1000
      value: 38.112
    - type: ndcg_at_3
      value: 24.079
    - type: ndcg_at_5
      value: 26.064999999999998
    - type: precision_at_1
      value: 19.279
    - type: precision_at_10
      value: 5.498
    - type: precision_at_100
      value: 0.985
    - type: precision_at_1000
      value: 0.136
    - type: precision_at_3
      value: 11.692
    - type: precision_at_5
      value: 8.383000000000001
    - type: recall_at_1
      value: 15.440000000000001
    - type: recall_at_10
      value: 40.855999999999995
    - type: recall_at_100
      value: 67.916
    - type: recall_at_1000
      value: 88.11
    - type: recall_at_3
      value: 27.387
    - type: recall_at_5
      value: 32.387
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackPhysicsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 29.351
    - type: map_at_10
      value: 40.477999999999994
    - type: map_at_100
      value: 41.8
    - type: map_at_1000
      value: 41.926
    - type: map_at_3
      value: 37.246
    - type: map_at_5
      value: 39.206
    - type: mrr_at_1
      value: 36.092
    - type: mrr_at_10
      value: 46.319
    - type: mrr_at_100
      value: 47.087
    - type: mrr_at_1000
      value: 47.13
    - type: mrr_at_3
      value: 43.808
    - type: mrr_at_5
      value: 45.406
    - type: ndcg_at_1
      value: 36.092
    - type: ndcg_at_10
      value: 46.707
    - type: ndcg_at_100
      value: 52.266
    - type: ndcg_at_1000
      value: 54.303000000000004
    - type: ndcg_at_3
      value: 41.858000000000004
    - type: ndcg_at_5
      value: 44.407999999999994
    - type: precision_at_1
      value: 36.092
    - type: precision_at_10
      value: 8.527
    - type: precision_at_100
      value: 1.34
    - type: precision_at_1000
      value: 0.172
    - type: precision_at_3
      value: 20.212
    - type: precision_at_5
      value: 14.456
    - type: recall_at_1
      value: 29.351
    - type: recall_at_10
      value: 59.254
    - type: recall_at_100
      value: 83.047
    - type: recall_at_1000
      value: 95.911
    - type: recall_at_3
      value: 45.488
    - type: recall_at_5
      value: 52.186
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackProgrammersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 25.601000000000003
    - type: map_at_10
      value: 34.589999999999996
    - type: map_at_100
      value: 35.917
    - type: map_at_1000
      value: 36.032
    - type: map_at_3
      value: 31.338
    - type: map_at_5
      value: 33.128
    - type: mrr_at_1
      value: 31.163999999999998
    - type: mrr_at_10
      value: 39.646
    - type: mrr_at_100
      value: 40.491
    - type: mrr_at_1000
      value: 40.549
    - type: mrr_at_3
      value: 36.91
    - type: mrr_at_5
      value: 38.446000000000005
    - type: ndcg_at_1
      value: 31.163999999999998
    - type: ndcg_at_10
      value: 40.321
    - type: ndcg_at_100
      value: 45.894
    - type: ndcg_at_1000
      value: 48.233
    - type: ndcg_at_3
      value: 34.871
    - type: ndcg_at_5
      value: 37.302
    - type: precision_at_1
      value: 31.163999999999998
    - type: precision_at_10
      value: 7.523000000000001
    - type: precision_at_100
      value: 1.188
    - type: precision_at_1000
      value: 0.157
    - type: precision_at_3
      value: 16.591
    - type: precision_at_5
      value: 12.055
    - type: recall_at_1
      value: 25.601000000000003
    - type: recall_at_10
      value: 52.422000000000004
    - type: recall_at_100
      value: 76.426
    - type: recall_at_1000
      value: 92.142
    - type: recall_at_3
      value: 37.141000000000005
    - type: recall_at_5
      value: 43.449
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.267916666666668
    - type: map_at_10
      value: 35.758250000000004
    - type: map_at_100
      value: 37.0185
    - type: map_at_1000
      value: 37.136916666666664
    - type: map_at_3
      value: 32.85125
    - type: map_at_5
      value: 34.4165
    - type: mrr_at_1
      value: 31.131083333333343
    - type: mrr_at_10
      value: 39.95941666666667
    - type: mrr_at_100
      value: 40.81541666666666
    - type: mrr_at_1000
      value: 40.87358333333332
    - type: mrr_at_3
      value: 37.5175
    - type: mrr_at_5
      value: 38.86833333333334
    - type: ndcg_at_1
      value: 31.131083333333343
    - type: ndcg_at_10
      value: 41.26174999999999
    - type: ndcg_at_100
      value: 46.55975
    - type: ndcg_at_1000
      value: 48.80016666666666
    - type: ndcg_at_3
      value: 36.37566666666667
    - type: ndcg_at_5
      value: 38.55166666666667
    - type: precision_at_1
      value: 31.131083333333343
    - type: precision_at_10
      value: 7.315916666666666
    - type: precision_at_100
      value: 1.1813333333333333
    - type: precision_at_1000
      value: 0.15666666666666665
    - type: precision_at_3
      value: 16.818166666666663
    - type: precision_at_5
      value: 11.923
    - type: recall_at_1
      value: 26.267916666666668
    - type: recall_at_10
      value: 53.28391666666666
    - type: recall_at_100
      value: 76.53983333333332
    - type: recall_at_1000
      value: 91.93008333333334
    - type: recall_at_3
      value: 39.60583333333334
    - type: recall_at_5
      value: 45.25741666666667
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackStatsRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 23.372
    - type: map_at_10
      value: 30.916
    - type: map_at_100
      value: 31.980999999999998
    - type: map_at_1000
      value: 32.07
    - type: map_at_3
      value: 28.778
    - type: map_at_5
      value: 29.872
    - type: mrr_at_1
      value: 26.074
    - type: mrr_at_10
      value: 33.451
    - type: mrr_at_100
      value: 34.366
    - type: mrr_at_1000
      value: 34.424
    - type: mrr_at_3
      value: 31.569999999999997
    - type: mrr_at_5
      value: 32.467
    - type: ndcg_at_1
      value: 26.074
    - type: ndcg_at_10
      value: 35.119
    - type: ndcg_at_100
      value: 40.357
    - type: ndcg_at_1000
      value: 42.548
    - type: ndcg_at_3
      value: 31.281
    - type: ndcg_at_5
      value: 32.866
    - type: precision_at_1
      value: 26.074
    - type: precision_at_10
      value: 5.583
    - type: precision_at_100
      value: 0.899
    - type: precision_at_1000
      value: 0.116
    - type: precision_at_3
      value: 13.700999999999999
    - type: precision_at_5
      value: 9.447999999999999
    - type: recall_at_1
      value: 23.372
    - type: recall_at_10
      value: 45.396
    - type: recall_at_100
      value: 69.26
    - type: recall_at_1000
      value: 85.438
    - type: recall_at_3
      value: 34.373
    - type: recall_at_5
      value: 38.509
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackTexRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.483999999999998
    - type: map_at_10
      value: 25.191999999999997
    - type: map_at_100
      value: 26.432
    - type: map_at_1000
      value: 26.566000000000003
    - type: map_at_3
      value: 22.697
    - type: map_at_5
      value: 24.101
    - type: mrr_at_1
      value: 21.645
    - type: mrr_at_10
      value: 29.243000000000002
    - type: mrr_at_100
      value: 30.232
    - type: mrr_at_1000
      value: 30.312
    - type: mrr_at_3
      value: 26.967000000000002
    - type: mrr_at_5
      value: 28.262999999999998
    - type: ndcg_at_1
      value: 21.645
    - type: ndcg_at_10
      value: 30.087999999999997
    - type: ndcg_at_100
      value: 35.806
    - type: ndcg_at_1000
      value: 38.763
    - type: ndcg_at_3
      value: 25.746999999999996
    - type: ndcg_at_5
      value: 27.765
    - type: precision_at_1
      value: 21.645
    - type: precision_at_10
      value: 5.6129999999999995
    - type: precision_at_100
      value: 1.004
    - type: precision_at_1000
      value: 0.14400000000000002
    - type: precision_at_3
      value: 12.331
    - type: precision_at_5
      value: 9.009
    - type: recall_at_1
      value: 17.483999999999998
    - type: recall_at_10
      value: 40.723
    - type: recall_at_100
      value: 66.226
    - type: recall_at_1000
      value: 87.312
    - type: recall_at_3
      value: 28.481
    - type: recall_at_5
      value: 33.777
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackUnixRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.735
    - type: map_at_10
      value: 36.431000000000004
    - type: map_at_100
      value: 37.696000000000005
    - type: map_at_1000
      value: 37.793
    - type: map_at_3
      value: 33.416000000000004
    - type: map_at_5
      value: 34.934
    - type: mrr_at_1
      value: 31.25
    - type: mrr_at_10
      value: 40.516000000000005
    - type: mrr_at_100
      value: 41.392
    - type: mrr_at_1000
      value: 41.449000000000005
    - type: mrr_at_3
      value: 37.842
    - type: mrr_at_5
      value: 39.265
    - type: ndcg_at_1
      value: 31.25
    - type: ndcg_at_10
      value: 42.191
    - type: ndcg_at_100
      value: 47.683
    - type: ndcg_at_1000
      value: 49.815
    - type: ndcg_at_3
      value: 36.744
    - type: ndcg_at_5
      value: 39.007
    - type: precision_at_1
      value: 31.25
    - type: precision_at_10
      value: 7.276000000000001
    - type: precision_at_100
      value: 1.125
    - type: precision_at_1000
      value: 0.14100000000000001
    - type: precision_at_3
      value: 16.76
    - type: precision_at_5
      value: 11.791
    - type: recall_at_1
      value: 26.735
    - type: recall_at_10
      value: 55.444
    - type: recall_at_100
      value: 79.098
    - type: recall_at_1000
      value: 93.815
    - type: recall_at_3
      value: 40.623
    - type: recall_at_5
      value: 46.322
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWebmastersRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 26.495
    - type: map_at_10
      value: 35.648
    - type: map_at_100
      value: 37.275000000000006
    - type: map_at_1000
      value: 37.494
    - type: map_at_3
      value: 32.446999999999996
    - type: map_at_5
      value: 34.233000000000004
    - type: mrr_at_1
      value: 31.225
    - type: mrr_at_10
      value: 40.127
    - type: mrr_at_100
      value: 41.092
    - type: mrr_at_1000
      value: 41.148
    - type: mrr_at_3
      value: 37.153999999999996
    - type: mrr_at_5
      value: 38.873999999999995
    - type: ndcg_at_1
      value: 31.225
    - type: ndcg_at_10
      value: 41.665
    - type: ndcg_at_100
      value: 47.557
    - type: ndcg_at_1000
      value: 49.992
    - type: ndcg_at_3
      value: 36.114000000000004
    - type: ndcg_at_5
      value: 38.675
    - type: precision_at_1
      value: 31.225
    - type: precision_at_10
      value: 7.904999999999999
    - type: precision_at_100
      value: 1.5890000000000002
    - type: precision_at_1000
      value: 0.246
    - type: precision_at_3
      value: 16.535
    - type: precision_at_5
      value: 12.134
    - type: recall_at_1
      value: 26.495
    - type: recall_at_10
      value: 53.727000000000004
    - type: recall_at_100
      value: 79.34400000000001
    - type: recall_at_1000
      value: 94.35900000000001
    - type: recall_at_3
      value: 38.432
    - type: recall_at_5
      value: 45.050000000000004
  - task:
      type: Retrieval
    dataset:
      type: BeIR/cqadupstack
      name: MTEB CQADupstackWordpressRetrieval
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 21.235
    - type: map_at_10
      value: 28.725
    - type: map_at_100
      value: 29.774
    - type: map_at_1000
      value: 29.9
    - type: map_at_3
      value: 26.249
    - type: map_at_5
      value: 27.332
    - type: mrr_at_1
      value: 23.29
    - type: mrr_at_10
      value: 30.805
    - type: mrr_at_100
      value: 31.730000000000004
    - type: mrr_at_1000
      value: 31.822
    - type: mrr_at_3
      value: 28.558
    - type: mrr_at_5
      value: 29.529
    - type: ndcg_at_1
      value: 23.29
    - type: ndcg_at_10
      value: 33.337
    - type: ndcg_at_100
      value: 38.494
    - type: ndcg_at_1000
      value: 41.414
    - type: ndcg_at_3
      value: 28.433999999999997
    - type: ndcg_at_5
      value: 30.227999999999998
    - type: precision_at_1
      value: 23.29
    - type: precision_at_10
      value: 5.323
    - type: precision_at_100
      value: 0.8630000000000001
    - type: precision_at_1000
      value: 0.123
    - type: precision_at_3
      value: 12.138
    - type: precision_at_5
      value: 8.392
    - type: recall_at_1
      value: 21.235
    - type: recall_at_10
      value: 45.653
    - type: recall_at_100
      value: 69.486
    - type: recall_at_1000
      value: 90.91799999999999
    - type: recall_at_3
      value: 32.123000000000005
    - type: recall_at_5
      value: 36.479
  - task:
      type: Retrieval
    dataset:
      type: climate-fever
      name: MTEB ClimateFEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 9.180000000000001
    - type: map_at_10
      value: 16.461000000000002
    - type: map_at_100
      value: 18.093999999999998
    - type: map_at_1000
      value: 18.297
    - type: map_at_3
      value: 13.475000000000001
    - type: map_at_5
      value: 15.02
    - type: mrr_at_1
      value: 21.303
    - type: mrr_at_10
      value: 31.755
    - type: mrr_at_100
      value: 32.826
    - type: mrr_at_1000
      value: 32.873000000000005
    - type: mrr_at_3
      value: 28.469
    - type: mrr_at_5
      value: 30.325999999999997
    - type: ndcg_at_1
      value: 21.303
    - type: ndcg_at_10
      value: 23.892
    - type: ndcg_at_100
      value: 30.848
    - type: ndcg_at_1000
      value: 34.577999999999996
    - type: ndcg_at_3
      value: 18.88
    - type: ndcg_at_5
      value: 20.683
    - type: precision_at_1
      value: 21.303
    - type: precision_at_10
      value: 7.693999999999999
    - type: precision_at_100
      value: 1.517
    - type: precision_at_1000
      value: 0.22
    - type: precision_at_3
      value: 14.180000000000001
    - type: precision_at_5
      value: 11.231
    - type: recall_at_1
      value: 9.180000000000001
    - type: recall_at_10
      value: 29.813000000000002
    - type: recall_at_100
      value: 54.116
    - type: recall_at_1000
      value: 75.248
    - type: recall_at_3
      value: 17.684
    - type: recall_at_5
      value: 22.557
  - task:
      type: Retrieval
    dataset:
      type: dbpedia-entity
      name: MTEB DBPedia
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 8.508000000000001
    - type: map_at_10
      value: 16.39
    - type: map_at_100
      value: 21.981
    - type: map_at_1000
      value: 23.253
    - type: map_at_3
      value: 12.465
    - type: map_at_5
      value: 14.194999999999999
    - type: mrr_at_1
      value: 60.0
    - type: mrr_at_10
      value: 68.499
    - type: mrr_at_100
      value: 69.014
    - type: mrr_at_1000
      value: 69.024
    - type: mrr_at_3
      value: 66.625
    - type: mrr_at_5
      value: 67.887
    - type: ndcg_at_1
      value: 48.5
    - type: ndcg_at_10
      value: 34.870000000000005
    - type: ndcg_at_100
      value: 38.448
    - type: ndcg_at_1000
      value: 45.668
    - type: ndcg_at_3
      value: 39.931
    - type: ndcg_at_5
      value: 37.007
    - type: precision_at_1
      value: 60.0
    - type: precision_at_10
      value: 26.924999999999997
    - type: precision_at_100
      value: 8.358
    - type: precision_at_1000
      value: 1.7850000000000001
    - type: precision_at_3
      value: 43.0
    - type: precision_at_5
      value: 35.449999999999996
    - type: recall_at_1
      value: 8.508000000000001
    - type: recall_at_10
      value: 21.089
    - type: recall_at_100
      value: 43.146
    - type: recall_at_1000
      value: 66.776
    - type: recall_at_3
      value: 13.33
    - type: recall_at_5
      value: 16.225
  - task:
      type: Classification
    dataset:
      type: mteb/emotion
      name: MTEB EmotionClassification
      config: default
      split: test
      revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
    metrics:
    - type: accuracy
      value: 46.735
    - type: f1
      value: 42.30853263256299
  - task:
      type: Retrieval
    dataset:
      type: fever
      name: MTEB FEVER
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 54.54
    - type: map_at_10
      value: 65.24600000000001
    - type: map_at_100
      value: 65.69
    - type: map_at_1000
      value: 65.71000000000001
    - type: map_at_3
      value: 63.234
    - type: map_at_5
      value: 64.455
    - type: mrr_at_1
      value: 58.821
    - type: mrr_at_10
      value: 69.616
    - type: mrr_at_100
      value: 69.98
    - type: mrr_at_1000
      value: 69.992
    - type: mrr_at_3
      value: 67.782
    - type: mrr_at_5
      value: 68.917
    - type: ndcg_at_1
      value: 58.821
    - type: ndcg_at_10
      value: 70.798
    - type: ndcg_at_100
      value: 72.719
    - type: ndcg_at_1000
      value: 73.19600000000001
    - type: ndcg_at_3
      value: 67.037
    - type: ndcg_at_5
      value: 69.048
    - type: precision_at_1
      value: 58.821
    - type: precision_at_10
      value: 9.182
    - type: precision_at_100
      value: 1.024
    - type: precision_at_1000
      value: 0.108
    - type: precision_at_3
      value: 26.662999999999997
    - type: precision_at_5
      value: 17.159
    - type: recall_at_1
      value: 54.54
    - type: recall_at_10
      value: 83.67999999999999
    - type: recall_at_100
      value: 92.099
    - type: recall_at_1000
      value: 95.532
    - type: recall_at_3
      value: 73.478
    - type: recall_at_5
      value: 78.424
  - task:
      type: Retrieval
    dataset:
      type: fiqa
      name: MTEB FiQA2018
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 17.601
    - type: map_at_10
      value: 28.676000000000002
    - type: map_at_100
      value: 30.463
    - type: map_at_1000
      value: 30.666
    - type: map_at_3
      value: 24.734
    - type: map_at_5
      value: 27.026
    - type: mrr_at_1
      value: 34.259
    - type: mrr_at_10
      value: 43.613
    - type: mrr_at_100
      value: 44.535000000000004
    - type: mrr_at_1000
      value: 44.583
    - type: mrr_at_3
      value: 41.307
    - type: mrr_at_5
      value: 42.626
    - type: ndcg_at_1
      value: 34.259
    - type: ndcg_at_10
      value: 36.097
    - type: ndcg_at_100
      value: 43.039
    - type: ndcg_at_1000
      value: 46.498
    - type: ndcg_at_3
      value: 32.244
    - type: ndcg_at_5
      value: 33.711999999999996
    - type: precision_at_1
      value: 34.259
    - type: precision_at_10
      value: 10.030999999999999
    - type: precision_at_100
      value: 1.7239999999999998
    - type: precision_at_1000
      value: 0.234
    - type: precision_at_3
      value: 21.193
    - type: precision_at_5
      value: 15.956999999999999
    - type: recall_at_1
      value: 17.601
    - type: recall_at_10
      value: 42.807
    - type: recall_at_100
      value: 68.571
    - type: recall_at_1000
      value: 89.237
    - type: recall_at_3
      value: 29.301
    - type: recall_at_5
      value: 35.528999999999996
  - task:
      type: Retrieval
    dataset:
      type: hotpotqa
      name: MTEB HotpotQA
      config: default
      split: test
      revision: None
    metrics:
    - type: map_at_1
      value: 31.182
    - type: map_at_10
      value: 42.631
    - type: map_at_100
      value: 43.577
    - type: map_at_1000
      value: 43.661
    - type: map_at_3
      value: 40.06
    - type: map_at_5
      value: 41.591
    - type: mrr_at_1
      value: 62.363
    - type: mrr_at_10
      value: 69.047
    - type: mrr_at_100
      value: 69.46
    - type: mrr_at_1000
      value: 69.48100000000001
    - type: mrr_at_3
      value: 67.574
    - type: mrr_at_5
      value: 68.487
    - type: ndcg_at_1
      value: 62.363
    - type: ndcg_at_10
      value: 51.629999999999995
    - type: ndcg_at_100
      value: 55.301
    - type: ndcg_at_1000
      value: 57.071000000000005
    - type: ndcg_at_3
      value: 47.496
    - type: ndcg_at_5
      value: 49.687
    - type: precision_at_1
      value: 62.363
    - type: precision_at_10
      value: 10.628
    - type: precision_at_100
      value: 1.352
    - type: precision_at_1000
      value: 0.159
    - type: precision_at_3
      value: 29.296
    - type: precision_at_5
      value: 19.309
    - type: recall_at_1
      value: 31.182
    - type: recall_at_10
      value: 53.14
    - type: recall_at_100
      value: 67.596
    - type: recall_at_1000
      value: 79.372
    - type: recall_at_3
      value: 43.943
    - type: recall_at_5
      value: 48.271
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/imdb |
|
name: MTEB ImdbClassification |
|
config: default |
|
split: test |
|
revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7 |
|
metrics: |
|
- type: accuracy |
|
value: 71.55319999999999 |
|
- type: ap |
|
value: 65.44170899953346 |
|
- type: f1 |
|
value: 71.33420141354401 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: msmarco |
|
name: MTEB MSMARCO |
|
config: default |
|
split: dev |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 18.89 |
|
- type: map_at_10 |
|
value: 30.076999999999998 |
|
- type: map_at_100 |
|
value: 31.281 |
|
- type: map_at_1000 |
|
value: 31.341 |
|
- type: map_at_3 |
|
value: 26.391 |
|
- type: map_at_5 |
|
value: 28.557 |
|
- type: mrr_at_1 |
|
value: 19.312 |
|
- type: mrr_at_10 |
|
value: 30.566 |
|
- type: mrr_at_100 |
|
value: 31.728 |
|
- type: mrr_at_1000 |
|
value: 31.781 |
|
- type: mrr_at_3 |
|
value: 26.901000000000003 |
|
- type: mrr_at_5 |
|
value: 29.072 |
|
- type: ndcg_at_1 |
|
value: 19.326999999999998 |
|
- type: ndcg_at_10 |
|
value: 36.516999999999996 |
|
- type: ndcg_at_100 |
|
value: 42.458 |
|
- type: ndcg_at_1000 |
|
value: 43.99 |
|
- type: ndcg_at_3 |
|
value: 29.005 |
|
- type: ndcg_at_5 |
|
value: 32.889 |
|
- type: precision_at_1 |
|
value: 19.326999999999998 |
|
- type: precision_at_10 |
|
value: 5.868 |
|
- type: precision_at_100 |
|
value: 0.8880000000000001 |
|
- type: precision_at_1000 |
|
value: 0.10200000000000001 |
|
- type: precision_at_3 |
|
value: 12.388 |
|
- type: precision_at_5 |
|
value: 9.401 |
|
- type: recall_at_1 |
|
value: 18.89 |
|
- type: recall_at_10 |
|
value: 56.442 |
|
- type: recall_at_100 |
|
value: 84.16 |
|
- type: recall_at_1000 |
|
value: 95.97099999999999 |
|
- type: recall_at_3 |
|
value: 36.077999999999996 |
|
- type: recall_at_5 |
|
value: 45.395 |
|
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/mtop_domain |
|
name: MTEB MTOPDomainClassification (en) |
|
config: en |
|
split: test |
|
revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf |
|
metrics: |
|
- type: accuracy |
|
value: 93.69585043319653 |
|
- type: f1 |
|
value: 93.27706251110098 |
|
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/mtop_intent |
|
name: MTEB MTOPIntentClassification (en) |
|
config: en |
|
split: test |
|
revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba |
|
metrics: |
|
- type: accuracy |
|
value: 74.62836297309622 |
|
- type: f1 |
|
value: 56.21163652384411 |
|
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/amazon_massive_intent |
|
name: MTEB MassiveIntentClassification (en) |
|
config: en |
|
split: test |
|
revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 |
|
metrics: |
|
- type: accuracy |
|
value: 71.37861466039006 |
|
- type: f1 |
|
value: 69.85338860172736 |
|
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/amazon_massive_scenario |
|
name: MTEB MassiveScenarioClassification (en) |
|
config: en |
|
split: test |
|
revision: 7d571f92784cd94a019292a1f45445077d0ef634 |
|
metrics: |
|
- type: accuracy |
|
value: 75.58170813718897 |
|
- type: f1 |
|
value: 75.77358464349743 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/medrxiv-clustering-p2p |
|
name: MTEB MedrxivClusteringP2P |
|
config: default |
|
split: test |
|
revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73 |
|
metrics: |
|
- type: v_measure |
|
value: 33.29659845527655 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/medrxiv-clustering-s2s |
|
name: MTEB MedrxivClusteringS2S |
|
config: default |
|
split: test |
|
revision: 35191c8c0dca72d8ff3efcd72aa802307d469663 |
|
metrics: |
|
- type: v_measure |
|
value: 29.97507851301835 |
|
- task: |
|
type: Reranking |
|
dataset: |
|
type: mteb/mind_small |
|
name: MTEB MindSmallReranking |
|
config: default |
|
split: test |
|
revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69 |
|
metrics: |
|
- type: map |
|
value: 31.158968289313076 |
|
- type: mrr |
|
value: 32.27027446726339 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: nfcorpus |
|
name: MTEB NFCorpus |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 5.021 |
|
- type: map_at_10 |
|
value: 11.346 |
|
- type: map_at_100 |
|
value: 14.457 |
|
- type: map_at_1000 |
|
value: 15.875 |
|
- type: map_at_3 |
|
value: 8.376999999999999 |
|
- type: map_at_5 |
|
value: 9.793000000000001 |
|
- type: mrr_at_1 |
|
value: 43.344 |
|
- type: mrr_at_10 |
|
value: 51.266 |
|
- type: mrr_at_100 |
|
value: 51.871 |
|
- type: mrr_at_1000 |
|
value: 51.915 |
|
- type: mrr_at_3 |
|
value: 49.174 |
|
- type: mrr_at_5 |
|
value: 50.475 |
|
- type: ndcg_at_1 |
|
value: 41.331 |
|
- type: ndcg_at_10 |
|
value: 31.257 |
|
- type: ndcg_at_100 |
|
value: 29.264000000000003 |
|
- type: ndcg_at_1000 |
|
value: 38.024 |
|
- type: ndcg_at_3 |
|
value: 36.643 |
|
- type: ndcg_at_5 |
|
value: 34.808 |
|
- type: precision_at_1 |
|
value: 43.034 |
|
- type: precision_at_10 |
|
value: 22.972 |
|
- type: precision_at_100 |
|
value: 7.576 |
|
- type: precision_at_1000 |
|
value: 2.0629999999999997 |
|
- type: precision_at_3 |
|
value: 34.572 |
|
- type: precision_at_5 |
|
value: 30.341 |
|
- type: recall_at_1 |
|
value: 5.021 |
|
- type: recall_at_10 |
|
value: 15.197 |
|
- type: recall_at_100 |
|
value: 30.874000000000002 |
|
- type: recall_at_1000 |
|
value: 61.934 |
|
- type: recall_at_3 |
|
value: 9.467 |
|
- type: recall_at_5 |
|
value: 11.904 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: nq |
|
name: MTEB NQ |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 24.468999999999998 |
|
- type: map_at_10 |
|
value: 38.885999999999996 |
|
- type: map_at_100 |
|
value: 40.154 |
|
- type: map_at_1000 |
|
value: 40.195 |
|
- type: map_at_3 |
|
value: 34.565 |
|
- type: map_at_5 |
|
value: 37.069 |
|
- type: mrr_at_1 |
|
value: 27.578000000000003 |
|
- type: mrr_at_10 |
|
value: 41.079 |
|
- type: mrr_at_100 |
|
value: 42.081 |
|
- type: mrr_at_1000 |
|
value: 42.109 |
|
- type: mrr_at_3 |
|
value: 37.278 |
|
- type: mrr_at_5 |
|
value: 39.585 |
|
- type: ndcg_at_1 |
|
value: 27.549 |
|
- type: ndcg_at_10 |
|
value: 46.506 |
|
- type: ndcg_at_100 |
|
value: 51.92400000000001 |
|
- type: ndcg_at_1000 |
|
value: 52.833 |
|
- type: ndcg_at_3 |
|
value: 38.214999999999996 |
|
- type: ndcg_at_5 |
|
value: 42.498000000000005 |
|
- type: precision_at_1 |
|
value: 27.549 |
|
- type: precision_at_10 |
|
value: 8.019 |
|
- type: precision_at_100 |
|
value: 1.103 |
|
- type: precision_at_1000 |
|
value: 0.11900000000000001 |
|
- type: precision_at_3 |
|
value: 17.806 |
|
- type: precision_at_5 |
|
value: 13.100000000000001 |
|
- type: recall_at_1 |
|
value: 24.468999999999998 |
|
- type: recall_at_10 |
|
value: 67.632 |
|
- type: recall_at_100 |
|
value: 91.169 |
|
- type: recall_at_1000 |
|
value: 97.851 |
|
- type: recall_at_3 |
|
value: 46.043 |
|
- type: recall_at_5 |
|
value: 55.962999999999994 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: quora |
|
name: MTEB QuoraRetrieval |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 70.44 |
|
- type: map_at_10 |
|
value: 84.209 |
|
- type: map_at_100 |
|
value: 84.868 |
|
- type: map_at_1000 |
|
value: 84.884 |
|
- type: map_at_3 |
|
value: 81.192 |
|
- type: map_at_5 |
|
value: 83.06099999999999 |
|
- type: mrr_at_1 |
|
value: 81.12 |
|
- type: mrr_at_10 |
|
value: 87.30499999999999 |
|
- type: mrr_at_100 |
|
value: 87.413 |
|
- type: mrr_at_1000 |
|
value: 87.414 |
|
- type: mrr_at_3 |
|
value: 86.337 |
|
- type: mrr_at_5 |
|
value: 86.985 |
|
- type: ndcg_at_1 |
|
value: 81.15 |
|
- type: ndcg_at_10 |
|
value: 88.032 |
|
- type: ndcg_at_100 |
|
value: 89.292 |
|
- type: ndcg_at_1000 |
|
value: 89.393 |
|
- type: ndcg_at_3 |
|
value: 85.098 |
|
- type: ndcg_at_5 |
|
value: 86.691 |
|
- type: precision_at_1 |
|
value: 81.15 |
|
- type: precision_at_10 |
|
value: 13.395999999999999 |
|
- type: precision_at_100 |
|
value: 1.5310000000000001 |
|
- type: precision_at_1000 |
|
value: 0.157 |
|
- type: precision_at_3 |
|
value: 37.16 |
|
- type: precision_at_5 |
|
value: 24.458 |
|
- type: recall_at_1 |
|
value: 70.44 |
|
- type: recall_at_10 |
|
value: 95.204 |
|
- type: recall_at_100 |
|
value: 99.506 |
|
- type: recall_at_1000 |
|
value: 99.978 |
|
- type: recall_at_3 |
|
value: 86.83999999999999 |
|
- type: recall_at_5 |
|
value: 91.328 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/reddit-clustering |
|
name: MTEB RedditClustering |
|
config: default |
|
split: test |
|
revision: 24640382cdbf8abc73003fb0fa6d111a705499eb |
|
metrics: |
|
- type: v_measure |
|
value: 44.091918771223966 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/reddit-clustering-p2p |
|
name: MTEB RedditClusteringP2P |
|
config: default |
|
split: test |
|
revision: 282350215ef01743dc01b456c7f5241fa8937f16 |
|
metrics: |
|
- type: v_measure |
|
value: 49.3850718319815 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: scidocs |
|
name: MTEB SCIDOCS |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 5.108 |
|
- type: map_at_10 |
|
value: 12.878 |
|
- type: map_at_100 |
|
value: 15.398 |
|
- type: map_at_1000 |
|
value: 15.762 |
|
- type: map_at_3 |
|
value: 9.028 |
|
- type: map_at_5 |
|
value: 10.886 |
|
- type: mrr_at_1 |
|
value: 25.2 |
|
- type: mrr_at_10 |
|
value: 36.051 |
|
- type: mrr_at_100 |
|
value: 37.198 |
|
- type: mrr_at_1000 |
|
value: 37.254 |
|
- type: mrr_at_3 |
|
value: 32.483000000000004 |
|
- type: mrr_at_5 |
|
value: 34.583000000000006 |
|
- type: ndcg_at_1 |
|
value: 25.2 |
|
- type: ndcg_at_10 |
|
value: 21.436 |
|
- type: ndcg_at_100 |
|
value: 30.758000000000003 |
|
- type: ndcg_at_1000 |
|
value: 36.774 |
|
- type: ndcg_at_3 |
|
value: 19.977 |
|
- type: ndcg_at_5 |
|
value: 17.634 |
|
- type: precision_at_1 |
|
value: 25.2 |
|
- type: precision_at_10 |
|
value: 11.16 |
|
- type: precision_at_100 |
|
value: 2.46 |
|
- type: precision_at_1000 |
|
value: 0.38999999999999996 |
|
- type: precision_at_3 |
|
value: 18.4 |
|
- type: precision_at_5 |
|
value: 15.440000000000001 |
|
- type: recall_at_1 |
|
value: 5.108 |
|
- type: recall_at_10 |
|
value: 22.615 |
|
- type: recall_at_100 |
|
value: 49.838 |
|
- type: recall_at_1000 |
|
value: 79.12700000000001 |
|
- type: recall_at_3 |
|
value: 11.203000000000001 |
|
- type: recall_at_5 |
|
value: 15.638 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sickr-sts |
|
name: MTEB SICK-R |
|
config: default |
|
split: test |
|
revision: a6ea5a8cab320b040a23452cc28066d9beae2cee |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 84.87907802108278 |
|
- type: cos_sim_spearman |
|
value: 78.47745630820519 |
|
- type: euclidean_pearson |
|
value: 81.24598854050433 |
|
- type: euclidean_spearman |
|
value: 76.49536405466311 |
|
- type: manhattan_pearson |
|
value: 81.2143517198192 |
|
- type: manhattan_spearman |
|
value: 76.41735187637899 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts12-sts |
|
name: MTEB STS12 |
|
config: default |
|
split: test |
|
revision: a0d554a64d88156834ff5ae9920b964011b16384 |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 84.72222146895906 |
|
- type: cos_sim_spearman |
|
value: 75.78345138703104 |
|
- type: euclidean_pearson |
|
value: 81.35072741369821 |
|
- type: euclidean_spearman |
|
value: 71.44372390021385 |
|
- type: manhattan_pearson |
|
value: 81.42777992212991 |
|
- type: manhattan_spearman |
|
value: 71.50748732911025 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts13-sts |
|
name: MTEB STS13 |
|
config: default |
|
split: test |
|
revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 82.46314178714782 |
|
- type: cos_sim_spearman |
|
value: 83.30487501773337 |
|
- type: euclidean_pearson |
|
value: 81.97496753880277 |
|
- type: euclidean_spearman |
|
value: 83.26569157819903 |
|
- type: manhattan_pearson |
|
value: 81.95087299528338 |
|
- type: manhattan_spearman |
|
value: 83.25657383286989 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts14-sts |
|
name: MTEB STS14 |
|
config: default |
|
split: test |
|
revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375 |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 82.38192118423038 |
|
- type: cos_sim_spearman |
|
value: 78.40410104736917 |
|
- type: euclidean_pearson |
|
value: 79.48941144435967 |
|
- type: euclidean_spearman |
|
value: 76.87243228899331 |
|
- type: manhattan_pearson |
|
value: 79.37383745954276 |
|
- type: manhattan_spearman |
|
value: 76.81624170740595 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts15-sts |
|
name: MTEB STS15 |
|
config: default |
|
split: test |
|
revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3 |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 84.89499997364136 |
|
- type: cos_sim_spearman |
|
value: 86.49722400765071 |
|
- type: euclidean_pearson |
|
value: 80.83327622391033 |
|
- type: euclidean_spearman |
|
value: 81.77906221038033 |
|
- type: manhattan_pearson |
|
value: 80.68927444298423 |
|
- type: manhattan_spearman |
|
value: 81.67585996918764 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts16-sts |
|
name: MTEB STS16 |
|
config: default |
|
split: test |
|
revision: 4d8694f8f0e0100860b497b999b3dbed754a0513 |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 80.85434430333662 |
|
- type: cos_sim_spearman |
|
value: 82.32641704038703 |
|
- type: euclidean_pearson |
|
value: 78.92319495883405 |
|
- type: euclidean_spearman |
|
value: 80.06748121443441 |
|
- type: manhattan_pearson |
|
value: 78.68188267117745 |
|
- type: manhattan_spearman |
|
value: 79.72019793896195 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts17-crosslingual-sts |
|
name: MTEB STS17 (en-en) |
|
config: en-en |
|
split: test |
|
revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 87.0896689258414 |
|
- type: cos_sim_spearman |
|
value: 87.31114069713735 |
|
- type: euclidean_pearson |
|
value: 83.93671908621272 |
|
- type: euclidean_spearman |
|
value: 82.83918654090873 |
|
- type: manhattan_pearson |
|
value: 83.5943550673816 |
|
- type: manhattan_spearman |
|
value: 82.47327946394148 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/sts22-crosslingual-sts |
|
name: MTEB STS22 (en) |
|
config: en |
|
split: test |
|
revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80 |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 66.4799391480602 |
|
- type: cos_sim_spearman |
|
value: 66.59141182659532 |
|
- type: euclidean_pearson |
|
value: 45.85714541149068 |
|
- type: euclidean_spearman |
|
value: 61.605252732946404 |
|
- type: manhattan_pearson |
|
value: 46.69415667711241 |
|
- type: manhattan_spearman |
|
value: 61.38490967409539 |
|
- task: |
|
type: STS |
|
dataset: |
|
type: mteb/stsbenchmark-sts |
|
name: MTEB STSBenchmark |
|
config: default |
|
split: test |
|
revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831 |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 82.22064334651283 |
|
- type: cos_sim_spearman |
|
value: 84.23556405551305 |
|
- type: euclidean_pearson |
|
value: 80.64484589022672 |
|
- type: euclidean_spearman |
|
value: 80.27585966983669 |
|
- type: manhattan_pearson |
|
value: 80.44248540454653 |
|
- type: manhattan_spearman |
|
value: 80.06071452831723 |
|
- task: |
|
type: Reranking |
|
dataset: |
|
type: mteb/scidocs-reranking |
|
name: MTEB SciDocsRR |
|
config: default |
|
split: test |
|
revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab |
|
metrics: |
|
- type: map |
|
value: 86.82632940766443 |
|
- type: mrr |
|
value: 96.27367186190715 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: scifact |
|
name: MTEB SciFact |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 48.443999999999996 |
|
- type: map_at_10 |
|
value: 58.309 |
|
- type: map_at_100 |
|
value: 59.116 |
|
- type: map_at_1000 |
|
value: 59.155 |
|
- type: map_at_3 |
|
value: 55.598000000000006 |
|
- type: map_at_5 |
|
value: 57.550999999999995 |
|
- type: mrr_at_1 |
|
value: 50.666999999999994 |
|
- type: mrr_at_10 |
|
value: 59.099000000000004 |
|
- type: mrr_at_100 |
|
value: 59.843 |
|
- type: mrr_at_1000 |
|
value: 59.879000000000005 |
|
- type: mrr_at_3 |
|
value: 57.167 |
|
- type: mrr_at_5 |
|
value: 58.5 |
|
- type: ndcg_at_1 |
|
value: 50.666999999999994 |
|
- type: ndcg_at_10 |
|
value: 62.483999999999995 |
|
- type: ndcg_at_100 |
|
value: 66.131 |
|
- type: ndcg_at_1000 |
|
value: 67.17 |
|
- type: ndcg_at_3 |
|
value: 58.07299999999999 |
|
- type: ndcg_at_5 |
|
value: 60.87200000000001 |
|
- type: precision_at_1 |
|
value: 50.666999999999994 |
|
- type: precision_at_10 |
|
value: 8.4 |
|
- type: precision_at_100 |
|
value: 1.0330000000000001 |
|
- type: precision_at_1000 |
|
value: 0.11199999999999999 |
|
- type: precision_at_3 |
|
value: 22.889 |
|
- type: precision_at_5 |
|
value: 15.467 |
|
- type: recall_at_1 |
|
value: 48.443999999999996 |
|
- type: recall_at_10 |
|
value: 74.26700000000001 |
|
- type: recall_at_100 |
|
value: 90.5 |
|
- type: recall_at_1000 |
|
value: 98.667 |
|
- type: recall_at_3 |
|
value: 63.039 |
|
- type: recall_at_5 |
|
value: 69.706 |
|
- task: |
|
type: PairClassification |
|
dataset: |
|
type: mteb/sprintduplicatequestions-pairclassification |
|
name: MTEB SprintDuplicateQuestions |
|
config: default |
|
split: test |
|
revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46 |
|
metrics: |
|
- type: cos_sim_accuracy |
|
value: 99.76336633663367 |
|
- type: cos_sim_ap |
|
value: 94.05677361006421 |
|
- type: cos_sim_f1 |
|
value: 87.85894206549118 |
|
- type: cos_sim_precision |
|
value: 88.52791878172589 |
|
- type: cos_sim_recall |
|
value: 87.2 |
|
- type: dot_accuracy |
|
value: 99.06732673267327 |
|
- type: dot_ap |
|
value: 25.234902506145275 |
|
- type: dot_f1 |
|
value: 31.687715269804816 |
|
- type: dot_precision |
|
value: 37.19676549865229 |
|
- type: dot_recall |
|
value: 27.6 |
|
- type: euclidean_accuracy |
|
value: 99.73861386138614 |
|
- type: euclidean_ap |
|
value: 92.39504711224613 |
|
- type: euclidean_f1 |
|
value: 86.40576725025747 |
|
- type: euclidean_precision |
|
value: 89.06581740976645 |
|
- type: euclidean_recall |
|
value: 83.89999999999999 |
|
- type: manhattan_accuracy |
|
value: 99.74455445544554 |
|
- type: manhattan_ap |
|
value: 92.5050066340186 |
|
- type: manhattan_f1 |
|
value: 86.67355371900827 |
|
- type: manhattan_precision |
|
value: 89.63675213675214 |
|
- type: manhattan_recall |
|
value: 83.89999999999999 |
|
- type: max_accuracy |
|
value: 99.76336633663367 |
|
- type: max_ap |
|
value: 94.05677361006421 |
|
- type: max_f1 |
|
value: 87.85894206549118 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/stackexchange-clustering |
|
name: MTEB StackExchangeClustering |
|
config: default |
|
split: test |
|
revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259 |
|
metrics: |
|
- type: v_measure |
|
value: 52.66315650755836 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/stackexchange-clustering-p2p |
|
name: MTEB StackExchangeClusteringP2P |
|
config: default |
|
split: test |
|
revision: 815ca46b2622cec33ccafc3735d572c266efdb44 |
|
metrics: |
|
- type: v_measure |
|
value: 32.36019149648443 |
|
- task: |
|
type: Reranking |
|
dataset: |
|
type: mteb/stackoverflowdupquestions-reranking |
|
name: MTEB StackOverflowDupQuestions |
|
config: default |
|
split: test |
|
revision: e185fbe320c72810689fc5848eb6114e1ef5ec69 |
|
metrics: |
|
- type: map |
|
value: 50.10933600138655 |
|
- type: mrr |
|
value: 50.84273671589848 |
|
- task: |
|
type: Summarization |
|
dataset: |
|
type: mteb/summeval |
|
name: MTEB SummEval |
|
config: default |
|
split: test |
|
revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c |
|
metrics: |
|
- type: cos_sim_pearson |
|
value: 30.342194052503917 |
|
- type: cos_sim_spearman |
|
value: 30.74326118928312 |
|
- type: dot_pearson |
|
value: 12.329727800033176 |
|
- type: dot_spearman |
|
value: 14.54557726626662 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: trec-covid |
|
name: MTEB TRECCOVID |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 0.173 |
|
- type: map_at_10 |
|
value: 1.1320000000000001 |
|
- type: map_at_100 |
|
value: 5.885 |
|
- type: map_at_1000 |
|
value: 14.762 |
|
- type: map_at_3 |
|
value: 0.443 |
|
- type: map_at_5 |
|
value: 0.66 |
|
- type: mrr_at_1 |
|
value: 66.0 |
|
- type: mrr_at_10 |
|
value: 76.34100000000001 |
|
- type: mrr_at_100 |
|
value: 76.37 |
|
- type: mrr_at_1000 |
|
value: 76.376 |
|
- type: mrr_at_3 |
|
value: 74.667 |
|
- type: mrr_at_5 |
|
value: 74.667 |
|
- type: ndcg_at_1 |
|
value: 59.0 |
|
- type: ndcg_at_10 |
|
value: 50.047 |
|
- type: ndcg_at_100 |
|
value: 37.744 |
|
- type: ndcg_at_1000 |
|
value: 35.903 |
|
- type: ndcg_at_3 |
|
value: 55.95 |
|
- type: ndcg_at_5 |
|
value: 53.379 |
|
- type: precision_at_1 |
|
value: 66.0 |
|
- type: precision_at_10 |
|
value: 53.0 |
|
- type: precision_at_100 |
|
value: 38.78 |
|
- type: precision_at_1000 |
|
value: 16.24 |
|
- type: precision_at_3 |
|
value: 60.0 |
|
- type: precision_at_5 |
|
value: 56.39999999999999 |
|
- type: recall_at_1 |
|
value: 0.173 |
|
- type: recall_at_10 |
|
value: 1.379 |
|
- type: recall_at_100 |
|
value: 9.196 |
|
- type: recall_at_1000 |
|
value: 34.488 |
|
- type: recall_at_3 |
|
value: 0.475 |
|
- type: recall_at_5 |
|
value: 0.738 |
|
- task: |
|
type: Retrieval |
|
dataset: |
|
type: webis-touche2020 |
|
name: MTEB Touche2020 |
|
config: default |
|
split: test |
|
revision: None |
|
metrics: |
|
- type: map_at_1 |
|
value: 2.1260000000000003 |
|
- type: map_at_10 |
|
value: 7.216 |
|
- type: map_at_100 |
|
value: 12.732 |
|
- type: map_at_1000 |
|
value: 14.158999999999999 |
|
- type: map_at_3 |
|
value: 3.9530000000000003 |
|
- type: map_at_5 |
|
value: 5.252 |
|
- type: mrr_at_1 |
|
value: 24.490000000000002 |
|
- type: mrr_at_10 |
|
value: 36.949 |
|
- type: mrr_at_100 |
|
value: 38.0 |
|
- type: mrr_at_1000 |
|
value: 38.0 |
|
- type: mrr_at_3 |
|
value: 31.973000000000003 |
|
- type: mrr_at_5 |
|
value: 34.32 |
|
- type: ndcg_at_1 |
|
value: 19.387999999999998 |
|
- type: ndcg_at_10 |
|
value: 17.918 |
|
- type: ndcg_at_100 |
|
value: 30.558999999999997 |
|
- type: ndcg_at_1000 |
|
value: 42.028 |
|
- type: ndcg_at_3 |
|
value: 17.202 |
|
- type: ndcg_at_5 |
|
value: 17.788 |
|
- type: precision_at_1 |
|
value: 24.490000000000002 |
|
- type: precision_at_10 |
|
value: 17.347 |
|
- type: precision_at_100 |
|
value: 6.918 |
|
- type: precision_at_1000 |
|
value: 1.4569999999999999 |
|
- type: precision_at_3 |
|
value: 19.728 |
|
- type: precision_at_5 |
|
value: 19.592000000000002 |
|
- type: recall_at_1 |
|
value: 2.1260000000000003 |
|
- type: recall_at_10 |
|
value: 12.897 |
|
- type: recall_at_100 |
|
value: 42.632999999999996 |
|
- type: recall_at_1000 |
|
value: 77.783 |
|
- type: recall_at_3 |
|
value: 4.836 |
|
- type: recall_at_5 |
|
value: 7.331 |
|
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/toxic_conversations_50k |
|
name: MTEB ToxicConversationsClassification |
|
config: default |
|
split: test |
|
revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c |
|
metrics: |
|
- type: accuracy |
|
value: 70.9516 |
|
- type: ap |
|
value: 14.148097836321893 |
|
- type: f1 |
|
value: 54.52189833022899 |
|
- task: |
|
type: Classification |
|
dataset: |
|
type: mteb/tweet_sentiment_extraction |
|
name: MTEB TweetSentimentExtractionClassification |
|
config: default |
|
split: test |
|
revision: d604517c81ca91fe16a244d1248fc021f9ecee7a |
|
metrics: |
|
- type: accuracy |
|
value: 58.33899264289756 |
|
- type: f1 |
|
value: 58.684516042056565 |
|
- task: |
|
type: Clustering |
|
dataset: |
|
type: mteb/twentynewsgroups-clustering |
|
name: MTEB TwentyNewsgroupsClustering |
|
config: default |
|
split: test |
|
revision: 6125ec4e24fa026cec8a478383ee943acfbd5449 |
|
metrics: |
|
- type: v_measure |
|
value: 41.45569187892743 |
|
- task: |
|
type: PairClassification |
|
dataset: |
|
type: mteb/twittersemeval2015-pairclassification |
|
name: MTEB TwitterSemEval2015 |
|
config: default |
|
split: test |
|
revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1 |
|
metrics: |
|
- type: cos_sim_accuracy |
|
value: 85.05692316862371 |
|
- type: cos_sim_ap |
|
value: 70.54785019750204 |
|
- type: cos_sim_f1 |
|
value: 65.99060103883255 |
|
- type: cos_sim_precision |
|
value: 62.10428305400373 |
|
- type: cos_sim_recall |
|
value: 70.3957783641161 |
|
- type: dot_accuracy |
|
value: 77.82678667222984 |
|
- type: dot_ap |
|
value: 32.73452779849359 |
|
- type: dot_f1 |
|
value: 38.1269911832259 |
|
- type: dot_precision |
|
value: 26.5066446893994 |
|
- type: dot_recall |
|
value: 67.8891820580475 |
|
- type: euclidean_accuracy |
|
value: 84.62180365977231 |
|
- type: euclidean_ap |
|
value: 68.57434108453688 |
|
- type: euclidean_f1 |
|
value: 65.23069391751316 |
|
- type: euclidean_precision |
|
value: 60.83086053412463 |
|
- type: euclidean_recall |
|
value: 70.31662269129288 |
|
- type: manhattan_accuracy |
|
value: 84.57411933003517 |
|
- type: manhattan_ap |
|
value: 68.3530821550187 |
|
- type: manhattan_f1 |
|
value: 64.74820143884892 |
|
- type: manhattan_precision |
|
value: 61.09550561797753 |
|
- type: manhattan_recall |
|
value: 68.86543535620054 |
|
- type: max_accuracy |
|
value: 85.05692316862371 |
|
- type: max_ap |
|
value: 70.54785019750204 |
|
- type: max_f1 |
|
value: 65.99060103883255 |
|
- task: |
|
type: PairClassification |
|
dataset: |
|
type: mteb/twitterurlcorpus-pairclassification |
|
name: MTEB TwitterURLCorpus |
|
config: default |
|
split: test |
|
revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf |
|
metrics: |
|
- type: cos_sim_accuracy |
|
value: 88.77440136608841 |
|
- type: cos_sim_ap |
|
value: 85.6224854550336 |
|
- type: cos_sim_f1 |
|
value: 77.76333865518139 |
|
- type: cos_sim_precision |
|
value: 75.09501613481535 |
|
- type: cos_sim_recall |
|
value: 80.6282722513089 |
|
- type: dot_accuracy |
|
value: 79.73570846431483 |
|
- type: dot_ap |
|
value: 59.509855217305315 |
|
- type: dot_f1 |
|
value: 57.20318336852364 |
|
- type: dot_precision |
|
value: 49.474630555711634 |
|
- type: dot_recall |
|
value: 67.79334770557438 |
|
- type: euclidean_accuracy |
|
value: 87.06096945705748 |
|
- type: euclidean_ap |
|
value: 81.65241378370953 |
|
- type: euclidean_f1 |
|
value: 73.29885784441386 |
|
- type: euclidean_precision |
|
value: 70.91642070405298 |
|
- type: euclidean_recall |
|
value: 75.8469356328919 |
|
- type: manhattan_accuracy |
|
value: 86.973648465091 |
|
- type: manhattan_ap |
|
value: 81.57560533116907 |
|
- type: manhattan_f1 |
|
value: 73.2408287397833 |
|
- type: manhattan_precision |
|
value: 72.33611173687767 |
|
- type: manhattan_recall |
|
value: 74.16846319679703 |
|
- type: max_accuracy |
|
value: 88.77440136608841 |
|
- type: max_ap |
|
value: 85.6224854550336 |
|
- type: max_f1 |
|
value: 77.76333865518139 |
|
--- |
|
<h1 align="center">GIST Embedding v0 - all-MiniLM-L6-v2</h1>

*GIST Embedding: Guided In-sample Selection of Training Negatives for Text Embedding*

The model is fine-tuned on top of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) using the [MEDI dataset](https://github.com/xlang-ai/instructor-embedding.git) augmented with mined triplets from the [MTEB Classification](https://huggingface.co/mteb) training dataset (excluding data from the Amazon Polarity Classification task).

The model does not require any instruction for generating embeddings, so queries for retrieval tasks can be encoded directly without crafting instructions.

Technical details of the model will be published shortly.

# Data

The dataset used is a compilation of the MEDI dataset and the MTEB Classification training dataset. Third-party datasets may be subject to additional terms and conditions under their associated licenses. A HuggingFace Dataset version of the compiled dataset and the specific revision used to train the model are available:

- Dataset: [avsolatorio/medi-data-mteb_avs_triplets](https://huggingface.co/datasets/avsolatorio/medi-data-mteb_avs_triplets)
- Revision: 238a0499b6e6b690cc64ea56fde8461daa8341bb

The dataset contains a `task_type` key, which can be used to select only the MTEB classification tasks (prefixed with `mteb_`).

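As a minimal sketch of the selection described above, the snippet below filters records on the `task_type` key. The helper name `is_mteb_task` and the toy records (including their `task_type` values beyond the `mteb_` prefix) are hypothetical. The commented-out loading code assumes the `datasets` library and a `train` split, neither of which is confirmed here.

```python
def is_mteb_task(record: dict) -> bool:
    """Return True when the record's task_type marks an MTEB classification task."""
    return str(record.get("task_type", "")).startswith("mteb_")


# Hypothetical records standing in for rows of the compiled dataset.
records = [
    {"task_type": "mteb_example_task", "query": "q1"},
    {"task_type": "other_example_task", "query": "q2"},
]

# Keep only the rows whose task_type carries the mteb_ prefix.
mteb_only = [r for r in records if is_mteb_task(r)]
print(len(mteb_only))

# With the `datasets` library, the same predicate could be passed to `Dataset.filter`
# (assumption: the data loads with a "train" split):
# from datasets import load_dataset
# ds = load_dataset(
#     "avsolatorio/medi-data-mteb_avs_triplets",
#     revision="238a0499b6e6b690cc64ea56fde8461daa8341bb",
#     split="train",
# )
# ds = ds.filter(is_mteb_task)
```

Pinning `revision` when loading keeps the selection tied to the data snapshot listed above.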

The **MEDI Dataset** is published in the following paper: [One Embedder, Any Task: Instruction-Finetuned Text Embeddings](https://arxiv.org/abs/2212.09741).

The MTEB Benchmark results of the GIST embedding model, compared with the base model, suggest that the fine-tuning dataset has perturbed the model considerably, resulting in significant improvements on certain tasks while degrading performance on others.

The retrieval performance for the TRECCOVID task is of note. The fine-tuning dataset does not contain significant knowledge about COVID, which could have caused the observed performance degradation. Further work is currently being undertaken to validate this hypothesis.

# Usage

The model can be easily loaded using the Sentence Transformers library.

```Python
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

revision = None  # Replace with the specific revision to ensure reproducibility in case the model is updated.

model = SentenceTransformer("avsolatorio/GIST-all-MiniLM-L6-v2", revision=revision)

texts = [
    "Illustration of the REaLTabFormer model. The left block shows the non-relational tabular data model using GPT-2 with a causal LM head. In contrast, the right block shows how a relational dataset's child table is modeled using a sequence-to-sequence (Seq2Seq) model. The Seq2Seq model uses the observations in the parent table to condition the generation of the observations in the child table. The trained GPT-2 model on the parent table, with weights frozen, is also used as the encoder in the Seq2Seq model.",
    "Predicting human mobility holds significant practical value, with applications ranging from enhancing disaster risk planning to simulating epidemic spread. In this paper, we present the GeoFormer, a decoder-only transformer model adapted from the GPT architecture to forecast human mobility.",
    "As the economies of Southeast Asia continue adopting digital technologies, policy makers increasingly ask how to prepare the workforce for emerging labor demands. However, little is known about the skills that workers need to adapt to these changes"
]

# Compute embeddings
embeddings = model.encode(texts, convert_to_tensor=True)

# Compute cosine-similarity for each pair of sentences
scores = F.cosine_similarity(embeddings.unsqueeze(1), embeddings.unsqueeze(0), dim=-1)

print(scores.cpu().numpy())
```

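For non-zero vectors, the pairwise score matrix computed above is equivalent to L2-normalising each embedding and taking a matrix product. The sketch below shows this with NumPy on small hypothetical vectors rather than real model outputs. `pairwise_cosine` is an illustrative helper, not part of Sentence Transformers.

```python
import numpy as np


def pairwise_cosine(embeddings: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of rows."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-8, None)  # guard against zero-length rows
    return unit @ unit.T


# Hypothetical 2-dimensional "embeddings" for three texts.
toy = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
scores = pairwise_cosine(toy)
print(np.round(scores, 3))

# Rank the other texts by similarity to the first one (index 0), most similar first.
ranking = np.argsort(-scores[0])
ranking = ranking[ranking != 0]
print(ranking)
```

The same ranking step applies to retrieval: encode the query and the candidate passages, then sort the candidates by their cosine score against the query.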

# Training Parameters

Below are the training parameters used to fine-tune the model:

```
Epochs = 40
Warmup ratio = 0.1
Learning rate = 5e-6
Batch size = 16
Checkpoint step = 102000
Contrastive loss temperature = 0.01
```

Specific training details and strategies will be published shortly.

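Until those details are available, the role of the temperature can be illustrated only generically. The sketch below uses a standard softmax contrastive (InfoNCE-style) loss over cosine similarities. It is an assumption for illustration, not the GIST training objective, and the function name `info_nce` and the toy vectors are hypothetical.

```python
import numpy as np


def info_nce(query: np.ndarray, candidates: np.ndarray, positive: int, temperature: float) -> float:
    """Softmax cross-entropy over temperature-scaled cosine similarities."""
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = (c @ q) / temperature
    logits = logits - logits.max()  # stabilise the softmax
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(-log_probs[positive])


# Hypothetical query and candidates; row 0 of `candidates` is the positive.
query = np.array([1.0, 0.0])
candidates = np.array([[0.9, 0.1], [0.8, 0.6], [0.0, 1.0]])

loss_sharp = info_nce(query, candidates, positive=0, temperature=0.01)
loss_soft = info_nce(query, candidates, positive=0, temperature=1.0)
print(loss_sharp, loss_soft)
```

In this formulation, a low temperature such as 0.01 sharpens the softmax, so small differences in similarity between the positive and the closest negatives dominate the loss.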

# Evaluation

The model was evaluated using the [MTEB Evaluation](https://huggingface.co/mteb) suite.

# Acknowledgements

This work is supported by the "KCP IV - Exploring Data Use in the Development Economics Literature using Large Language Models (AI and LLMs)" project funded by the [Knowledge for Change Program (KCP)](https://www.worldbank.org/en/programs/knowledge-for-change) of the World Bank - RA-P503405-RESE-TF0C3444.

The findings, interpretations, and conclusions expressed in this material are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.