Spaces: clibrain/Spanish-Embeddings-Leaderboard


Santi Diana committed on Sep 25, 2023

Commit 3aae85b • 1 Parent(s): b8e8c93

Uploaded state-of-the-art proprietary models


Files changed (9)

.DS_Store +0 -0
add_new_model/.DS_Store +0 -0
add_new_model/README.md +3 -1
add_new_model/add_new_model.py +3 -4
add_new_model/mteb_metadata.yaml +114 -0
app.py +44 -3
data/classification.csv +17 -8
data/general.csv +9 -0
data/sts.csv +17 -8

.DS_Store CHANGED

Binary files a/.DS_Store and b/.DS_Store differ

add_new_model/.DS_Store ADDED

Binary file (6.15 kB).

add_new_model/README.md CHANGED

@@ -7,4 +7,6 @@ when evaluating `sentence-transformers/sentence-t5-large`.
 3. Once evaluated, move that folder to this folder, so it will be inside `add_new_model` folder.
 4. Execute the file `MTEB_metadata_to_yaml.py`. That will create a file named `mteb_metadata.yaml` that contains the metadata regarding your evaluation.
 5. Execute the file `add_new_model.py`. That file will add your model to the Leaderboard.
-6. Add, commit and `git push` the changes without uploading the results and the `mteb_metadata.yaml`.

 3. Once evaluated, move that folder to this folder, so it will be inside `add_new_model` folder.
 4. Execute the file `MTEB_metadata_to_yaml.py`. That will create a file named `mteb_metadata.yaml` that contains the metadata regarding your evaluation.
 5. Execute the file `add_new_model.py`. That file will add your model to the Leaderboard.
+6. Add, commit and `git push` the changes without uploading the results and the `mteb_metadata.yaml`.
+7. It is recommended to launch the app by running `python3 app.py` from the parent folder and confirm that the leaderboard loads without errors
+and that the new model appears as intended.
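
Because `add_new_model.py` appends a row with `pd.concat` every time it runs, executing it twice for the same model leaves a duplicate row in each CSV. As a complement to the manual check in step 7, a minimal sketch of a pre-push sanity check could look like the following. The helper name `check_leaderboard_csv` is hypothetical and is not part of this repository; it uses only the standard library and works on CSV text rather than on the repository's files.

```python
import csv
import io


def check_leaderboard_csv(text):
    """Return a list of problems found in one leaderboard CSV (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        name = row[0] if row else ""
        if name in seen:
            problems.append(f"line {line_no}: duplicate model {name!r}")
        seen.add(name)
    return problems


# The same model added twice, as would happen if add_new_model.py ran twice.
sample = (
    "Model name,Average,MTEB STS17 (es-es),MTEB STS22 (es)\n"
    "multilingual-e5-large,77.82,87.42,68.23\n"
    "multilingual-e5-large,77.82,87.42,68.23\n"
)
print(check_leaderboard_csv(sample))
```

An empty list means the text passed both checks; anything else is worth fixing before `git push`.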

add_new_model/add_new_model.py CHANGED

@@ -54,13 +54,12 @@ def add_model(metadata_archive):
     ## CLASSIFICATION
     classification_dataframe = pd.read_csv('../data/classification.csv')
     classification_df = df[df['Category']== 'Classification']
-    new_row_data = {'Model name': model_name}
     for index, row in classification_df.iterrows():
         column_name = row['dataset_name']
         accuracy_value = row['Accuracy']
         new_row_data[column_name] = round(accuracy_value,2)
     new_row_df = pd.DataFrame(new_row_data,index=[0])
     classification_dataframe = pd.concat([classification_dataframe,new_row_df],ignore_index=True)
     classification_dataframe.to_csv("../data/classification.csv",index=False)
@@ -68,7 +67,7 @@ def add_model(metadata_archive):
     ## STS
     sts_dataframe = pd.read_csv('../data/sts.csv')
     sts_df = df[df['Category']=='STS']
-    new_row_data = {'Model name': model_name}
     for index, row in sts_df.iterrows():
         column_name = row['dataset_name']

     ## CLASSIFICATION
     classification_dataframe = pd.read_csv('../data/classification.csv')
     classification_df = df[df['Category']== 'Classification']
+    new_row_data = {'Model name': model_name, 'Average': classification_average}
     for index, row in classification_df.iterrows():
         column_name = row['dataset_name']
         accuracy_value = row['Accuracy']
         new_row_data[column_name] = round(accuracy_value,2)
     new_row_df = pd.DataFrame(new_row_data,index=[0])
     classification_dataframe = pd.concat([classification_dataframe,new_row_df],ignore_index=True)
     classification_dataframe.to_csv("../data/classification.csv",index=False)
     ## STS
     sts_dataframe = pd.read_csv('../data/sts.csv')
     sts_df = df[df['Category']=='STS']
+    new_row_data = {'Model name': model_name, 'Average': sts_spearman_average}
     for index, row in sts_df.iterrows():
         column_name = row['dataset_name']
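
The hunks above use `classification_average` and `sts_spearman_average`, which are defined outside the lines shown. A minimal sketch of how such a row could be assembled, assuming the average is the mean of the unrounded per-dataset scores rounded to two decimals, is given below. The function `build_row` is hypothetical, and it uses a plain dict instead of the pandas objects in `add_model`. The accuracy values are copied from the `mteb_metadata.yaml` added in this commit.

```python
def build_row(model_name, scores):
    """Build one leaderboard row: name, average, then one column per dataset."""
    # Assumption: average the unrounded scores first, then round once.
    average = round(sum(scores.values()) / len(scores), 2)
    row = {"Model name": model_name, "Average": average}
    for dataset_name, value in scores.items():
        row[dataset_name] = round(value, 2)
    return row


# Accuracy values from mteb_metadata.yaml in this commit.
accuracies = {
    "MTEB AmazonReviewsClassification (es)": 43.709999999999994,
    "MTEB MTOPDomainClassification (es)": 88.83589059372916,
    "MTEB MTOPIntentClassification (es)": 60.20346897931954,
    "MTEB MassiveIntentClassification (es)": 62.74041694687289,
    "MTEB MassiveScenarioClassification (es)": 67.40080699394755,
}
row = build_row("multilingual-e5-large-stsb-tuned-b64-e10", accuracies)
print(row["Average"])  # 64.58, matching the row added to data/classification.csv
```

Under this assumption the sketch reproduces the `multilingual-e5-large-stsb-tuned-b64-e10` row in `data/classification.csv`, which suggests, but does not prove, that the script averages before rounding.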

add_new_model/mteb_metadata.yaml ADDED

	@@ -0,0 +1,114 @@

+---
+tags:
+- mteb
+model-index:
+- name: multilingual-e5-large-stsb-tuned-b64-e10
+  results:
+  - task:
+      type: Classification
+    dataset:
+      type: mteb/amazon_reviews_multi
+      name: MTEB AmazonReviewsClassification (es)
+      config: es
+      split: test
+      revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+    metrics:
+    - type: accuracy
+      value: 43.709999999999994
+    - type: f1
+      value: 41.47169623212768
+  - task:
+      type: Classification
+    dataset:
+      type: mteb/mtop_domain
+      name: MTEB MTOPDomainClassification (es)
+      config: es
+      split: test
+      revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+    metrics:
+    - type: accuracy
+      value: 88.83589059372916
+    - type: f1
+      value: 88.28914595398294
+  - task:
+      type: Classification
+    dataset:
+      type: mteb/mtop_intent
+      name: MTEB MTOPIntentClassification (es)
+      config: es
+      split: test
+      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+    metrics:
+    - type: accuracy
+      value: 60.20346897931954
+    - type: f1
+      value: 41.64439175677159
+  - task:
+      type: Classification
+    dataset:
+      type: mteb/amazon_massive_intent
+      name: MTEB MassiveIntentClassification (es)
+      config: es
+      split: test
+      revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+    metrics:
+    - type: accuracy
+      value: 62.74041694687289
+    - type: f1
+      value: 61.77713703269475
+  - task:
+      type: Classification
+    dataset:
+      type: mteb/amazon_massive_scenario
+      name: MTEB MassiveScenarioClassification (es)
+      config: es
+      split: test
+      revision: 7d571f92784cd94a019292a1f45445077d0ef634
+    metrics:
+    - type: accuracy
+      value: 67.40080699394755
+    - type: f1
+      value: 67.14214912345791
+  - task:
+      type: STS
+    dataset:
+      type: mteb/sts17-crosslingual-sts
+      name: MTEB STS17 (es-es)
+      config: es-es
+      split: test
+      revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
+    metrics:
+    - type: cos_sim_pearson
+      value: 88.26778066226262
+    - type: cos_sim_spearman
+      value: 88.03435803600337
+    - type: euclidean_pearson
+      value: 88.31560142002508
+    - type: euclidean_spearman
+      value: 88.03594258414384
+    - type: manhattan_pearson
+      value: 88.3997621988469
+    - type: manhattan_spearman
+      value: 88.17114024743894
+  - task:
+      type: STS
+    dataset:
+      type: mteb/sts22-crosslingual-sts
+      name: MTEB STS22 (es)
+      config: es
+      split: test
+      revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
+    metrics:
+    - type: cos_sim_pearson
+      value: 66.49699923699941
+    - type: cos_sim_spearman
+      value: 70.12135103690638
+    - type: euclidean_pearson
+      value: 67.63308096173844
+    - type: euclidean_spearman
+      value: 70.12135103690638
+    - type: manhattan_pearson
+      value: 67.49091236728717
+    - type: manhattan_spearman
+      value: 70.08015881466724
+---
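
Each STS entry above carries six correlation metrics, while `data/sts.csv` stores a single score per dataset. The values in the CSV match `cos_sim_spearman`, so the sketch below selects that metric and averages it. This is an illustration under that assumption: `pick_metric` is a hypothetical helper, and the nested structure is a hand-copied, abbreviated mirror of the YAML (two metrics per dataset) rather than the output of a YAML parser.

```python
from statistics import mean

# Abbreviated copy of the two STS entries in the metadata above.
sts_results = [
    {
        "name": "MTEB STS17 (es-es)",
        "metrics": [
            {"type": "cos_sim_pearson", "value": 88.26778066226262},
            {"type": "cos_sim_spearman", "value": 88.03435803600337},
        ],
    },
    {
        "name": "MTEB STS22 (es)",
        "metrics": [
            {"type": "cos_sim_pearson", "value": 66.49699923699941},
            {"type": "cos_sim_spearman", "value": 70.12135103690638},
        ],
    },
]


def pick_metric(result, metric_type):
    """Return the value of one metric type from a result entry."""
    for metric in result["metrics"]:
        if metric["type"] == metric_type:
            return metric["value"]
    raise KeyError(metric_type)


spearman = {r["name"]: pick_metric(r, "cos_sim_spearman") for r in sts_results}
per_dataset = {name: round(value, 2) for name, value in spearman.items()}
sts_average = round(mean(spearman.values()), 2)
print(per_dataset, sts_average)
```

The rounded values, 88.03, 70.12 and an average of 79.08, agree with the `multilingual-e5-large-stsb-tuned-b64-e10` row added to `data/sts.csv`.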

app.py CHANGED

@@ -1,10 +1,11 @@
 import gradio as gr
 import pandas as pd
-block = gr.Blocks()
 NUM_DATASETS = 7
 NUM_SCORES = 0
-NUM_MODELS = 5
 def general_dataframe_update():
     """
@@ -19,6 +20,7 @@ def classification_dataframe_update():
     """
     dataframe = pd.read_csv('data/classification.csv')
     return dataframe
 def sts_dataframe_udpate():
     """
     Returns sts dataframe for sts table.
@@ -26,6 +28,13 @@ def sts_dataframe_udpate():
     dataframe = pd.read_csv('data/sts.csv')
     return dataframe
 with block:
     gr.Markdown(f"""**Leaderboard de modelos de Embeddings en español
     Massive Text Embedding Benchmark (MTEB) Leaderboard.**
@@ -40,7 +49,7 @@ with block:
                     gr.Markdown("""
                     **Tabla General de Embeddings**
-                    - **Metricas:** Varias, con sus respectivas medias.
                     - **Idioma:** Español
                     """)
             with gr.Row():
@@ -51,6 +60,13 @@ with block:
                         wrap=True,
                     )
         with gr.TabItem("Classification"):
             with gr.Row():
                 # Create and display a sample DataFrame
                 classification = classification_dataframe_update()
@@ -60,6 +76,13 @@ with block:
                         wrap=True,
                     )
         with gr.TabItem("STS"):
             with gr.Row():
                 # Create and display a sample DataFrame
                 sts = sts_dataframe_udpate()
@@ -68,6 +91,24 @@ with block:
                         type="pandas",
                         wrap=True,
                     )
 block.launch()

 import gradio as gr
 import pandas as pd
+dataframe = pd.read_csv('data/general.csv')
 NUM_DATASETS = 7
 NUM_SCORES = 0
+NUM_MODELS = len(dataframe)
 def general_dataframe_update():
     """
     """
     dataframe = pd.read_csv('data/classification.csv')
     return dataframe
 def sts_dataframe_udpate():
     """
     Returns sts dataframe for sts table.
     dataframe = pd.read_csv('data/sts.csv')
     return dataframe
+def clustering_dataframe_update():
+    pass
+def retrieval_dataframe_update():
+    pass
+block = gr.Blocks()
 with block:
     gr.Markdown(f"""**Leaderboard de modelos de Embeddings en español
     Massive Text Embedding Benchmark (MTEB) Leaderboard.**
                     gr.Markdown("""
                     **Tabla General de Embeddings**
+                    - **Métricas:** Varias, con sus respectivas medias.
                     - **Idioma:** Español
                     """)
             with gr.Row():
                         wrap=True,
                     )
         with gr.TabItem("Classification"):
+            with gr.Row():
+                    gr.Markdown("""
+                    **Tabla Classification de Embeddings**
+                    - **Métricas:** Accuracy.
+                    - **Idioma:** Español
+                    """)
             with gr.Row():
                 # Create and display a sample DataFrame
                 classification = classification_dataframe_update()
                         wrap=True,
                     )
         with gr.TabItem("STS"):
+            with gr.Row():
+                    gr.Markdown("""
+                    **Tabla STS de Embeddings**
+                    - **Métricas:** Spearman correlation based on cosine similarity.
+                    - **Idioma:** Español
+                    """)
             with gr.Row():
                 # Create and display a sample DataFrame
                 sts = sts_dataframe_udpate()
                         type="pandas",
                         wrap=True,
                     )
+        with gr.TabItem("Clustering"):
+            with gr.Row():
+                # Create and display a sample DataFrame
+                sts = clustering_dataframe_update()
+                data_overall = gr.components.Dataframe(
+                        sts,
+                        type="pandas",
+                        wrap=True,
+                    )
+        with gr.TabItem("Retrieval"):
+            with gr.Row():
+                # Create and display a sample DataFrame
+                sts = retrieval_dataframe_update()
+                data_overall = gr.components.Dataframe(
+                        sts,
+                        type="pandas",
+                        wrap=True,
+                    )
 block.launch()

data/classification.csv CHANGED

@@ -1,8 +1,17 @@
-Model name,MTEB AmazonReviewsClassification (es),MTEB MTOPDomainClassification (es),MTEB MTOPIntentClassification (es),MTEB MassiveIntentClassification (es),MTEB MassiveScenarioClassification (es)
-multilingual-e5-large,42.66,89.95,66.84,64.68,68.85
-bge-small-en-v1.5,32.03,76.93,52.15,48.77,54.42
-multilingual-e5-base,42.47,89.62,60.27,60.51,66.52
-multilingual-e5-small,41.3,87.33,55.87,58.06,63.1
-paraphrase-multilingual-mpnet-base-v2,39.99,86.96,66.59,64.43,70.42
-sentence-t5-large,42.89,80.78,52.07,54.1,59.56
-sentence-t5-xl,45.01,85.32,57.38,57.97,62.52

+Model name,Average,MTEB AmazonReviewsClassification (es),MTEB MTOPDomainClassification (es),MTEB MTOPIntentClassification (es),MTEB MassiveIntentClassification (es),MTEB MassiveScenarioClassification (es)
+multilingual-e5-large,66.59,42.66,89.95,66.84,64.68,68.85
+bge-small-en-v1.5,52.86,32.03,76.93,52.15,48.77,54.42
+multilingual-e5-base,63.87,42.47,89.62,60.27,60.51,66.52
+multilingual-e5-small,61.13,41.3,87.33,55.87,58.06,63.1
+paraphrase-multilingual-mpnet-base-v2,65.67,39.99,86.96,66.59,64.43,70.42
+sentence-t5-large,57.87,42.89,80.78,52.07,54.1,59.56
+sentence-t5-xl,61.64,45.01,85.32,57.38,57.97,62.52
+paraphrase-spanish-distilroberta,63.98,38.24,86.81,65.94,60.52,68.39
+sentence_similarity_spanish_es,61.77,35.08,85.86,65.21,58.51,64.21
+paraphrase-multilingual-mpnet-base-v2-ft-stsb_multi_mt-embeddings,64.0,37.25,86.93,66.28,62.6,66.96
+mstsb-paraphrase-multilingual-mpnet-base-v2,64.47,38.29,86.04,67.06,63.47,67.53
+multilingual-e5-base-b16-e10,65.09,43.4,89.02,61.7,63.06,68.25
+multilingual-e5-large-stsb-tuned-b32-e10,66.19,43.31,89.3,64.04,64.62,69.69
+multilingual-e5-large-stsb-tuned-b16-e10,67.1,43.72,90.29,65.51,65.13,70.84
+multilingual-e5-large-stsb-tuned,66.23,43.62,89.33,62.93,65.11,70.16
+multilingual-e5-large-stsb-tuned-b64-e10,64.58,43.71,88.84,60.2,62.74,67.4

data/general.csv CHANGED

@@ -6,3 +6,12 @@ multilingual-e5-small,,,68.64,61.13,,76.15,
 paraphrase-multilingual-mpnet-base-v2,,,69.1,65.68,,72.53,
 sentence-t5-large,,,64.04,57.88,,70.21,
 sentence-t5-xl,,,66.22,61.64,,70.79,

 paraphrase-multilingual-mpnet-base-v2,,,69.1,65.68,,72.53,
 sentence-t5-large,,,64.04,57.88,,70.21,
 sentence-t5-xl,,,66.22,61.64,,70.79,
+paraphrase-spanish-distilroberta,,,69.34,63.98,,74.7,
+sentence_similarity_spanish_es,,,68.5,61.77,,75.22,
+paraphrase-multilingual-mpnet-base-v2-ft-stsb_multi_mt-embeddings,,,68.62,64.0,,73.25,
+mstsb-paraphrase-multilingual-mpnet-base-v2,,,69.39,64.48,,74.29,
+multilingual-e5-base-b16-e10,,,71.97,65.09,,78.86,
+multilingual-e5-large-stsb-tuned-b32-e10,,,72.73,66.19,,79.27,
+multilingual-e5-large-stsb-tuned-b16-e10,,,73.07,67.1,,79.05,
+multilingual-e5-large-stsb-tuned,,,72.84,66.23,,79.46,
+multilingual-e5-large-stsb-tuned-b64-e10,,,71.83,64.58,,79.08,
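
The header of `data/general.csv` lies outside this hunk, so the meaning of each column is not visible here. Comparing the numbers with the other two files suggests that field 4 holds the classification average, field 6 the STS average, and field 3 their mean. The sketch below checks that relationship under those assumed positions. The helper `overall_mismatches` and the index constants are hypothetical, and the tolerance allows for the one-hundredth differences that rounding can introduce.

```python
import csv
import io

# Three rows copied from the hunk above (header not shown in the diff).
GENERAL_ROWS = """\
multilingual-e5-large-stsb-tuned-b16-e10,,,73.07,67.1,,79.05,
multilingual-e5-large-stsb-tuned,,,72.84,66.23,,79.46,
multilingual-e5-large-stsb-tuned-b64-e10,,,71.83,64.58,,79.08,
"""

# Assumed zero-based column positions, inferred from the values only.
OVERALL, CLASSIFICATION, STS = 3, 4, 6


def overall_mismatches(text, tolerance=0.011):
    """Return model names whose overall score is not the mean of the two averages."""
    bad = []
    for row in csv.reader(io.StringIO(text)):
        expected = (float(row[CLASSIFICATION]) + float(row[STS])) / 2
        if abs(expected - float(row[OVERALL])) > tolerance:
            bad.append(row[0])
    return bad


print(overall_mismatches(GENERAL_ROWS))  # an empty list means all rows agree
```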

data/sts.csv CHANGED

@@ -1,8 +1,17 @@
-Model name,MTEB STS17 (es-es),MTEB STS22 (es)
-multilingual-e5-large,87.42,68.23
-bge-small-en-v1.5,77.73,55.47
-multilingual-e5-base,87.26,67.79
-multilingual-e5-small,85.27,67.04
-paraphrase-multilingual-mpnet-base-v2,85.14,59.91
-sentence-t5-large,82.74,57.68
-sentence-t5-xl,83.42,58.16

+Model name,Average,MTEB STS17 (es-es),MTEB STS22 (es)
+multilingual-e5-large,77.82,87.42,68.23
+bge-small-en-v1.5,66.6,77.73,55.47
+multilingual-e5-base,77.52,87.26,67.79
+multilingual-e5-small,76.15,85.27,67.04
+paraphrase-multilingual-mpnet-base-v2,72.52,85.14,59.91
+sentence-t5-large,70.21,82.74,57.68
+sentence-t5-xl,70.78,83.42,58.16
+paraphrase-spanish-distilroberta,74.7,85.79,63.61
+sentence_similarity_spanish_es,75.22,85.37,65.07
+paraphrase-multilingual-mpnet-base-v2-ft-stsb_multi_mt-embeddings,73.24,86.89,59.6
+mstsb-paraphrase-multilingual-mpnet-base-v2,74.28,88.22,60.36
+multilingual-e5-base-b16-e10,78.86,87.51,70.21
+multilingual-e5-large-stsb-tuned-b32-e10,79.27,88.1,70.44
+multilingual-e5-large-stsb-tuned-b16-e10,79.05,88.53,69.58
+multilingual-e5-large-stsb-tuned,79.46,88.44,70.48
+multilingual-e5-large-stsb-tuned-b64-e10,79.08,88.03,70.12
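
Rows in these CSVs appear in the order the models were added, not in rank order. With the new `Average` column in place, ranking becomes a single sort. The following sketch, using the standard `csv` module on a subset of the rows above, shows one way to do it. `rank_by_average` is a hypothetical helper and is not taken from `app.py`.

```python
import csv
import io

# Header plus a subset of the rows added to data/sts.csv in this commit.
STS_CSV = """\
Model name,Average,MTEB STS17 (es-es),MTEB STS22 (es)
multilingual-e5-large,77.82,87.42,68.23
multilingual-e5-base-b16-e10,78.86,87.51,70.21
multilingual-e5-large-stsb-tuned-b32-e10,79.27,88.1,70.44
multilingual-e5-large-stsb-tuned-b16-e10,79.05,88.53,69.58
multilingual-e5-large-stsb-tuned,79.46,88.44,70.48
multilingual-e5-large-stsb-tuned-b64-e10,79.08,88.03,70.12
"""


def rank_by_average(text):
    """Return the rows as dicts, sorted by the Average column, best first."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return sorted(rows, key=lambda row: float(row["Average"]), reverse=True)


ranked = rank_by_average(STS_CSV)
top_three = [row["Model name"] for row in ranked[:3]]
print(top_three)
```

For this subset the first entry is `multilingual-e5-large-stsb-tuned`, whose STS average of 79.46 is also the highest in the full file.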