
Adding w601sxs/b1ade-embed to the leaderboard

#114

by w601sxs - opened 17 days ago

Discussion

w601sxs

17 days ago

Could you please refresh and add https://huggingface.co/w601sxs/b1ade-embed to the leaderboard?

w601sxs

17 days ago

@tomaarsen, are you able to help with a refresh for this and some other models further down the discussion board? Thanks.

tomaarsen

Massive Text Embedding Benchmark org 17 days ago

Restarting it now! I do notice that you're missing results for MindSmallReranking, which means that your entry is not fully complete. As a result, it can't compute an Average for your model.
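As a rough illustration of why one missing task blocks the Average, here is a minimal sketch. It assumes the average is a plain mean over a fixed list of required tasks; the function name, task names, and scores are placeholders for illustration, not the leaderboard's actual code or data.

```python
from statistics import mean
from typing import Mapping, Optional, Sequence


def leaderboard_average(
    scores: Mapping[str, float], required_tasks: Sequence[str]
) -> Optional[float]:
    """Return the mean score over required_tasks, or None if any task is missing."""
    missing = [task for task in required_tasks if task not in scores]
    if missing:
        # An incomplete entry gets no average rather than a misleading partial one.
        return None
    return mean(scores[task] for task in required_tasks)


# Placeholder task names and scores, for illustration only.
required = ["TaskA", "TaskB", "MindSmallReranking"]
partial = {"TaskA": 10.0, "TaskB": 20.0}
complete = {**partial, "MindSmallReranking": 30.0}

print(leaderboard_average(partial, required))   # no average: one task is missing
print(leaderboard_average(complete, required))  # mean over all three tasks
```

Returning `None` instead of averaging the available scores keeps entries comparable: a model that skipped a hard task would otherwise look better than it should.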
We also strongly recommend that model authors share some details about how their models were created in the model card. Please also consider adding Sentence Transformers support for the model. It should be as simple as:

from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Transformer, Pooling

transformer = Transformer("w601sxs/b1ade-embed")
# Use pooling_mode="mean" if the model uses mean pooling; otherwise pass e.g. "cls" or whichever mode you're using
pooling = Pooling(transformer.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[transformer, pooling])
# save the model locally:
model.save_pretrained("w601sxs_b1ade-embed")
# or push it to HF directly:
# model.push_to_hub("w601sxs/b1ade-embed", exist_ok=True)

Tom Aarsen
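For readers unsure which pooling mode applies, the following is a conceptual sketch of the difference between "mean" and "cls" pooling, written in plain Python. It is not the Sentence Transformers implementation; the helper names and the toy vectors are made up for illustration. The right choice depends on how the model was trained.

```python
from typing import List, Sequence


def mean_pool(
    token_vectors: Sequence[Sequence[float]], attention_mask: Sequence[int]
) -> List[float]:
    """Average the token vectors whose mask is 1, ignoring padding positions."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def cls_pool(token_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_vectors[0])


# Toy token embeddings: two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

print(mean_pool(tokens, mask))  # average of the two unmasked vectors
print(cls_pool(tokens))         # first token's vector only
```

Using the wrong mode typically still produces vectors of the right shape, so a mismatch shows up as worse benchmark scores rather than as an error.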

w601sxs

17 days ago

Yes, the mteb script failed on the MindSmallReranking dataset:

INFO:mteb.evaluation.MTEB:

********************** Evaluating MindSmallReranking **********************
INFO:mteb.evaluation.MTEB:Loading dataset for MindSmallReranking
Repo card metadata block was not found. Setting CardData to empty.
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
Failed to read file 'gzip://7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8::/root/.cache/huggingface/datasets/downloads/7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Invalid value. in row 0
ERROR:datasets.packaged_modules.json.json:Failed to read file 'gzip://7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8::/root/.cache/huggingface/datasets/downloads/7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Invalid value. in row 0
ERROR:mteb.evaluation.MTEB:Error while evaluating MindSmallReranking: An error occurred while generating the dataset
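One way to narrow down an error like this is to inspect the cached file directly with the standard library. The sketch below assumes the cached download is a gzip-compressed JSON Lines file, which the log suggests but does not confirm; `first_json_row` is a hypothetical helper, not part of mteb or datasets. If the first row parses here, the file itself is probably fine and the loader is the more likely culprit.

```python
import gzip
import json
import zlib
from typing import Any, Optional


def first_json_row(path: str) -> Optional[Any]:
    """Parse the first non-empty line of a gzip-compressed JSON Lines file.

    Returns the parsed row, or None if the file is not valid gzip, is empty,
    or its first non-empty line is not valid JSON.
    """
    try:
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    return json.loads(line)
    except (OSError, EOFError, zlib.error, UnicodeDecodeError, json.JSONDecodeError):
        # Covers non-gzip input, truncated archives, bad encoding, and bad JSON.
        return None
    return None
```

To use it, pass the path that appears in the error message (the part after `::`) and check whether a row comes back.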

tomaarsen

Massive Text Embedding Benchmark org 17 days ago

Huh, I've seen this report before, but I believe I was never able to replicate it. @Muennighoff do you have the time to look into this?

Tom Aarsen

w601sxs

17 days ago

Added issue here - https://github.com/embeddings-benchmark/mteb/issues/747

w601sxs

17 days ago

•

edited 17 days ago

I had to downgrade the datasets library, and then it worked.

Uploaded a new README with the MindSmallReranking results. I will add Sentence Transformers support too.
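Since the fix here depended on the installed library version, it can help to record versions when reporting or reproducing the problem. This is a small sketch using only the standard library; `installed_version` is a hypothetical helper, and the thread does not state which versions were involved.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions of the packages involved in the evaluation run.
for name in ("datasets", "mteb", "sentence-transformers"):
    print(name, installed_version(name))
```

Including this output in an issue report makes it easier for maintainers to reproduce a version-specific failure.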

w601sxs

13 days ago

Can you refresh again, @tomaarsen? Or could you post when the next refresh will happen? From what I've seen, it happens about once a week?

tomaarsen

Massive Text Embedding Benchmark org 13 days ago

I've restarted it :)
