mSimCSE
I wonder how these embeddings compare.
https://huggingface.co/yaushian/mSimCSE
https://github.com/yaushian/mSimCSE
Yeah would be cool to know, feel free to benchmark them - We can automatically add the scores to the leaderboard once we have the result files.
Maybe @yaushian is also interested in benchmarking :)
:) I just discovered them whilst scouting; I don't yet know how to run the benchmarks.
https://github.com/YJiangcm/PromCSE by @YuxinJiang is suggested to be a new English SOTA model; should be interesting to compare, too.
Nice find! Yeah would be great to have them!
Here's a simple script for running: https://github.com/embeddings-benchmark/mteb/blob/main/scripts/run_mteb_english.py & instructions for adding to the LB are here: https://github.com/embeddings-benchmark/mteb#leaderboard
For SimCSE-like models, probably something like the wrapper here needs to be used: https://github.com/embeddings-benchmark/mtebscripts/blob/9f82086299d939900d1bedfe6c5551efae2145ce/run_array_simcse.py#L109
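In case it's useful, here's a rough sketch of the kind of wrapper MTEB expects (a simplified, hypothetical version, not the exact code from the linked script): an object with an encode method that returns one vector per sentence. The CLS-token pooling and max_length are assumptions - check what the model card recommends:

```python
# Minimal sketch of a SimCSE-style wrapper for MTEB; MTEB only requires an
# `encode(sentences, **kwargs)` method returning one embedding per sentence.
# CLS-token pooling and max_length=128 are assumptions, not the official setup.
import torch
from transformers import AutoModel, AutoTokenizer


class SimCSEWrapper:
    def __init__(self, model_name="yaushian/mSimCSE", device="cpu"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name).to(device).eval()
        self.device = device

    def encode(self, sentences, batch_size=32, **kwargs):
        """Embed a list of sentences; MTEB calls this for every task."""
        all_embeddings = []
        for i in range(0, len(sentences), batch_size):
            batch = self.tokenizer(
                sentences[i : i + batch_size],
                padding=True,
                truncation=True,
                max_length=128,
                return_tensors="pt",
            ).to(self.device)
            with torch.no_grad():
                outputs = self.model(**batch)
            # Pool the [CLS] token, as is common for SimCSE-style models.
            all_embeddings.append(outputs.last_hidden_state[:, 0].cpu())
        return torch.cat(all_embeddings).numpy()
```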
Running it for mSimCSE-mono. I'm mostly interested in the multilingual tests; there are some in AmazonCounterfactualClassification, but so far I only got English results - does it skip multilingual tasks?

I peeked at the classification results on the English data. It's mixed: sometimes it's the best option among the multilingual models, other times not, but never bad. Not necessarily SOTA in the English classification category (multilingual embeddings subset), but very competitive. And not far from it, I'd guess, if the English results reflect what one can expect in multilingual settings - which is the key point of their paper: mSimCSE-mono with xlm-roberta was trained on English data only, yet gets great results in other languages, usually outperforming the cross-lingually trained version.
How can I run only the multilingual classification tests?
The comprehensive benchmarks take a long time to run, I guess (e.g. the clustering tasks). Not sure I can let them finish today.
Can I submit partial results, too? Or maybe I could run a set of tasks each weekend over several weeks.
You can specify languages as follows: evaluation = MTEB(tasks=["AmazonCounterfactualClassification"], task_langs=["en"]) - here it will only run English. If you leave task_langs empty, it will run all languages by default. You can check the available languages, e.g. here for AmazonCounterfactualClassification.
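Putting it together, a minimal runnable sketch could look like this (using the hypothetical SimCSEWrapper from the earlier message; the output_folder path is just an example - the JSON files it produces are the result files for the leaderboard):

```python
from mteb import MTEB

# Wrap the checkpoint (see the SimCSEWrapper sketch above).
model = SimCSEWrapper("yaushian/mSimCSE")

# Restrict to one task; drop task_langs to run all available languages.
evaluation = MTEB(
    tasks=["AmazonCounterfactualClassification"],
    task_langs=["en"],
)

# Result files land in output_folder and can later be submitted.
evaluation.run(model, output_folder="results/mSimCSE")
```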
Yes, you can submit partial results - many other models currently on the benchmark are only partial!
Just add them to the metadata of the model you ran. You probably need to open a PR on https://huggingface.co/yaushian/mSimCSE/discussions and then @yaushian needs to merge it. If @yaushian is not available to merge it, you can also just copy the model to your account and add the metadata there.