Add Sentence Transformers support

#2
by tomaarsen HF staff - opened

Hello!

Preface

First of all, congratulations on this release! I will be updating the MTEB leaderboard shortly, which should place this model above bge-large-en-v1.5 as you mentioned in your paper as well.
I'm looking forward to delving deeper into your paper soon :)

Pull Request overview

  • Add Sentence Transformers support

Details

Adding support for Sentence Transformers is fairly simple with this model: it's mostly configuring the Pooling & adding a Normalization module, after which the snippet from the README should work well. This support should also allow this model to be more easily used in third-party implementations like LangChain.
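For readers unfamiliar with what a pooling step followed by a normalization step computes, here is a minimal, dependency-free sketch. It is not the Sentence Transformers implementation, and the pooling mode is an assumption: the post does not say which mode this model is configured with, so mean pooling over non-padded tokens is used purely for illustration. The function names are hypothetical.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped).

    ASSUMPTION: mean pooling is used only for illustration; the model in this
    PR may be configured with a different pooling mode.
    """
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(value * value for value in vector))
    if norm == 0.0:
        return list(vector)
    return [value / norm for value in vector]


def sentence_embedding(token_embeddings, attention_mask):
    """Pool the token vectors into one vector, then normalize it."""
    return l2_normalize(mean_pool(token_embeddings, attention_mask))


# Toy input: two real tokens and one padding token (mask entry 0).
tokens = [[3.0, 0.0], [3.0, 8.0], [100.0, 100.0]]
mask = [1, 1, 0]
embedding = sentence_embedding(tokens, mask)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity, which is one common reason to add a normalization step.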

Sidenote

In the near future I will be updating Sentence Transformers to add prompt templating via configuration. Then it will be possible to add the prompts directly in the config_sentence_transformers.json file, e.g.:

{
    ...
    "prompts": {
        "web_search_query": "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: {}",
        "...",
    },
    "default_prompt_name": null,
}

Users can then simply call model.encode(my_queries, prompt_name="web_search_query"). Once I move forward with this update, I will open a PR for this model that adds some prompts to the config.
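To make the idea concrete, here is a rough sketch of how a named prompt template like the one above could be applied to queries before encoding. This is not the Sentence Transformers implementation, which did not exist yet at the time of writing; `apply_prompt` is a hypothetical helper, and the config is written as a Python dict holding only the one prompt shown in the snippet.

```python
# Mirrors the proposed config fragment; only the prompt shown above is included.
config = {
    "prompts": {
        "web_search_query": (
            "Instruct: Given a web search query, retrieve relevant passages "
            "that answer the query\nQuery: {}"
        ),
    },
    "default_prompt_name": None,
}


def apply_prompt(texts, config, prompt_name=None):
    """Hypothetical helper: fill each text into the selected prompt template.

    Falls back to config["default_prompt_name"]; if that is also None, the
    texts are returned unchanged.
    """
    name = prompt_name if prompt_name is not None else config.get("default_prompt_name")
    if name is None:
        return list(texts)
    template = config["prompts"][name]
    # str.replace avoids str.format surprises if a template has other braces.
    return [template.replace("{}", text) for text in texts]


my_queries = ["how do vaccines work"]
formatted = apply_prompt(my_queries, config, prompt_name="web_search_query")
```

With `default_prompt_name` left as null, calling the helper without a `prompt_name` leaves the input untouched, so documents can be encoded without an instruction prefix.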

  • Tom Aarsen
tomaarsen changed pull request status to open

That's amazing! Thanks for your contribution!

intfloat changed pull request status to merged
