gtr-t5-xl / README.md
tomaarsen's picture
tomaarsen HF staff
Update metadata manually
8adc7ec verified
metadata
language: en
license: apache-2.0
library_name: sentence-transformers
tags:
  - sentence-transformers
  - feature-extraction
  - sentence-similarity
pipeline_tag: sentence-similarity

sentence-transformers/gtr-t5-xl

This is a sentence-transformers model: It maps sentences & paragraphs to a 768 dimensional dense vector space. The model was specifically trained for the task of sematic search.

This model was converted from the Tensorflow model gtr-xl-1 to PyTorch. When using this model, have a look at the publication: Large Dual Encoders Are Generalizable Retrievers. The tfhub model and this PyTorch model can produce slightly different embeddings, however, when run on the same benchmarks, they produce identical results.

The model uses only the encoder from a T5-3B model. The weights are stored in FP16.

Usage (Sentence-Transformers)

Using this model becomes easy when you have sentence-transformers installed:

pip install -U sentence-transformers

Then you can use the model like this:

from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('sentence-transformers/gtr-t5-xl')
embeddings = model.encode(sentences)
print(embeddings)

The model requires sentence-transformers version 2.2.0 or newer.

Evaluation Results

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Citing & Authors

If you find this model helpful, please cite the respective publication: Large Dual Encoders Are Generalizable Retrievers