
Add Sentence Transformers support

#3
by tomaarsen - opened

Hello!

Pull Request overview

  • Add Sentence Transformers support.

Details

With this PR, it will become possible to load this model with Sentence Transformers without triggering any warnings. The embeddings match those of the transformers approach, but the Sentence Transformers project simplifies some of the steps for the end user (e.g. no need to worry about tokenization, output shapes, etc.).
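
For reference, below is a rough sketch of what the plain transformers route involves for this model; the mean pooling step is an illustrative assumption on my side, the actual pooling used for the sentence embedding is defined by the Sentence Transformers configuration added in this PR.

import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "codesage/codesage-small"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True).to(device)

inputs = tokenizer.encode("def print_hello_world():\tprint('Hello World!')", return_tensors="pt").to(device)
with torch.no_grad():
    token_embeddings = model(inputs)[0]  # token-level embeddings of shape (1, seq_len, hidden_size)

# Pool the token embeddings into one vector per input (mean pooling assumed for illustration)
embedding = token_embeddings.mean(dim=1).squeeze(0)
print(f"Dimension of the embedding: {embedding.shape[0]}")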

If you want to experiment with this PR before it is merged, you can use the revision option like so:

from sentence_transformers import SentenceTransformer

checkpoint = "codesage/codesage-small"
device = "cuda"  # for GPU usage or "cpu" for CPU usage
revision = "refs/pr/3"  # the revision belonging to this pull request

model = SentenceTransformer(checkpoint, device=device, trust_remote_code=True, revision=revision)

embedding = model.encode("def print_hello_world():\tprint('Hello World!')")
print(f"Dimension of the embedding: {embedding.size}")
# Dimension of the embedding: 1024

after doing

pip install -U sentence-transformers
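
Once installed, the embeddings can be used with the usual Sentence Transformers utilities; as a quick check, here is a small (illustrative) example comparing two code snippets with cosine similarity:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer(
    "codesage/codesage-small",
    device="cpu",  # or "cuda" for GPU usage
    trust_remote_code=True,
    revision="refs/pr/3",
)

# Two illustrative code snippets that implement the same behaviour
snippets = [
    "def add(a, b):\n    return a + b",
    "def sum_two_numbers(x, y):\n    return x + y",
]
embeddings = model.encode(snippets)

similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Cosine similarity: {similarity.item():.4f}")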
  • Tom Aarsen
tomaarsen changed pull request status to open
Ready to merge
This branch is ready to be merged automatically.
