This model is compiled for neuronx devices (e.g. an inf2 instance).

The original checkpoint is BAAI/bge-base-en-v1.5.

Export

The model was exported with the following command:

optimum-cli export neuron -m BAAI/bge-base-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb/
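
The `--sequence_length 384` and `--batch_size 1` flags fix the input shape at compile time, so each inference call processes one sequence of 384 token positions. The sketch below illustrates in plain Python what that implies for a tokenized input. `pad_to_static_length` and the pad id of 0 are hypothetical, chosen only for illustration; they are not part of `optimum-neuron`, which may handle padding on its own.

```python
SEQUENCE_LENGTH = 384  # matches --sequence_length in the export command


def pad_to_static_length(token_ids, length=SEQUENCE_LENGTH, pad_id=0):
    """Truncate or right-pad a list of token ids to exactly `length` items.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding. pad_id=0 is an assumed placeholder: use the
    tokenizer's own pad token id in practice.
    """
    # Naive truncation: a real tokenizer would also keep the final special token.
    kept = list(token_ids[:length])
    n_pad = length - len(kept)
    input_ids = kept + [pad_id] * n_pad
    attention_mask = [1] * len(kept) + [0] * n_pad
    return input_ids, attention_mask


# Made-up token ids standing in for a short tokenized sentence.
ids, mask = pad_to_static_length([101, 7592, 2088, 102])
print(len(ids), sum(mask))
```

With a Hugging Face tokenizer, the same shape is usually obtained by passing `padding="max_length"`, `truncation=True` and `max_length=384` to the tokenizer call.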

Usage

The following example runs inference with the compiled artifacts:

from transformers import AutoTokenizer
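from optimum.neuron import NeuronModelForSentenceTransformers

# The tokenizer is assumed to ship in the same repository as the compiled model.
tokenizer = AutoTokenizer.from_pretrained("optimum/bge-base-en-v1.5-neuronx")
emb_model = NeuronModelForSentenceTransformers.from_pretrained("optimum/bge-base-en-v1.5-neuronx")
inputs = tokenizer("Hamilton is considered to be the best musical of human history.", return_tensors="pt")
emb = emb_model(**inputs)

# Output keys: ["token_embeddings", "sentence_embedding"]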
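
Sentence embeddings such as the `sentence_embedding` output are typically compared with cosine similarity, for example in retrieval or semantic search. A minimal sketch in plain Python follows. The short vectors are made-up stand-ins for real embeddings (assumed here to be flat lists of floats), and `cosine_similarity` is a hypothetical helper, not an `optimum-neuron` function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors; real embeddings have many more dimensions.
query = [0.1, 0.3, 0.5]
doc_close = [0.2, 0.6, 1.0]   # same direction as the query
doc_far = [0.5, -0.3, 0.0]    # points in a different direction

# Rank the candidate documents by similarity to the query, best first.
ranked = sorted(
    [("doc_close", doc_close), ("doc_far", doc_far)],
    key=lambda item: cosine_similarity(query, item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```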