Neuronx model for BAAI/bge-base-en-v1.5

This repository contains an AWS Inferentia2 and neuronx compatible checkpoint for BAAI/bge-base-en-v1.5. You can find detailed information about the base model on its Model Card.

Usage on Amazon SageMaker

Coming soon.

Usage with optimum-neuron


from optimum.neuron import NeuronModelForFeatureExtraction
from transformers import AutoTokenizer
import torch
import torch_neuronx

# Load Model from Hugging Face repository
model = NeuronModelForFeatureExtraction.from_pretrained("aws-neuron/bge-base-en-v1-5-seqlen-384-bs-1")
tokenizer = AutoTokenizer.from_pretrained("aws-neuron/bge-base-en-v1-5-seqlen-384-bs-1")

# sentence input
inputs = "Hello, my dog is cute"

# Tokenize the sentence, truncating to the sequence length the model was compiled for
encoded_input = tokenizer(
    inputs,
    return_tensors="pt",
    truncation=True,
    max_length=model.config.neuron["static_sequence_length"],
)

# Compute embeddings
with torch.no_grad():
    model_output = model(*tuple(encoded_input.values()))

# Perform pooling. In this case, CLS pooling: take the first token's hidden state.
sentence_embeddings = model_output[0][:, 0]
# Normalize embeddings to unit L2 length
sentence_embeddings = torch.nn.functional.normalize(sentence_embeddings, p=2, dim=1)
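
Because the embeddings are L2-normalized, the dot product of two embeddings equals their cosine similarity. The following is a minimal sketch of the pooling and normalization steps above, followed by a similarity score. It runs on random stand-in tensors rather than real model output, so it does not need a Neuron device; the helper names `cls_pool_and_normalize` and `cosine_similarity` and the hidden size of 768 are assumptions made for illustration, not part of this repository.

```python
import torch


def cls_pool_and_normalize(last_hidden_state: torch.Tensor) -> torch.Tensor:
    """CLS pooling followed by L2 normalization (hypothetical helper)."""
    # Take the hidden state of the first token of every sequence.
    cls_embeddings = last_hidden_state[:, 0]
    # Scale each row to unit L2 norm.
    return torch.nn.functional.normalize(cls_embeddings, p=2, dim=1)


def cosine_similarity(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """For unit-length rows, the dot product is the cosine similarity."""
    return a @ b.T


# Stand-ins for model_output[0]: (batch_size, sequence_length, hidden_size).
# The hidden size of 768 is an assumption for this illustration.
torch.manual_seed(0)
fake_output_a = torch.randn(1, 384, 768)
fake_output_b = torch.randn(1, 384, 768)

emb_a = cls_pool_and_normalize(fake_output_a)
emb_b = cls_pool_and_normalize(fake_output_b)

score = cosine_similarity(emb_a, emb_b)  # shape (1, 1), value in [-1, 1]
print(score.item())
```

With the real model, `fake_output_a` and `fake_output_b` would be replaced by `model_output[0]` from two separate forward passes.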

input_shapes

{
  "sequence_length": 384,
  "batch_size": 1
}
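
These shapes are fixed at compile time. A minimal sketch of how several sentences could be split into groups that fit the compiled batch size is shown below. It assumes that the compiled batch size is an upper limit per forward call; the `INPUT_SHAPES` constant and the `batches` helper are hypothetical names introduced for illustration.

```python
from typing import Iterator, List, Sequence

# Mirrors the input_shapes above (copied here as an assumption for the sketch).
INPUT_SHAPES = {"sequence_length": 384, "batch_size": 1}


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` items (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


sentences = ["Hello, my dog is cute", "Hello, my cat is cute", "The weather is nice"]
groups = list(batches(sentences, INPUT_SHAPES["batch_size"]))
print(groups)
```

Each group can then be tokenized with `max_length` set to the compiled sequence length and passed to the model in its own forward call, as in the usage example above.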