sentence-transformers/all-MiniLM-L6-v2

I deployed this model on the entry-level Inference Endpoint, and it takes 7 seconds to compute 35 embeddings, i.e. only 5 embeddings per second. Is this normal for this model, or is the endpoint slower than it should be?
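For anyone who wants to reproduce the number, here is a rough sketch of how the throughput can be measured. The helper names `embeddings_per_second` and `time_encode` are made up for illustration and are not part of any library; `encode_fn` stands in for whatever function actually produces the embeddings (a call to the endpoint or a local model).

```python
import time
from typing import Callable, Sequence


def embeddings_per_second(n_embeddings: int, seconds: float) -> float:
    """Throughput expressed as embeddings per second."""
    return n_embeddings / seconds


def time_encode(encode_fn: Callable[[Sequence[str]], object],
                sentences: Sequence[str]) -> float:
    """Time one call of encode_fn on sentences and return embeddings per second.

    encode_fn is a placeholder (assumption): any callable that takes a list
    of strings and returns their embeddings.
    """
    start = time.perf_counter()
    encode_fn(sentences)
    # Guard against a zero reading for extremely fast calls.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return embeddings_per_second(len(sentences), elapsed)


# Figures from the post: 35 embeddings in 7 seconds.
print(embeddings_per_second(35, 7.0))  # 5.0
```

To compare against a local run, one option (assuming the `sentence-transformers` package is installed) would be to pass `SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").encode` as `encode_fn`. Timing a second call rather than the first avoids counting model loading or warm-up.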

Maroofabdullah

Oct 30, 2023

normal
