Support for embedding endpoint

#13
by ultraxyz - opened

I deployed the model using vLLM and used the following code from https://docs.vllm.ai/en/latest/getting_started/examples/openai_embedding_client.html, but got a 404 error:

from openai import OpenAI

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    # defaults to os.environ.get("OPENAI_API_KEY")
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id

responses = client.embeddings.create(input=[
    "Hello my name is",
    "The best thing about vLLM is that it supports many different models"
], model=model)

for data in responses.data:
    print(data.embedding)  # list of floats, length 4096

Error message:

NotFoundError: Error code: 404 - {'detail': 'Not Found'}

Hi, you asked about the OpenAI embedding endpoint; for that question you can refer to OpenAI's official documentation. But since it looks like you are also interested in our Yi-1.5-34B-Chat model, here is an example of running inference on this model with vLLM that you can try.

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# Load the tokenizer so we can apply the model's chat template.
tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-1.5-34B-Chat")
sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.8)
llm = LLM(model="01-ai/Yi-1.5-34B-Chat")

prompt = "Hi!"
messages = [
    {"role": "user", "content": prompt}
]
# Format the message with the chat template and append the generation prompt.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
outputs = llm.generate([text], sampling_params)

for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Hi ultraxyz,

Yi-1.5 is not an embedding model. An embedding model takes text as input and outputs a list of floats (a 4096-dimensional vector in your example); such models usually belong to the BERT family and are typically loaded with sentence-transformers.
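
For reference, here is a minimal sketch of computing embeddings with sentence-transformers. The model name sentence-transformers/all-MiniLM-L6-v2 is only an illustrative choice (it produces 384-dimensional vectors); any sentence-transformers-compatible embedding model would work the same way.

from sentence_transformers import SentenceTransformer

# Illustrative embedding model; swap in any sentence-transformers-compatible model.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = [
    "Hello my name is",
    "The best thing about vLLM is that it supports many different models",
]

# encode() returns one embedding vector per input sentence.
embeddings = model.encode(sentences)

for embedding in embeddings:
    print(len(embedding), embedding[:5])  # 384, first five components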
