Retrieve embeddings from onnxruntime-genai-cuda

#6
by dkjsnnr1 - opened

Hi there!

Is there any documentation on how to retrieve the embeddings generated by the Phi-3-mini-128k int4 model with ONNX Runtime using CUDA?

Microsoft org

I don't believe this is currently supported, but it would be straightforward to add. Can you open a GitHub issue with more details on your scenario?
