Retrieve embeddings from onnxruntime-genai-cuda

by dkjsnnr1 - opened

Hi There!

Is there any documentation on how to retrieve the embeddings generated by the Phi-3-mini-128k int4 model with ONNX Runtime using CUDA?

Microsoft org

I don't believe there is support for this currently, but it is straightforward to add. Can you open a GitHub issue with more details on your scenario?

kvaishnavi changed discussion status to closed
