Issue with inference endpoints

#122
by simon5454 - opened

Hi,

I tried to deploy the "openai/whisper-large-v3" model with Inference Endpoints. I made the request from Postman (importing the curl command).
It works against the serverless Inference API, but not against the dedicated Inference Endpoint.
I just receive a 500 server error.

Configuration is set to task: automatic-speech-recognition and container type: default.
Model: openai/whisper-large-v3
Instance: AWS us-east-1, 1x NVIDIA T4 GPU (16 GB)
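
For reference, this is roughly the request I'm sending (a sketch; the endpoint URL, token, and audio file below are placeholders, not my actual values):

```
curl "https://<endpoint-name>.us-east-1.aws.endpoints.huggingface.cloud" \
  -X POST \
  -H "Authorization: Bearer hf_xxx" \
  -H "Content-Type: audio/flac" \
  --data-binary @sample.flac
```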

Can you help me understand what I am doing wrong?
