Why is each generated response so short / cut off / not complete?

#7
by Krisolada - opened

Please see the attached image:

We hope that each response can be longer and complete. When testing, we have to keep clicking the "Compute" button to get more generated text back.

image.png

We wonder if this has to do with our host instance size:

Nvidia A10G
1x GPU · 24 GB
6 vCPU · 28 GB
1.3/hr

Thanks! Please help!


What is the code you are using or where are you using it?


Hi, thanks for getting back to us!

We are simply testing at this stage using Postman:

We pass in the token, set Content-Type: application/json, and send the example question {"inputs": "What is the philosopher's stone, really?"} via the request body.

The text generated in the screenshot above is all we get.

Hope the information helps!
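For what it's worth, if the endpoint runs a text-generation backend, a bare {"inputs": ...} body usually gets the backend's small default token limit, which would explain the short, cut-off replies. A minimal sketch of the same request with an explicit max_new_tokens parameter is below; the endpoint URL and token are placeholders, and the exact parameter support depends on which backend the endpoint actually uses:

```python
import json

# Placeholders -- substitute your real endpoint URL and HF token.
ENDPOINT_URL = "https://your-endpoint.endpoints.huggingface.cloud"
API_TOKEN = "hf_xxx"

def build_request(prompt, max_new_tokens=512):
    """Build the headers and JSON body for a text-generation request.

    Adding "parameters" with "max_new_tokens" asks the backend to
    generate up to that many tokens instead of its (often small) default.
    """
    headers = {
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    }
    body = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return headers, json.dumps(body)

headers, body = build_request("What is the philosopher's stone, really?")
# The serialized body can be pasted directly into Postman's raw JSON field.
print(body)
```

The same JSON body works from Postman: keep the Authorization and Content-Type headers you already set, and just add the "parameters" object next to "inputs".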


No, I mean: are you using HF Transformers, TGI, vLLM, llama.cpp, or some other inference engine? It sounds like you are serving the model through an API, but it's not clear what that backend is.
