Why is the Inference API unavailable?

#8
by amgadhasan - opened

The Inference API for this model is unavailable. It would be nice if we could try the model out on the fly.

Thanks for the model btw

It's unavailable because this is a quantised GPTQ model, and Hugging Face doesn't support live inference on those.

But I have an unquantised version uploaded as well, and inference is enabled on this: https://huggingface.co/TheBloke/wizard-vicuna-13B-HF

Personally I've never found that live inference demo to be much good though, because it only returns about 10 tokens, which isn't nearly enough to test anything useful. But it's available at the above link if you want it.
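If you'd rather query the hosted Inference API directly instead of using the web demo, here's a minimal sketch. The model ID is the unquantised repo linked above; the endpoint shape and the `max_new_tokens` parameter follow the public `api-inference.huggingface.co` conventions for text-generation models, and the token value is a placeholder you'd replace with your own. This only builds the request (no network call), so treat it as illustrative rather than tested against the live service:

```python
import json

# Unquantised model from the link above; GPTQ repos aren't served by the API.
MODEL_ID = "TheBloke/wizard-vicuna-13B-HF"
API_URL = f"https://api-inference.huggingface.co/models/{MODEL_ID}"

def build_request(prompt, token, max_new_tokens=200):
    """Return (url, headers, body) for a text-generation request.

    max_new_tokens asks for a longer completion than the ~10-token web
    demo default, though the hosted service may still cap it.
    """
    headers = {"Authorization": f"Bearer {token}"}  # token is a placeholder
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return API_URL, headers, json.dumps(payload)

# Build (but don't send) a request; pass the result to e.g. requests.post().
url, headers, body = build_request("What is GPTQ quantisation?", token="hf_xxx")
```

From there you'd POST `body` with those headers using any HTTP client and read the generated text out of the JSON response.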
