Any idea how to test this for inference using vLLM?
#1 opened by silvacarl
We tried every method we can think of, but we just keep getting error messages saying AWQ is not an option.
Yes, my recent AWQ READMEs contain this extra info:
Note: at the time of writing, vLLM has not yet done a new release with support for the `quantization` parameter.
If you try the code below and get an error about `quantization` being unrecognised, please install vLLM from GitHub source.
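For reference, here is a minimal sketch of offline AWQ inference with vLLM; the model repo name is a placeholder, and the sampling settings are just example values:

```python
# Minimal sketch of AWQ inference with vLLM.
# If `quantization` is unrecognised, install vLLM from GitHub source first, e.g.:
#   pip install git+https://github.com/vllm-project/vllm.git
from vllm import LLM, SamplingParams

prompts = ["Tell me about AI"]
sampling_params = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)

# quantization="awq" tells vLLM to load the AWQ-quantised weights
# (replace the model name with the actual AWQ repo you are testing)
llm = LLM(model="TheBloke/Some-Model-AWQ", quantization="awq")

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```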
Got it, will do. Thanks!
silvacarl changed discussion status to closed