always getting 0 in output

by xubuild - opened Jan 10

Jan 10

the model always respond with token id 0 for any input, while same prompt can get correct response from https://huggingface.co/casperhansen/mixtral-instruct-awq

tested with vllm
llm = LLM(model=model_path, quantization="awq", trust_remote_code=True, dtype="auto", enforce_eager=True, max_model_len=12288)

sd3ntato

Jan 11

same for me!

hassantrixly

Jan 12

Is there any fix for that?

hnhlester

Jan 12

Also happening for me. Seems to be an issue with this model. Other 8x7b awq models work perfectly fine, for example dolphin-2.7-8x7b.

brendanlui

Jan 15

@TheBloke Any update about this issue? Always get an empty output as well...

FYI,

python3 -m vllm.entrypoints.openai.api_server --model "$MODEL_NAME" --host 0.0.0.0 --port 8181 --quantization awq --dtype auto

    "choices": [
        {
            "index": 0,
            "text": "",
            "logprobs": null,
            "finish_reason": "length"
        }
    ],

Bourhano

Jan 15

•

edited Jan 15

same, maybe a prompt template issue?

Fouldon

Jan 15

Any update ?

dafen

Jan 17

俺也一样

hiranya911

Jan 17

Also seeing this in testing. Our vLLM setup:

sampling_params = SamplingParams(temperature=0.1, top_p=0.95)
model = 'TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ'

llm = LLM(
  model=model, 
  gpu_memory_utilization=0.7, 
  max_model_len=2048,
)