Can't host with vLLM - "LlamaModel" architecture is not supported.
This model was fine-tuned on codellama/CodeLlama-70b-hf, which has this as the architecture in its config.json:
"LlamaForCausalLM"
The config.json for this model has this as the architecture:
"LlamaModel"
This causes an error when hosting with vLLM, since vLLM supports LlamaForCausalLM but not LlamaModel. I tried changing the config.json for sqlcoder-70b-alpha to indicate "LlamaForCausalLM", but this gave a KeyError:
KeyError: 'layers.11.input_layernorm.weight'
Is this a bug, or is this intentionally a different architecture? If it's not a bug, it seems vLLM cannot host this model, even though it supports LlamaForCausalLM from CodeLlama. Is there a recommended way to host this model, through something like vLLM, TGI, etc.?
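For context, the error shows up with a standard vLLM offline-inference load along these lines (a minimal sketch; the prompt, sampling settings, and tensor_parallel_size are just placeholders for a 70B deployment):

```python
from vllm import LLM, SamplingParams

# Loading the checkpoint fails while "LlamaModel" is listed in
# config.json, because vLLM only registers LlamaForCausalLM for
# Llama-family checkpoints.
llm = LLM(model="defog/sqlcoder-70b-alpha", tensor_parallel_size=8)

prompts = ["-- Total number of orders per customer\nSELECT"]
params = SamplingParams(temperature=0.0, max_tokens=256)

for out in llm.generate(prompts, params):
    print(out.outputs[0].text)
```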
Hi there, we discovered a bizarre bug where the model's lm_head.weight was not uploaded to HF in the upload process. This is causing many integrations to break, and the model uploaded here is producing gibberish results.
Fix coming soon – hopefully in the next hour
Fixed with a reupload of the model weights! Apologies for the issue. You'll unfortunately have to re-download the model weights first (run `rm -rf ~/.cache/huggingface/hub/models--defog--sqlcoder-70b-alpha` to clear the stale cached copy). Should work great after that!
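If it helps, one way to pull the fixed weights after clearing the cache is via huggingface_hub (a sketch; by default this re-populates ~/.cache/huggingface/hub):

```python
from huggingface_hub import snapshot_download

# Re-download the re-uploaded weights into the local HF cache.
snapshot_download(repo_id="defog/sqlcoder-70b-alpha")
```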