Unable to load the model

#1
by anantharamb - opened

I followed the steps below (as specified in the documentation) to load the model, but it fails with the following error: 'stanford-oval/Llama-2-7b-WikiChat does not appear to have a file named pytorch_model-00001-of-00002.bin.' I'd appreciate any input on this.

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stanford-oval/Llama-2-7b-WikiChat")
model = AutoModelForCausalLM.from_pretrained("stanford-oval/Llama-2-7b-WikiChat")
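For context, one way to check which weight files the repository actually contains (and whether the sharded .bin files transformers is asking for are there) is to list the repo with huggingface_hub. A minimal sketch, assuming huggingface_hub is installed:

# List the repo's weight/index files to see what transformers can actually find
from huggingface_hub import list_repo_files

files = list_repo_files("stanford-oval/Llama-2-7b-WikiChat")
for name in files:
    if name.endswith((".bin", ".safetensors", ".json")):
        print(name)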

Stanford Open Virtual Assistant Lab (OVAL) org

I tested this model with both Hugging Face's TGI (https://github.com/huggingface/text-generation-inference) and vLLM (https://github.com/vllm-project/vllm), and it works just fine.
I'm not sure why it doesn't work when loaded directly with transformers. We normally don't test with it because it is much slower at inference.
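For reference, this is roughly how I load it with vLLM. A minimal sketch; the prompt and sampling settings are just placeholders:

# Minimal vLLM sketch; prompt and sampling parameters are placeholders
from vllm import LLM, SamplingParams

llm = LLM(model="stanford-oval/Llama-2-7b-WikiChat")
params = SamplingParams(temperature=0.0, max_tokens=128)
outputs = llm.generate(["<example prompt>"], params)
print(outputs[0].outputs[0].text)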

Stanford Open Virtual Assistant Lab (OVAL) org
edited Jan 14

OK, there seems to have been an issue when converting model weights to the .safetensors format. Apparently, TGI and vLLM don't rely on model.safetensors.index.json, but the transformers library does.
I've fixed both models and they should work with transformers now.
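If you want to confirm the fix on your end, loading with transformers and explicitly requesting the safetensors weights should now work. A minimal sketch; the use_safetensors and torch_dtype arguments are just example settings:

# Sketch for verifying the fix; use_safetensors/torch_dtype are example settings
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("stanford-oval/Llama-2-7b-WikiChat")
model = AutoModelForCausalLM.from_pretrained(
    "stanford-oval/Llama-2-7b-WikiChat",
    use_safetensors=True,
    torch_dtype=torch.float16,
)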

s-jse changed discussion status to closed
