Missing pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack

#2
by RainmakerP - opened

OSError: astronomer-io/Llama-3-8B-Instruct-GPTQ-8-Bit does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.

Any idea what this is?

Astronomer org
edited Apr 20

How are you loading the model? Can you provide reproducible steps or code you used so I can help debug this?

The model weights are produced in safetensors. I could produce the older pytorch_model.bin, but the pickle-based format is no longer recommended by the industry.

This is the explanation from the official Hugging Face documentation on why .safetensors is better than pickled .bin files.

What is safetensors?

safetensors is a different format from the classic .bin PyTorch format, which relies on pickle. It contains exactly the same data: just the model weights (tensors).

Pickle is notoriously unsafe: a malicious file can execute arbitrary code when it is loaded. The Hub itself tries to prevent issues from it, but it's not a silver bullet.

The first and foremost goal of safetensors is to make loading machine learning models safe, in the sense that loading a file cannot take over your computer.
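To make the pickle risk concrete, here is a minimal self-contained sketch (not from this repo) showing that merely *loading* pickled bytes runs code chosen by whoever produced the file. The `Malicious` class is a harmless stand-in; in a real attack the callable could be `os.system`.

```python
import pickle

class Malicious:
    def __reduce__(self):
        # When unpickled, pickle calls the returned callable with these
        # arguments. Here it's a harmless eval, but it could be anything.
        return (eval, ("40 + 2",))

payload = pickle.dumps(Malicious())  # what an attacker would ship as a .bin
result = pickle.loads(payload)       # loading alone executes eval("40 + 2")
print(result)                        # 42
```

safetensors avoids this by design: the format stores only raw tensor data plus a JSON header, so there is no code path to execute on load.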

Astronomer org
edited Apr 20

If you are using text-generation-webui, please check the model card readme file on how to load and use it correctly.

Please note: in my testing, TGI and vLLM have the best throughput and token-generation speed. I would recommend vLLM if your hardware is able to run it.
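As a rough sketch of serving this model with vLLM (check the flags against your installed vLLM version; the `--quantization` override shown here is usually unnecessary since vLLM detects GPTQ from the model config):

```shell
pip install vllm

# Serve the GPTQ quant behind an OpenAI-compatible API endpoint.
python -m vllm.entrypoints.openai.api_server \
    --model astronomer-io/Llama-3-8B-Instruct-GPTQ-8-Bit \
    --quantization gptq
```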

Astronomer org
edited Apr 22

Hey @RainmakerP, is your issue resolved? I renamed the .safetensors file to model.safetensors, which should fix the issue where some frameworks cannot find the model file to load. Please let me know if this fixes your issue.
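For anyone hitting the same error with a local copy, a minimal sketch of that rename (the helper name and the sample filename below are hypothetical, not from this repo):

```python
import tempfile
from pathlib import Path

def normalize_safetensors(model_dir):
    """If a directory holds exactly one *.safetensors file under a
    non-standard name, rename it to model.safetensors so loaders that
    expect that exact filename can find it."""
    d = Path(model_dir)
    files = list(d.glob("*.safetensors"))
    if len(files) == 1 and files[0].name != "model.safetensors":
        files[0].rename(d / "model.safetensors")
    return sorted(p.name for p in d.glob("*.safetensors"))

# Demo on a throwaway directory with a made-up filename:
with tempfile.TemporaryDirectory() as td:
    (Path(td) / "gptq-8bit.safetensors").write_bytes(b"")
    print(normalize_safetensors(td))  # ['model.safetensors']
```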

Side note: I highly recommend serving this model using vLLM, as it is the most stable framework for Llama 3 GPTQ quants in my testing.

davidxmle changed discussion status to closed
