Loading Model in HF Transformers


I am currently facing a challenge while trying to load a model stored in the safetensors format using the Transformers library.

Below is the code I am using:

from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/model")
# My understanding: use_safetensors=True tells the loader to read the .safetensors weights
model = LlamaForCausalLM.from_pretrained("path/to/model", use_safetensors=True)

Unfortunately, this results in the following error:

Traceback (most recent call last):
  File "/Users/maxhager/Projects2023/nsfw/model_run.py", line 4, in <module>
    model = LlamaForCausalLM.from_pretrained("path/to/model", use_safetensors=True)
  File "/Users/maxhager/.virtualenvs/nsfw/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2449, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory path/to/model.

In my model directory (path/to/model), I have the following files:

4bit-128g.safetensors
config.json
generation_config.json
pytorch_model.bin.index.json
special_tokens_map.json
tokenizer.json
tokenizer.model
tokenizer_config.json
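
To rule out a corrupt checkpoint, one sanity check is to open the .safetensors file directly with the safetensors library, bypassing Transformers entirely (a minimal sketch, assuming the safetensors package is installed; the path is just my local model directory):

from safetensors import safe_open

# List tensor names and shapes without materializing the weights;
# this only verifies that the file is a readable safetensors archive.
with safe_open("path/to/model/4bit-128g.safetensors", framework="pt") as f:
    for name in f.keys():
        print(name, f.get_slice(name).get_shape())
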
I was under the impression that setting use_safetensors=True would instruct from_pretrained() to load the weights from the safetensors file. Instead, the method appears to search only for the usual checkpoint formats (pytorch_model.bin, tf_model.h5, etc.) and never picks up the .safetensors file.
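
From skimming modeling_utils.py, it looks like the loader searches for a file literally named model.safetensors (SAFE_WEIGHTS_NAME), so a file called 4bit-128g.safetensors may simply never be considered; the stray pytorch_model.bin.index.json with no matching .bin shards might also be confusing resolution. If that reading is right, a rename or symlink could be enough (untested sketch):

import os

# Untested: from_pretrained appears to look for the exact filename
# "model.safetensors" (SAFE_WEIGHTS_NAME in transformers/modeling_utils.py),
# so expose the existing checkpoint under that name without copying it.
os.symlink("4bit-128g.safetensors", "path/to/model/model.safetensors")

That said, judging by the 4bit-128g name this may be a GPTQ-quantized checkpoint, in which case plain LlamaForCausalLM might not be able to load it even with the expected filename.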

Has anyone faced this issue before, or does anyone have insight into what might be happening here? I would greatly appreciate any guidance on how to load a model stored in the safetensors format with the Transformers library.

Thank you!
