
Code to upload a finetuned model.

#1
by juanpablo4l - opened

Hi!
Could you perhaps share, or add to your script, the way you save and upload the trained model to the Hub? I run into exceptions when attempting to push_to_hub() the model, even after following other tutorials.
Thanks in advance!

# push to hub
model_id_load = ""  # Hub repo id to push to, e.g. "username/model-name"

# tokenizer
tokenizer.push_to_hub(model_id_load, use_auth_token=True)
# safetensors
model.push_to_hub(model_id_load, use_auth_token=True, safe_serialization=True)
# torch tensors
model.push_to_hub(model_id_load, use_auth_token=True)
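
For reference, once the push succeeds, the artifacts can be loaded back from the Hub like any other checkpoint. A minimal sanity check (model_id_load is the same repo id used above):

from transformers import AutoModelForCausalLM, AutoTokenizer

# reload the pushed checkpoint from the Hub to confirm the upload worked
model = AutoModelForCausalLM.from_pretrained(model_id_load)
tokenizer = AutoTokenizer.from_pretrained(model_id_load)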
dfurman changed discussion status to closed

I am still missing something, because I get an error when trying to upload the model. This happens with both safetensors and PyTorch tensors.

Traceback (most recent call last):
  File "/builds/devops/gitlab-train-example/run.py", line 289, in <module>
    model.push_to_hub(model_id_load, use_auth_token=True, safe_serialization=True)
  File "/root/miniconda3/envs/par3/lib/python3.10/site-packages/transformers/utils/hub.py", line 814, in push_to_hub
    self.save_pretrained(work_dir, max_shard_size=max_shard_size, safe_serialization=safe_serialization)
  File "/root/miniconda3/envs/par3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1715, in save_pretrained
    raise NotImplementedError(
NotImplementedError: You are calling `save_pretrained` on a 4-bit converted model. This is currently not supported

I tried using peft installed from GitHub (as in your code) and version 0.4.0 (as suggested by someone else online).
I guess it's either an issue with the versions of the libraries I'm using, or I'm simply missing something in the code and am trying to push the entire model instead of just the adapter.
Could you please share the output of your pip list/pip freeze?
Thanks for your effort!
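
If it helps, this is roughly what I'd expect to work for pushing just the adapter, since PeftModel.save_pretrained() only writes adapter_config.json and the adapter weights rather than the 4-bit base. A sketch, assuming model is still the peft-wrapped PeftModel (adapter_repo_id is a placeholder):

# assumes `model` is a peft PeftModel wrapping the 4-bit base model
adapter_repo_id = "username/llama-2-adapter"  # hypothetical repo id
model.push_to_hub(adapter_repo_id, use_auth_token=True)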

juanpablo4l changed discussion status to open

The error message says it all: you can't push a model that was loaded in 4-bit. You need to load it at a different precision/dtype. I loaded it in bfloat16, for example.
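
Roughly, the workflow is: reload the base weights in bfloat16, apply the saved adapter, merge, then push. A sketch, assuming a QLoRA-style setup (base_model_id, adapter_dir, and model_id_load are placeholders for your own values):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# reload the base model in bfloat16 rather than 4-bit
base = AutoModelForCausalLM.from_pretrained(
    base_model_id, torch_dtype=torch.bfloat16
)
# apply the trained LoRA adapter, then fold it into the base weights
model = PeftModel.from_pretrained(base, adapter_dir)
model = model.merge_and_unload()

# the merged model is no longer 4-bit, so push_to_hub works
model.push_to_hub(model_id_load, use_auth_token=True, safe_serialization=True)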

dfurman changed discussion status to closed
