How to overcome the runtime error (OSError)
Hi, I am trying to use the Transformers version of the model with the following code, but it throws a runtime error. I would appreciate any input on how to overcome this error; I have not found any useful documentation.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "amazon/FalconLite2", device_map="auto", offload_folder="offload",
    trust_remote_code=True,
    # torch_dtype="auto",
)
OSError: amazon/FalconLite2 does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.
Hi vsrinivas,
Actually, the error is pretty explicit: the repository does not contain any of the weight files (pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack) that Transformers was expecting.
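You can confirm this yourself; here is a minimal sketch (assuming the huggingface_hub package is installed) that lists the files in the repository:

from huggingface_hub import list_repo_files

# List every file in the model repository to see which weight files actually exist
for f in list_repo_files("amazon/FalconLite2"):
    print(f)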
Try using safetensors instead; the docs are here: https://huggingface.co/docs/safetensors/index
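For example, a minimal sketch of loading a .safetensors file directly (assuming you have downloaded the weight file locally and installed the safetensors package):

from safetensors.torch import load_file

# Load the tensors from a local .safetensors file into a plain state dict
state_dict = load_file("gptq_model-4bit-128g.safetensors")
print(list(state_dict.keys())[:5])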
(I'll try to download it and use it myself)
Regards,
Hi @mb-datalab2023, I don't see any mention of the error or of those files at https://huggingface.co/docs/safetensors/index. If you find the solution, I would appreciate it if you could let me know.
@chenwuml and @yinsong1986 Could you please help with this? I am trying to run this in a Colab notebook.
Hello @vsrinivas ,
I think I found the solution to your problem.
- You need to download amazon/FalconLite2 (see the download-and-rename sketch after the snippet below).
- You need to rename gptq_model-4bit-128g.safetensors to model.safetensors.
- Then use the use_safetensors=True argument in the AutoModelForCausalLM.from_pretrained method.
It should look like this:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Path to the local folder containing the renamed model.safetensors
file_path = "C:/Downloads/falconlite2/"

model = AutoModelForCausalLM.from_pretrained(
    file_path, device_map="auto", use_safetensors=True,
    trust_remote_code=True,
    # torch_dtype="auto",
)
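For completeness, here is a minimal sketch of the download-and-rename step, assuming you use the huggingface_hub package (the local folder name is just an example):

import os
from huggingface_hub import snapshot_download

# Download the full repository to a local folder (example path)
local_dir = snapshot_download("amazon/FalconLite2", local_dir="falconlite2")

# Rename the quantized weight file so Transformers can find it
os.rename(
    os.path.join(local_dir, "gptq_model-4bit-128g.safetensors"),
    os.path.join(local_dir, "model.safetensors"),
)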
I hope this helps,
If you get an error, try pip install safetensors jax jaxlib.
EDIT #1: Maybe you can just rename gptq_model-4bit-128g.safetensors to model.safetensors, like for the BERT model: https://huggingface.co/bert-base-uncased/tree/main (here I just downloaded model.safetensors and it works just fine).
EDIT #2: I was not able to run it on my computer; I didn't have enough RAM.
Regards
Hello again,
I tried this code on a machine with 256 GB of RAM and an A100 GPU with 80 GB of VRAM, and I had several errors. One of them was that TensorFlow was not installed; even after I installed it, the code still didn't work.
Can you please help us?
Many thanks!
@mb-datalab2023 Which environment are you using (a local laptop, or a cloud service like Colab), and what is the complete code that you tried?
@vsrinivas
I am working on a private cloud at the company I work for.
Basically, it is a Linux machine with the following specs: 256 GB of RAM and an A100 GPU with 80 GB of VRAM.
@mb-datalab2023 If you have installed and imported the necessary libraries and classes, it should work. As you know, it is difficult to understand the problem unless the code used and the full error message are shared.
Thanks for your answer @vsrinivas.
Yes, of course, I know it is hard for a third party to debug without the full error message or the code.
I'll try to rerun it and provide the needed info for debug.
Regards,
Hi, if it needs that many resources, I wonder whether this Lite model will run on a 24 GB GPU with 128 GB of system RAM?
@elboertjie
Hi, I think it might work on the machine you have.
Actually, as a general rule of thumb, you need X GB of VRAM (GPU memory), where X = number_of_parameters × bytes_per_parameter (parameters are generally float16, i.e. 2 bytes each).
So for Falcon-40B, which has 40B parameters, you need 40B × 2 bytes = 80 GB of VRAM.
For FalconLite2, as far as I know, the parameters have been quantized to 4 bits (0.5 bytes). So in theory, you need 40B × 0.5 bytes = 20 GB of VRAM.
Since you have a 24 GB GPU, I think it's OK.
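As a quick back-of-the-envelope check in Python (the 40B parameter count and the byte sizes are the assumptions stated above):

# Rough rule of thumb: VRAM in GB ~ parameters (in billions) x bytes per parameter
def vram_gb(params_in_billions, bytes_per_param):
    return params_in_billions * bytes_per_param

print(vram_gb(40, 2.0))  # Falcon-40B in float16      -> 80.0 GB
print(vram_gb(40, 0.5))  # FalconLite2, 4-bit weights -> 20.0 GB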
Wish you the best of luck,
Regards