Error when running AWS train Job

#1
by mikeagb - opened

I get the following error when trying to run AutoModelForCausalLM.from_pretrained from an AWS training job:

ErrorMessage "│ ❱ 454 │ │ raise EnvironmentError( │
│ 455 │ │ │ f"{path_or_repo_id} does not appear to have a file named │
│ 456 │ │ │ f"'https://huggingface.co/{path_or_repo_id}/{revision}' f │
│ 457 │ │ ) │
╰──────────────────────────────────────────────────────────────────────────────╯
OSError: vilsonrodrigues/falcon-7b-sharded does not appear to have a file named
tiiuae/falcon-7b--configuration_RW.py. Checkout
'https://huggingface.co/vilsonrodrigues/falcon-7b-sharded/main' for available
files."

Hello, can you provide the code you tried to use?

Thanks for your reply. This is how I called the model:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "vilsonrodrigues/falcon-7b-sharded",
    quantization_config=quant_config,
    trust_remote_code=True,
    device_map="auto", 
    offload_folder="offload",
    force_download=True,
)

I've been able to load the model this way in the past, but for some reason I'm getting this error from a SageMaker training job using the Hugging Face DLC,
with transformers 4.28, torch 2.0, and Python 3.10.

Try upgrading to:

transformers>=4.30.2

Version 4.28 does not support this.
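
To catch this kind of mismatch before the model load fails, a training script could check the installed transformers version up front. The following is only a minimal sketch: the helper names (`version_tuple`, `meets_minimum`, `installed_transformers_ok`) are hypothetical, and the minimum version is simply the one suggested in this thread.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum version suggested in this thread (not an official requirement).
MIN_TRANSFORMERS = "4.30.2"


def version_tuple(version_string: str) -> tuple:
    """Turn '4.30.2' or '4.31.0.dev0' into a comparable tuple of ints.

    Simplification: parsing stops at the first non-numeric component,
    so pre-release suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_transformers_ok(minimum: str = MIN_TRANSFORMERS) -> bool:
    """Check the transformers package in the current environment, if any."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)
```

Calling `installed_transformers_ok()` at the top of the training script and raising a clear error when it returns False would surface the version problem directly. In a SageMaker job, the usual way to get a newer version is to list `transformers>=4.30.2` in a `requirements.txt` shipped with the training code, assuming the container in use installs that file.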

vilsonrodrigues changed discussion status to closed
