Unable to get inference API working, trust_remote_code needs to be True

#41 opened by medmac01

Hi there,
I successfully fine-tuned falcon-7b and pushed it to the Hub. However, the HF Inference API isn't working. I get this error: "Loading medmac01/moroccan-qa-falcon-7b requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error."

I tried loading the model locally with trust_remote_code=True, and it works fine in a notebook, but I can't get it running on the Inference API. Here is the loading code I use:

import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.bfloat16,
)

peft_model_id = "medmac01/moroccan-qa-falcon-7b"
config = PeftConfig.from_pretrained(peft_model_id)
# Load the quantized base model; Falcon's custom modeling code requires trust_remote_code=True
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    device_map={"": 0},
    quantization_config=bnb_config,  # 4-bit NF4 config defined above
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# Load the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(model, peft_model_id)
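
For completeness, this is roughly how I sanity-check generation locally once the adapter is loaded (the prompt and generation settings below are just placeholders for illustration, not what the Inference API would run):

# Quick local generation check with the PEFT-wrapped model (placeholder prompt and settings)
prompt = "Ask a question about Morocco here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=64,
        pad_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))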
