Running MPT-7B with LoRA

#1
by kreabs - opened

I am following this tutorial by Philipp Schmid:
https://www.philschmid.de/fine-tune-flan-t5-peft

But when I swap the FLAN model for either
flashvenom/mpt-7b-base-lora-fix or
mosaicml/mpt-7b,

it doesn't work.

I adjusted these two lines:
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True, trust_remote_code=True)
and added this before the preprocess_function:
tokenizer.pad_token = tokenizer.eos_token
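For reference, here is a rough sketch of how the loading and PEFT setup might look once the tutorial is adapted from a seq2seq model to a causal LM. It is untested against this exact checkpoint; in particular, `target_modules=["Wqkv"]` (MPT's fused attention projection) and the availability of `prepare_model_for_kbit_training` in your `peft` version are assumptions on my part:

```python
# Hedged sketch: adapting the FLAN-T5/PEFT tutorial to a causal LM such as MPT-7B.
# Assumptions (not verified here): MPT's LoRA target module is the fused "Wqkv"
# projection, and your peft version provides prepare_model_for_kbit_training.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

model_id = "mosaicml/mpt-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # MPT's tokenizer ships without a pad token

model = AutoModelForCausalLM.from_pretrained(
    model_id, load_in_8bit=True, trust_remote_code=True
)
model = prepare_model_for_kbit_training(model)  # cast norms/head for stable 8-bit training

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.CAUSAL_LM,  # not SEQ_2_SEQ_LM as in the FLAN-T5 tutorial
    target_modules=["Wqkv"],       # assumption: name of MPT's fused QKV projection
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Note that the tutorial's `AutoModelForSeq2SeqLM`, `DataCollatorForSeq2Seq`, and `Seq2SeqTrainer` all assume an encoder-decoder model; for a causal LM like MPT you would switch to the plain `Trainer` with `DataCollatorForLanguageModeling(tokenizer, mlm=False)` and labels derived from the input ids.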

What else do I have to adjust, or has somebody managed to get a script running?

I'd be happy about any code example or explanation.

Greetings
kreabs
