Running MPT-7B with LoRA
#1 by kreabs - opened
I am following this tutorial by Philipp Schmid:
https://www.philschmid.de/fine-tune-flan-t5-peft
But when I swap the FLAN model for either:
flashvenom/mpt-7b-base-lora-fix or
mosaicml/mpt-7b
it doesn't work.
I adjusted these two lines:
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_8bit=True, trust_remote_code=True)
and added this before the preprocess_function:
tokenizer.pad_token = tokenizer.eos_token
What else do I have to adjust, or has somebody managed to get a script running?
I'd be happy for any code example or explanation.
Greetings
kreabs