Which is the correct way to store the adapter after PEFT fine-tuning

#42
by Pradeep1995 - opened

I am fine-tuning the Mistral model with the following configuration:

from transformers import TrainingArguments
from trl import SFTTrainer

training_arguments = TrainingArguments(
    output_dir=output_dir,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_strategy="steps",
    logging_steps=10,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=13000,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
)
trainer = SFTTrainer(
    model=peft_model,
    train_dataset=data,
    peft_config=peft_config,
    dataset_text_field="column name",  # placeholder for the dataset's text column
    max_seq_length=3000,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)
trainer.train()
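
For context, peft_model, peft_config, tokenizer, and data are created earlier in my script. A rough sketch of that setup (the base model name, dataset, and LoRA hyperparameters below are placeholders, not my exact values):

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(base_model_name)

# placeholder LoRA settings; rank, alpha and target modules differ per setup
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
)
peft_model = get_peft_model(base_model, peft_config)

# placeholder dataset with a single text column
data = load_dataset("json", data_files="train.json", split="train")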

During training, multiple checkpoint folders are written to the specified output directory output_dir.

Once training is over, I can save the model using

trainer.save_model()

I can also save the final model using

trainer.model.save_pretrained("path")

So I am a bit confused. Which is the correct way to store the adapter after PEFT-based LoRA fine-tuning?

Is it:
1 - take the checkpoint folder with the lowest loss from output_dir,
or
2 - save the adapter using

trainer.save_model()

or
3 - save it with

trainer.model.save_pretrained("path")
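
For what it's worth, whichever directory I end up with, I plan to load the adapter back onto the base model for inference, roughly like this (a minimal sketch; the base model name and adapter path are placeholders), so I mainly want to know which of the three artifacts is the one to keep:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
adapter_dir = "path"  # directory produced by one of the three options above

base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base_model, adapter_dir)  # attach the saved LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))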
