Hyperparameter optimization with LoRA on Whisper
Hi, I have the following Python (3.10.12) packages installed:
torch == 2.3.1
torchaudio == 2.3.1
torchvision == 0.18.1
transformers == 4.41.2
and the following specs:
32 GB DDR5 RAM
RTX 4070 Super (12 GB VRAM)
Ryzen 5 7600X
I'm trying to fine-tune whisper-large-v3 locally on my GPU with PEFT LoRA for a transcription task (audio to text), and I want to do hyperparameter tuning following this section of the docs: https://huggingface.co/docs/transformers/hpo_train
Here is the core part of the source code I'm using:
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3", language="italian", task="transcribe")

model = prepare_model_for_kbit_training(model)

# keep gradients flowing into the frozen encoder so the LoRA layers can train
def make_inputs_require_grad(module, input, output):
    output.requires_grad_(True)

model.model.encoder.conv1.register_forward_hook(make_inputs_require_grad)

config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "v_proj"], lora_dropout=0.05, bias="none")
model = get_peft_model(model, config)

forced_decoder_ids = processor.get_decoder_prompt_ids(language="italian", task="transcribe")
model.config.forced_decoder_ids = forced_decoder_ids
model.config.suppress_tokens = []
model.generation_config.language = "it"
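As a sanity check at this point, only the LoRA adapters should be trainable, which PEFT can confirm:

# optional: should report a small trainable-parameter fraction (adapters only)
model.print_trainable_parameters()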
### training arguments
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-lora-hpo",  # required argument; any checkpoint directory works
    # per_device_train_batch_size=16,  # commented out because I want to optimize this param
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,
    learning_rate=1e-3,
    num_train_epochs=7,
    warmup_steps=50,
    eval_strategy="steps",
    fp16=True,
    logging_steps=50,
    generation_max_length=128,
    remove_unused_columns=False,
    label_names=["labels"],
)
import os

from transformers import TrainerCallback, TrainingArguments, TrainerState, TrainerControl
from transformers.trainer_utils import PREFIX_CHECKPOINT_DIR

# This callback saves only the adapter weights and removes the full base-model weights at each checkpoint.
class SavePeftModelCallback(TrainerCallback):
    def on_save(
        self,
        args: TrainingArguments,
        state: TrainerState,
        control: TrainerControl,
        **kwargs,
    ):
        checkpoint_folder = os.path.join(args.output_dir, f"{PREFIX_CHECKPOINT_DIR}-{state.global_step}")
        peft_model_path = os.path.join(checkpoint_folder, "adapter_model")
        kwargs["model"].save_pretrained(peft_model_path)

        # delete the full weights Trainer just saved, keeping only the adapter
        pytorch_model_path = os.path.join(checkpoint_folder, "pytorch_model.bin")
        if os.path.exists(pytorch_model_path):
            os.remove(pytorch_model_path)
        return control
from transformers import Seq2SeqTrainer

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=final_ds["train"],
    eval_dataset=final_ds["val"],
    data_collator=data_collator,
    tokenizer=processor.feature_extractor,
    callbacks=[SavePeftModelCallback],
)
def optuna_hp_space(trial):
    return {
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [8, 16])
    }
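If the batch-size search works, I'd eventually widen the space; I assume it would look something like this (the ranges are placeholders, not tuned values):

def optuna_hp_space_extended(trial):
    return {
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [8, 16]),
        # placeholder log-uniform range around my current 1e-3
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True),
    }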
def compute_obj(metrics):
    return metrics["eval_loss"]
best_trials = trainer.hyperparameter_search(
    direction="minimize",  # single objective, so a plain string instead of a list
    hp_space=optuna_hp_space,
    n_trials=10,
    compute_objective=compute_obj,
)
print(best_trials)
Running this gives the error "RuntimeError: To use hyperparameter search, you need to pass your model through a model_init function."
Now, I've noticed that the tutorial I was following points to the class TrainerHyperParameterMultiObjectOptunaIntegrationTest in this file:
https://github.com/huggingface/transformers/blob/main/tests/trainer/test_trainer.py
but I think the model_init function doesn't fit my case as written, so maybe I need to write custom code. Can someone help me?
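For reference, here is my understanding of what a model_init would have to look like for this setup (a sketch based on the docs, reusing the definitions above; I haven't verified that rebuilding the PEFT model fresh in every trial actually works as intended):

def model_init(trial):
    # hyperparameter_search should call this once per trial, so each trial
    # starts from a fresh base model with fresh LoRA adapters
    base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
    base = prepare_model_for_kbit_training(base)
    base.model.encoder.conv1.register_forward_hook(make_inputs_require_grad)
    lora_config = LoraConfig(r=32, lora_alpha=64, target_modules=["q_proj", "v_proj"],
                             lora_dropout=0.05, bias="none")
    peft_model = get_peft_model(base, lora_config)
    peft_model.config.forced_decoder_ids = forced_decoder_ids
    peft_model.config.suppress_tokens = []
    peft_model.generation_config.language = "it"
    return peft_model

trainer = Seq2SeqTrainer(
    args=training_args,
    model_init=model_init,  # instead of model=model
    train_dataset=final_ds["train"],
    eval_dataset=final_ds["val"],
    data_collator=data_collator,
    tokenizer=processor.feature_extractor,
    callbacks=[SavePeftModelCallback],
)

Is this the intended pattern here, or does PEFT need a different approach?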