Unable to fine-tune model with n_hyperopt_trials > 0. cc.validate gets stuck

#587
by Sheyiphunmi - opened

I’m trying to fine-tune my model using the validate function. The code runs correctly when I set n_hyperopt_trials = 0 (direct training, no hyperparameter optimization). However, when I set n_hyperopt_trials > 0, the program starts but never completes. It does not crash or throw an error; it remains stuck, as seen in the attached image.

fineTune = True

all_metrics = cc.validate(
    model_directory=modelPath,
    prepared_input_data_file=f"{saveDir}/{outputPrefix}_labeled_train.dataset",
    id_class_dict_file=f"{saveDir}/{outputPrefix}_id_class_dict.pkl",
    output_directory=saveDir,
    output_prefix=outputPrefix,
    attr_to_split=attr_to_split,
    attr_to_balance=attr_to_balance,
    n_hyperopt_trials=2 if fineTune else 0,  # Set to 0 for direct training without hyperparameter optimization
    save_eval_output=True,
)

[Attached image: IMG_3452]

Thank you for your question. Without a more specific error message, it is unclear what is causing this issue, but the hyperparameter optimization step relies on Ray, and setting up Ray can be quite system-dependent. To help debug, we suggest first confirming that Ray works on your system separately from the Geneformer training. Alternatively, you can use the multitask fine-tuning method provided here (set up as a single task), which uses Optuna for hyperparameter tuning.
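As a rough illustration of what "making sure Ray works separately" could look like, the sketch below starts a local Ray instance, runs one trivial remote task with a timeout, and reports the result. The helper name `check_ray` and the 60-second timeout are assumptions made for this example and are not part of Geneformer. It only exercises Ray itself, so a pass here does not guarantee that the hyperparameter search will run.

```python
def _square(x):
    # Trivial workload: enough to prove a Ray worker can start and return a value.
    return x * x


def check_ray(timeout_s=60):
    """Hypothetical helper: report whether a trivial Ray task completes."""
    try:
        import ray
    except Exception as exc:
        return {"status": "ray_not_importable", "detail": repr(exc)}

    try:
        # Start a small local Ray instance; skip the dashboard to reduce moving parts.
        ray.init(num_cpus=1, include_dashboard=False, ignore_reinit_error=True)
        remote_square = ray.remote(_square)
        # The timeout turns a silent hang into a visible error.
        value = ray.get(remote_square.remote(3), timeout=timeout_s)
        resources = ray.cluster_resources()
        status = "ok" if value == 9 else "wrong_result"
        return {"status": status, "detail": resources}
    except Exception as exc:
        return {"status": "error", "detail": repr(exc)}
    finally:
        ray.shutdown()


# Example: print(check_ray())
```

If this reports `error` or times out, the problem likely lies in the Ray installation or system configuration rather than in the `cc.validate` call. The `detail` field on success lists the CPUs and GPUs Ray detected, which is worth comparing against what the machine actually has.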

ctheodoris changed discussion status to closed
