Unable to fine-tune model with n_hyperopt_trials > 0. cc.validate gets stuck

#587

by Sheyiphunmi - opened 7 days ago

I’m trying to fine-tune my model using the validate function. The code runs correctly when I set n_hyperopt_trials = 0 (direct training, no hyperparameter optimization). However, when I set n_hyperopt_trials > 0, the program starts but never completes. It does not crash or throw an error; it remains stuck, as seen in the attached image.

```python
fineTune = True

all_metrics = cc.validate(
    model_directory=modelPath,
    prepared_input_data_file=f"{saveDir}/{outputPrefix}_labeled_train.dataset",
    id_class_dict_file=f"{saveDir}/{outputPrefix}_id_class_dict.pkl",
    output_directory=saveDir,
    output_prefix=outputPrefix,
    attr_to_split=attr_to_split,
    attr_to_balance=attr_to_balance,
    n_hyperopt_trials=2 if fineTune else 0,  # Set to 0 for direct training without hyperparameter optimization
    save_eval_output=True,
)
```

ctheodoris

Owner 4 days ago

Thank you for your question. Without a more specific error, it's unclear what is causing this issue, but setting up Ray can be quite system-dependent. To help debug, we suggest first confirming that Ray works on your system separately from the Geneformer training. Alternatively, you can use the multitask fine-tuning method provided here (set up as a single task), which uses Optuna for hyperparameter tuning.
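One way to check Ray on its own, as suggested above, is a small smoke test that starts a local Ray instance and runs a few trivial tasks with a timeout, so a silent hang shows up as an explicit failure. This is only a minimal sketch and not part of Geneformer: the names `ray_smoke_test`, `square`, and `timeout_s` are made up for illustration, and it assumes Ray is installed in the same environment used for fine-tuning.

```python
import importlib.util


def ray_smoke_test(timeout_s: float = 60.0) -> str:
    """Run a few trivial Ray tasks and report what happened (hypothetical helper)."""
    if importlib.util.find_spec("ray") is None:
        return "ray is not installed in this environment"

    import ray

    try:
        # Start a fresh local Ray instance, independent of Geneformer.
        ray.init(ignore_reinit_error=True, include_dashboard=False)
        # Shows the CPUs/GPUs Ray detected; missing GPUs here are a useful clue.
        print("Resources seen by Ray:", ray.cluster_resources())

        @ray.remote
        def square(x):
            return x * x

        # The timeout turns a silent hang into a visible GetTimeoutError.
        futures = [square.remote(i) for i in range(4)]
        results = ray.get(futures, timeout=timeout_s)
        if results == [0, 1, 4, 9]:
            return "ok"
        return f"unexpected results: {results}"
    except Exception as exc:
        return f"failed: {type(exc).__name__}: {exc}"
    finally:
        # Safe to call even if ray.init() did not complete.
        ray.shutdown()


if __name__ == "__main__":
    print(ray_smoke_test())
```

If this prints `ok`, basic Ray task scheduling works on the machine, and the hang is more likely related to how the hyperparameter search is configured or to the resources it requests. If it times out or fails, the problem probably lies in the Ray installation or system setup rather than in the fine-tuning code.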

ctheodoris changed discussion status to closed 4 days ago
