Caduceus for Transfer Learning

#1
by Sethulakshmi - opened

I would like to use the Caduceus model as a pre-trained model to detect whether an input DNA sequence in FASTA format is affected by a genetic disease. I have very few affected and unaffected samples for each gene related to the disease. I am having difficulty compiling the model because of its compile function.

image.png

Kuleshov Group org

Our model was trained with PyTorch, but it looks like you are using Keras. I don't think our model will be compatible with this code snippet.

Can you suggest how I could include the model for transfer learning and train it a bit on my own FASTA dataset?

Kuleshov Group org

Are you able to use PyTorch? If so, you can load the model from HF using the steps in the README here, e.g.

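from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "kuleshov-group/caduceus-ps_seqlen-131k_d_model-256_n_layer-16"
# trust_remote_code=True lets transformers run the custom model code hosted with the checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(model_name, trust_remote_code=True)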

Then you can train the model within your own training loop using your dataset.
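
As a starting point for the dataset side, here is a minimal sketch of turning FASTA files into labeled examples. It assumes each file holds sequences of one class (affected or unaffected); the helper names `read_fasta` and `build_examples`, the file names, and the sequences are all made up for illustration and are not part of the Caduceus codebase.

```python
from pathlib import Path
import tempfile


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence lines may be wrapped; upper-case and collect them.
                chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def build_examples(affected_paths, unaffected_paths):
    """Return (sequence, label) pairs; label 1 = affected, 0 = unaffected."""
    examples = []
    for label, paths in ((1, affected_paths), (0, unaffected_paths)):
        for path in paths:
            for _, seq in read_fasta(path):
                examples.append((seq, label))
    return examples


# Tiny demonstration with made-up sequences written to temporary files.
with tempfile.TemporaryDirectory() as tmp:
    affected = Path(tmp) / "affected.fasta"
    unaffected = Path(tmp) / "unaffected.fasta"
    affected.write_text(">sample1\nACGT\nTTGA\n>sample2\nggca\n")
    unaffected.write_text(">control1\nACGTAC\n")
    examples = build_examples([affected], [unaffected])

print(examples)
```

Each sequence can then be passed through the tokenizer (for example `tokenizer(seq)["input_ids"]`) inside a PyTorch `Dataset`. Since the loaded checkpoint has a masked-language-model head, one common approach for a yes/no prediction is to add a small classification layer on top of the model's hidden states, though that choice is an assumption here and not something this thread prescribes.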

Let me know if that helps clarify things.
