Generating mutations at specific positions

#15
by Aviv - opened

Hi,
First of all - thank you very much for sharing ProtGPT2.
I read the article as part of my thesis, and started playing with the model.

My question:
Is it possible to define which position(s) in a natural sequence ProtGPT2 should change? If yes, how?
For example:
I want to change only the 3rd position in the sequence, so that:
Natural seq: "DQSV..."
Possible generated mutant: "DQGV..."

Best Regards,
Aviv.

Hi Aviv,

Sorry, I had missed your message. I’m afraid it is not possible to directly change specific residues. What you could do, though, is input the first residues and see which tokens the model considers most likely to come next. This won’t directly mutate your sequence, but it would tell you which residue is most likely after a given set of residues. I haven’t tried this myself, but it should be possible. Let me know how it goes!
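
A minimal sketch of that idea, untested against the real model: feed a prefix, take the logits at the last position, and rank the next tokens by probability. It assumes the Hugging Face model id `nferruz/ProtGPT2` and the `transformers` and `torch` packages; the function names `top_k_from_logits` and `next_token_candidates` are hypothetical helpers, not part of any library. Note that ProtGPT2, as far as I know, uses a BPE vocabulary, so a predicted token may span several residues rather than exactly one.

```python
import math


def top_k_from_logits(logits, k=5):
    """Softmax a list of logits and return the k most probable (index, prob) pairs."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(i, probs[i]) for i in ranked]


def next_token_candidates(prefix, model_name="nferruz/ProtGPT2", k=5):
    """Return the k most likely next tokens after `prefix` as (token_text, prob).

    The model id is an assumption; heavy imports are done lazily so the
    helper above can be used without them.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(prefix, return_tensors="pt")
    with torch.no_grad():
        # logits has shape (batch, sequence_length, vocab_size);
        # take the distribution that follows the last prefix token
        last_logits = model(**inputs).logits[0, -1]

    top = top_k_from_logits(last_logits.tolist(), k=k)
    return [(tokenizer.decode([idx]), prob) for idx, prob in top]


if __name__ == "__main__":
    # "DQ" is the prefix before the 3rd position of the example "DQSV..."
    for token, prob in next_token_candidates("DQ"):
        print(repr(token), round(prob, 4))
```

Because tokens can cover more than one residue, the first character of each candidate token is what corresponds to the position of interest; the way the prefix itself gets tokenized can also differ from how the full sequence would be split.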

Noelia

Hi Noelia,
I want to generate several sequences with "Cys" as the first residue. I wonder if it is possible to fine-tune the model with sequences like that. Or are there any other ways?

Thanks!
Gandi.

Hi Gandi11,

Yes, it would be possible. Find a dataset of sequences that start with 'C' and fine-tune the model. It should then most likely generate sequences that follow that pattern.
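
As a rough sketch of the dataset preparation step only, the helper below filters a list of sequences down to those starting with 'C' and writes them as plain text for causal language model fine-tuning. The function name `build_training_text`, the `<|endoftext|>` separator placed before each sequence, and the 60-residue line width are assumptions about the expected training format and should be checked against the ProtGPT2 model card before use.

```python
def build_training_text(sequences, first_residue="C", width=60,
                        separator="<|endoftext|>"):
    """Keep sequences starting with `first_residue` and format them as text.

    Each kept sequence is preceded by `separator` and wrapped to `width`
    residues per line. Both conventions are assumptions, not confirmed here.
    """
    records = []
    for seq in sequences:
        seq = seq.strip().upper()
        if not seq.startswith(first_residue):
            continue  # drop sequences that do not follow the desired pattern
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
        records.append(separator + "\n" + "\n".join(lines))
    if not records:
        return ""
    return "\n".join(records) + "\n"


if __name__ == "__main__":
    # Toy sequences for illustration only, not real proteins.
    toy = ["CAAAGG", "MKVLAA", "cddeee"]
    text = build_training_text(toy, width=4)
    with open("train_cys_first.txt", "w") as handle:
        handle.write(text)
    print(text)
```

The resulting text file could then be passed to a standard causal language modeling fine-tuning script. A model fine-tuned this way is only biased toward the pattern, so generated sequences should still be filtered for a leading 'C'.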

Best wishes
