tm9d8s_fitness / README.md
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft

Base model: westlake-repl/SaProt_35M_AF2

Model Card for Model ID

This model predicts the fitness of mutants of the β-subunit of tryptophan synthase (TrpB), which synthesizes L-tryptophan (Trp) from indole and L-serine (Ser). The parent enzyme is the TrpB variant Tm9D8*, derived from the hyperthermophile Thermotoga maritima. Tm9D8* differs from wild-type TmTrpB by ten amino acid substitutions (P19G, E30G, I69V, K96L, P140L, N167D, I184F, L213P, G228S, and T292S).
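The substitution codes above follow the usual "wild-type residue, 1-based position, new residue" notation. As a minimal sketch of how such codes can be parsed and applied to a sequence, the helpers below (`parse_substitution` and `apply_substitutions`, both hypothetical names that are not part of this model or of SaProt) work on a toy sequence, since the actual TmTrpB sequence is not given in this card.

```python
import re

# The ten substitutions listed in this card (Tm9D8* relative to wild-type TmTrpB).
TM9D8_SUBSTITUTIONS = [
    "P19G", "E30G", "I69V", "K96L", "P140L",
    "N167D", "I184F", "L213P", "G228S", "T292S",
]

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'P19G' into (wild_type, position, mutant)."""
    match = _SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence, codes):
    """Return a copy of `sequence` with each substitution applied.

    Positions are 1-based. The wild-type residue is checked first so that
    a code applied to the wrong sequence fails loudly instead of silently.
    """
    residues = list(sequence)
    for code in codes:
        wild_type, position, mutant = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{code}: expected {wild_type}, found {residues[position - 1]}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


# Toy example on a made-up four-residue sequence (not TrpB).
toy_variant = apply_substitutions("MPEK", ["P2G", "K4L"])
```

Checking the wild-type residue before substituting is a cheap guard against off-by-one numbering, which is a common source of error when moving between numbering schemes.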

Task type

Protein-level regression

Dataset description

The dataset comes from the paper "A combinatorially complete epistatic fitness landscape in an enzyme active site".

The dataset is also available as a SaProtHub dataset.

Each label is the fitness of a variant, measured here as the growth rate of the E. coli strain. The maximum fitness is 1; the closer a value is to 1, the higher the fitness.

Model input type

Amino acid sequence

Performance

test_pearson: 0.93

test_spearman: 0.38
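The two metrics measure different things: Pearson correlation captures linear agreement between predicted and true values, while Spearman correlation only looks at rank order. The sketch below computes both in plain Python on made-up numbers (not data from this model) to show that they can diverge, for example when most labels sit close together and a few are much larger. It is an illustration of the metrics only, not a diagnosis of the scores reported above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: four near-zero labels ranked in reverse by the
# predictions, plus one high-fitness variant that is predicted well.
toy_labels = [0.0, 0.01, 0.02, 0.03, 1.0]
toy_predictions = [0.03, 0.02, 0.01, 0.0, 0.9]

toy_pearson = pearson(toy_labels, toy_predictions)
toy_spearman = spearman(toy_labels, toy_predictions)
```

In this toy case the single well-predicted high value dominates the Pearson score, while the scrambled ordering among the low values pulls the Spearman score down.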

LoRA config

lora_dropout: 0.0

lora_alpha: 16

target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]

modules_to_save: ["classifier"]
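As a rough sketch, the settings above map onto a `peft.LoraConfig` as shown below. The LoRA rank (`r`) is not listed in this card, so it is deliberately left out here and would fall back to the library default; the helper name `build_lora_config` is hypothetical.

```python
# Values copied from the "LoRA config" section of this card.
LORA_KWARGS = {
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    # Trained in full (not via LoRA) and saved together with the adapter.
    "modules_to_save": ["classifier"],
}


def build_lora_config():
    """Build a peft LoraConfig from the values above.

    The rank `r` is not stated in the card, so it is not set here and
    peft's default applies. peft is imported lazily so that the plain
    dict can be inspected without the library installed.
    """
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)
```

With peft installed, the resulting config could be passed to `peft.get_peft_model` together with a base model that exposes a `classifier` head.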

Training config

class: AdamW

betas: (0.9, 0.98)

weight_decay: 0.01

learning rate: 5e-4

epoch: 100

batch size: 100

precision: 16-mixed
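The training settings above could be expressed in PyTorch roughly as follows. This is a sketch under assumptions: the precision string `16-mixed` matches the format used by the PyTorch Lightning `Trainer`, so the trainer arguments assume Lightning was used, which this card does not state. The helper name `build_optimizer` is hypothetical.

```python
# Values copied from the "Training config" section of this card.
OPTIMIZER_KWARGS = {
    "lr": 5e-4,
    "betas": (0.9, 0.98),
    "weight_decay": 0.01,
}

# Assumption: these keys follow the PyTorch Lightning Trainer arguments.
TRAINER_KWARGS = {
    "max_epochs": 100,
    "precision": "16-mixed",
}

BATCH_SIZE = 100


def build_optimizer(parameters):
    """Create an AdamW optimizer with the hyperparameters listed above.

    `parameters` is an iterable of trainable tensors, for example
    (p for p in model.parameters() if p.requires_grad).
    torch is imported lazily so the settings can be read without it.
    """
    import torch

    return torch.optim.AdamW(parameters, **OPTIMIZER_KWARGS)
```

Passing only parameters with `requires_grad=True` keeps the frozen base weights out of the optimizer state when training a LoRA adapter.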