---
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft
---
# Base model: [westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)

# Model Card for Model ID

This model is trained on a single-site deep mutational scanning (DMS) dataset and
can be used to predict the fitness score of mutant amino acid sequences of the protein [YAP1_HUMAN](https://www.uniprot.org/uniprotkb/P46937/entry) (Transcriptional coactivator YAP1).

## Protein Function
YAP1 is a transcriptional regulator that can act as both a coactivator and a corepressor. It is the critical downstream regulatory target of the Hippo signaling pathway,
which is crucial for organ size control and tumor suppression because it restricts proliferation and promotes apoptosis.

### Task type
Protein-level regression

### Dataset description
The dataset is from [Deep generative models of genetic variation capture the effects of mutations](https://www.nature.com/articles/s41592-018-0138-4).
It is also available as the [SaprotHub dataset](https://huggingface.co/datasets/SaProtHub/DMS_YAP1_HUMAN).

Each label is the fitness score of a mutant amino acid sequence. Scores are unbounded and can take any real value.
The wild-type sequence has a score of 1, and larger values mean higher fitness.

### Model input type
Amino acid sequence
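
Because the model takes a full amino acid sequence as input, a single-site mutant has to be written out as a complete sequence before scoring. The snippet below is a minimal sketch of that step. The `apply_substitution` helper, the `P3A`-style mutation notation (reference residue, 1-based position, new residue), and the toy sequence are illustrative assumptions, not part of this model's code or dataset format.

```python
def apply_substitution(wild_type: str, mutation: str) -> str:
    """Return the mutant sequence for a substitution such as 'P3A'.

    Assumed notation: reference residue, 1-based position, new residue.
    """
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    if not 1 <= pos <= len(wild_type):
        raise ValueError(f"position {pos} is outside the sequence")
    if wild_type[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {wild_type[pos - 1]}")
    # Replace exactly one residue and keep the rest of the sequence unchanged.
    return wild_type[: pos - 1] + alt + wild_type[pos:]


# Hypothetical toy sequence for illustration only; this is not the YAP1 sequence.
toy_wild_type = "MDPGQ"
toy_mutant = apply_substitution(toy_wild_type, "P3A")
```

Checking the reference residue before substituting catches off-by-one errors in position numbering, which are easy to make when a dataset indexes a domain rather than the full-length protein.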

### Performance
Spearman's ρ: 0.82
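
Spearman's ρ measures how well the predicted scores preserve the rank order of the measured fitness scores. It is the Pearson correlation computed on ranks. The following is a minimal, dependency-free sketch of the metric; the function names and the toy numbers are assumptions for illustration and are not taken from the dataset or from the evaluation code used for this model.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores for illustration only.
measured = [1.0, 0.2, 0.7, 1.3]
predicted = [0.9, 0.1, 0.8, 1.1]
rho = spearman_rho(measured, predicted)
```

Because only the ordering matters, a model can reach a high ρ even if its predicted values are offset or scaled relative to the measured scores.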

### LoRA config
lora_dropout: 0.0

lora_alpha: 16

target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]

modules_to_save: ["classifier"]
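
As a rough sketch, the values above map onto keyword arguments of `peft.LoraConfig`. The `LORA_KWARGS` and `build_lora_config` names are hypothetical. The LoRA rank `r` is not listed in this card, so it is left out here and would fall back to the PEFT default unless passed as an override. The adapter configuration file shipped with the model remains the authoritative source.

```python
# Values copied from the "LoRA config" section of this card.
LORA_KWARGS = {
    "lora_dropout": 0.0,
    "lora_alpha": 16,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}


def build_lora_config(**overrides):
    """Build a peft.LoraConfig from the card's values (requires `peft`).

    Pass e.g. r=... as an override; the rank is not stated in this card.
    """
    from peft import LoraConfig  # imported lazily so the dict is usable without peft

    return LoraConfig(**{**LORA_KWARGS, **overrides})
```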

### Training config
optimizer class: AdamW

betas: (0.9, 0.98)

weight_decay: 0.01

learning rate: 1e-4

epoch: 100

batch size: 2

precision: 16-mixed
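
The optimizer settings above can be sketched with `torch.optim.AdamW` as follows. The `TRAINING_CONFIG` and `build_optimizer` names are hypothetical, and the training loop itself is not shown. The `16-mixed` value has the same form as the `precision` argument of PyTorch Lightning's `Trainer`; that Lightning was used for training is an assumption, since this card does not say so.

```python
# Values copied from the "Training config" section of this card.
TRAINING_CONFIG = {
    "optimizer": "AdamW",
    "betas": (0.9, 0.98),
    "weight_decay": 0.01,
    "lr": 1e-4,
    "max_epochs": 100,
    "batch_size": 2,
    "precision": "16-mixed",  # assumed to be a Lightning-style precision flag
}


def build_optimizer(parameters):
    """Create an AdamW optimizer with the card's hyperparameters (requires `torch`)."""
    import torch  # imported lazily so the dict is usable without torch

    return torch.optim.AdamW(
        parameters,
        lr=TRAINING_CONFIG["lr"],
        betas=TRAINING_CONFIG["betas"],
        weight_decay=TRAINING_CONFIG["weight_decay"],
    )
```

With LoRA, only the adapter weights and the modules listed under `modules_to_save` are typically trainable, so `parameters` would usually be filtered to those with `requires_grad=True`.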