---
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft
---

# Base model: [westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)

# Model Card for Model ID

This model predicts the fitness of GB1 protein variants.

### Task type

Protein-level regression

### Dataset description

The dataset is from:

Nicholas C Wu, Lei Dai, C Anders Olson, James O Lloyd-Smith, Ren Sun (2016) Adaptation in protein fitness landscapes is facilitated by indirect paths. eLife 5:e16965. https://doi.org/10.7554/eLife.16965

The label is the fitness of the mutant protein. Each variant's fitness is expressed relative to the wild type, so that the wild type has a fitness of 1. All labels are therefore greater than 0, and a label greater than 1 indicates higher fitness than the wild type.

### Model input type

Amino acid sequence

### Performance

- test_spearman: 0.54
- test_pearson: 0.98

### LoRA config

- lora_dropout: 0.0
- lora_alpha: 16
- target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]
- modules_to_save: ["classifier"]

### Training config

- class: AdamW
- betas: (0.9, 0.98)
- weight_decay: 0.01
- learning rate: 1e-3
- epoch: 20
- batch size: 1000
- precision: 16-mixed
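
The LoRA and training settings listed in this card can be collected into keyword arguments for `peft.LoraConfig` and `torch.optim.AdamW`. The following is a minimal sketch under stated assumptions, not the authors' training script. The LoRA rank is not given in this card, so it is left unset here and `peft` would use its own default, which may differ from the value used in training. The names `lora_kwargs`, `optimizer_kwargs`, and `build_lora_config` are hypothetical and introduced only for illustration.

```python
# Values copied from the "LoRA config" section of this card.
# The LoRA rank (r) is not listed there, so it is deliberately omitted;
# peft then falls back to its default, which may not match the training run.
lora_kwargs = {
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}

# Values copied from the "Training config" section (optimizer class: AdamW).
optimizer_kwargs = {
    "lr": 1e-3,
    "betas": (0.9, 0.98),
    "weight_decay": 0.01,
}


def build_lora_config():
    """Return a peft LoraConfig built from lora_kwargs, or None if peft is missing."""
    try:
        from peft import LoraConfig
    except ImportError:
        return None
    return LoraConfig(**lora_kwargs)


lora_config = build_lora_config()
```

With PyTorch installed, the optimizer settings could be applied as `torch.optim.AdamW(model.parameters(), **optimizer_kwargs)`, where `model` stands for the LoRA-wrapped model and is not defined in this sketch. The epoch count, batch size, and `16-mixed` precision are trainer-level settings and are not covered here.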