ProtBert-BFD finetuned on Rosetta 20AA dataset

This model is ProtBert-BFD finetuned to predict the Rosetta fold energy of a protein sequence. It was finetuned on a dataset of 100k sequences, each 20 amino acids long (20AA).

Current model in this repo: prot_bert_bfd-finetuned-032722_1752

Performance

  • 20AA sequences (1k eval set):
    Metrics: 'mae': 0.090115, 'r2': 0.991208, 'mse': 0.013034, 'rmse': 0.114165

  • 40AA sequences (10k eval set):
    Metrics: 'mae': 0.537456, 'r2': 0.659122, 'mse': 0.448607, 'rmse': 0.669781

  • 60AA sequences (10k eval set):
    Metrics: 'mae': 0.629267, 'r2': 0.506747, 'mse': 0.622476, 'rmse': 0.788972
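
For readers who want to compute the same four metrics on their own predictions, here is a minimal sketch using the standard definitions. The function name `regression_metrics` and the toy numbers are illustrative assumptions, not the evaluation script used for this model. The reported values are at least consistent with these definitions: each 'rmse' above matches the square root of the corresponding 'mse'.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MAE, R^2, MSE and RMSE using the standard textbook definitions.

    Hypothetical helper for illustration; not the author's evaluation code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R^2 = 1 - SS_res / SS_tot; undefined when all targets are identical.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "r2": r2, "mse": mse, "rmse": rmse}


# Toy energies (made-up numbers, only to exercise the function).
toy_true = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.5, 2.0, 2.5, 4.0]
toy_metrics = regression_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Note that R2 is measured relative to the variance of the targets in each eval set, so the 20AA, 40AA and 60AA values are best compared alongside MAE and RMSE rather than on their own.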

prot_bert_bfd from ProtTrans

The starting pretrained model, prot_bert_bfd, is from ProtTrans. It was pretrained on 2.1 billion protein sequences from BFD using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository.
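
The ProtTrans prot_bert_bfd tokenizer expects uppercase amino-acid letters separated by single spaces, with the rare residues U, Z, O and B mapped to X. The sketch below shows that input formatting, assuming this finetuned checkpoint keeps the original tokenizer. The helper name `format_for_protbert` and the example sequence are hypothetical.

```python
import re


def format_for_protbert(sequence: str) -> str:
    """Format a raw amino-acid string the way prot_bert_bfd expects its input.

    Hypothetical helper; assumes the finetuned model reuses the ProtTrans tokenizer.
    """
    # Drop any whitespace and normalise to uppercase one-letter codes.
    cleaned = re.sub(r"\s+", "", sequence).upper()
    # Map rare/ambiguous residues to X, following the ProtTrans convention.
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    # Separate residues with spaces so each one becomes its own token.
    return " ".join(cleaned)


# Made-up 20AA sequence, for illustration only.
example = format_for_protbert("mktayiakqrqisfvkshfs")
print(example)
```

The resulting string can then be passed to the tokenizer loaded for this checkpoint. How the regression head is exposed is not stated in this card, so the model-loading step is left out here.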

Created by Ladislav Rampasek
