AMP GenPept Binary Classifier (ESM-2 650M + LoRA)

ESM-2 650M fine-tuned with LoRA to classify peptide sequences as antimicrobial (AMP) or non-AMP.

Performance

  • F1: 0.883 (88.3%)
  • Accuracy: 0.868 (86.8%)
  • Benchmark: GenPept-Curated-2025 (11K sequences, 80/20 split)
  • Training: 5 epochs, A6000 48GB, ~40 min
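
For readers comparing these numbers with other models, the sketch below shows how F1 and accuracy follow from confusion-matrix counts. `binary_metrics` is a hypothetical helper, and the counts are made-up illustrative values, not this model's confusion matrix.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts only (NOT results reported for this model).
metrics = binary_metrics(tp=90, fp=20, fn=10, tn=80)
print(metrics)
```

F1 ignores true negatives, so it can sit above or below accuracy depending on how errors are distributed between the two classes.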

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

BASE_MODEL = "facebook/esm2_t33_650M_UR50D"
ADAPTER = "null-phnix/amp-genpept-esm2-650m-lora"

# Load the base model with a single-logit head, then attach the LoRA adapter.
model = AutoModelForSequenceClassification.from_pretrained(BASE_MODEL, num_labels=1)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()  # make inference mode explicit (disables dropout)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

def predict_amp(sequence: str) -> float:
    """Return AMP probability for a peptide sequence."""
    inputs = tokenizer(
        sequence,
        return_tensors="pt",
        truncation=True,
        padding="max_length",
        max_length=200,
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    # The head emits one logit; sigmoid maps it to a probability in [0, 1].
    return torch.sigmoid(logits).item()

print(predict_amp("GLFDVIKKVAGALGSLVK"))
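
It can help to sanitise input before calling `predict_amp` and to turn the returned probability into a label. The following is a minimal sketch using only the standard library. The helper names, the 0.5 decision threshold, and the 198-residue limit (assuming the tokenizer spends two of the 200 positions on special tokens) are assumptions, not part of the released model.

```python
CANONICAL_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
MAX_RESIDUES = 198  # assumption: max_length=200 minus two special tokens


def clean_sequence(sequence: str) -> str:
    """Strip whitespace, upper-case, and reject non-canonical residues."""
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    bad = sorted(set(seq) - CANONICAL_AA)
    if bad:
        raise ValueError(f"non-canonical residues: {''.join(bad)}")
    return seq


def will_be_truncated(sequence: str) -> bool:
    """True if the sequence is longer than the assumed residue budget."""
    return len(sequence) > MAX_RESIDUES


def label_from_probability(prob: float, threshold: float = 0.5) -> str:
    """Map a probability to a label; the 0.5 threshold is an assumption."""
    return "AMP" if prob >= threshold else "non-AMP"


seq = clean_sequence(" glfdvikkvagalgslvk ")
print(seq, will_be_truncated(seq), label_from_probability(0.91))
```

The threshold is worth tuning on a validation set if false positives and false negatives carry different costs.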

Architecture

  • Base: ESM-2 650M (33 transformer layers, 1280 hidden dim)
  • Adapter: LoRA r=16, alpha=32, target_modules=["query","value"]
  • Head: Single sigmoid output for binary classification
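
As a rough sense of scale, the number of parameters the adapter adds can be estimated from the figures above. This sketch assumes the query and value projections are square 1280 x 1280 matrices and counts only the LoRA matrices, not the classification head. `lora_param_count` is a hypothetical helper.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one LoRA pair: A has shape (r, d_in), B has shape (d_out, r)."""
    return r * d_in + d_out * r


HIDDEN = 1280          # hidden dimension from the architecture list
LAYERS = 33            # transformer layers
RANK = 16              # LoRA r
ALPHA = 32             # LoRA alpha
TARGETS_PER_LAYER = 2  # "query" and "value"

per_matrix = lora_param_count(HIDDEN, HIDDEN, RANK)
total = per_matrix * TARGETS_PER_LAYER * LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"per adapted matrix: {per_matrix:,}")
print(f"total LoRA parameters: {total:,}")
print(f"scaling (alpha / r): {scaling}")
```

Under these assumptions the adapter adds about 2.7M parameters, a small fraction of the 650M-parameter base model, which is why only the adapter weights need to be distributed.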

Data

Trained on GenPept-Curated-2025, a balanced, leakage-free AMP benchmark.
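
The card does not describe how balance and leakage were verified. As one simple illustration, the sketch below checks class balance and exact-duplicate overlap between two splits. The toy sequences, labels, and helper names are hypothetical, and exact matching is only the weakest form of leakage check; similarity-based clustering is a stricter alternative.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of examples per class label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def exact_overlap(train_seqs, test_seqs):
    """Return sequences present in both splits (case-insensitive exact match)."""
    return {s.upper() for s in train_seqs} & {s.upper() for s in test_seqs}


# Toy data for illustration only (NOT taken from GenPept-Curated-2025).
train = [("GLFDVIKKVAGALGSLVK", 1), ("MKTAYIAKQR", 0),
         ("KWKLFKKIEK", 1), ("SSGSSGAAGS", 0)]
test = [("FLPIIAKLLSGLL", 1), ("MSDNGTEEAP", 0)]

balance = class_balance([label for _, label in train])
overlap = exact_overlap([s for s, _ in train], [s for s, _ in test])

print("train class balance:", balance)
print("sequences shared between splits:", overlap)
```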
