Fine-tuned ESM2 Protein Classifier (pdac_pred_llm)
This repository contains a fine-tuned ESM2 model for protein sequence classification, uploaded to shubhamc-iiitd/pdac_pred_llm. The model is trained to predict binary labels from protein sequences.
Model Description
- Base Model: ESM2-t33-650M-UR50D (Fine-tuned)
- Fine-tuning Task: Binary protein classification.
- Architecture: The model consists of the ESM2 backbone with a linear classification head.
- Input: Protein amino acid sequences.
- Output: Binary classification labels (0 or 1).
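The classification head returns raw scores (logits) rather than labels, so a final step is needed to obtain the 0/1 output described above. The following is a minimal sketch of that step in plain Python, assuming the head produces one logit per class (two values for the binary task). The function names and the example logits are hypothetical and are not part of this repository.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_to_label(logits):
    """Return (predicted_label, class_probabilities) for one sequence."""
    probs = softmax(logits)
    # The predicted label is the index of the most probable class.
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs

# Hypothetical logits for a single sequence; real values come from the classifier.
label, probs = logits_to_label([-0.3, 1.2])
print(label, probs)
```

With a PyTorch tensor of logits, the same result would typically be obtained with `softmax` and `argmax` over the class dimension.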
Repository Contents
- pytorch_model.bin: The trained model weights.
- alphabet.bin: The ESM2 alphabet (used as a tokenizer).
- config.json: Configuration file for the model.
- README.md: This file.
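The loading code in the Usage section reads two keys from config.json: `embedding_dim` and `num_classes`. The sketch below checks that a loaded configuration has both before the classifier is built. The `check_config` helper is hypothetical, and the example values are illustrative: 1280 is the hidden size of ESM2-t33-650M, while 2 classes is an assumption based on the binary task, not a value read from the actual file.

```python
import json

# Keys that the loading code reads from config.json.
REQUIRED_KEYS = ("embedding_dim", "num_classes")

def check_config(config):
    """Raise ValueError if a required key is missing or not a positive integer."""
    for key in REQUIRED_KEYS:
        if key not in config:
            raise ValueError(f"config.json is missing '{key}'")
        value = config[key]
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"'{key}' must be a positive integer, got {value!r}")
    return config

# Illustrative content only; the real file is downloaded from the repository.
example = json.loads('{"embedding_dim": 1280, "num_classes": 2}')
check_config(example)
```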
Usage
Installation
Install the required libraries (the esm module used below is provided by the fair-esm package):
pip install torch fair-esm biopython huggingface_hub
Loading the Model from Hugging Face
import torch
import torch.nn as nn
import esm
from huggingface_hub import hf_hub_download
import json
# Define the model architecture (same as during training)
class ProteinClassifier(nn.Module):
    def __init__(self, esm_model, embedding_dim, num_classes):
        super(ProteinClassifier, self).__init__()
        self.esm_model = esm_model
        self.fc = nn.Linear(embedding_dim, num_classes)

    def forward(self, tokens):
        # Run the ESM2 backbone without tracking gradients
        with torch.no_grad():
            results = self.esm_model(tokens, repr_layers=[33])
        # Average the layer-33 token representations into one vector per sequence
        embeddings = results["representations"][33].mean(1)
        output = self.fc(embeddings)
        return output
# Download the model files from Hugging Face
repo_id = "shubhamc-iiitd/pdac_pred_llm"
model_weights_path = hf_hub_download(repo_id=repo_id, filename="pytorch_model.bin")
alphabet_path = hf_hub_download(repo_id=repo_id, filename="alphabet.bin")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
# Load the ESM2 model (used as backbone)
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
# Load the configuration
with open(config_path, 'r') as f:
    config = json.load(f)
# Initialize the classifier
classifier = ProteinClassifier(model, embedding_dim=config['embedding_dim'], num_classes=config['num_classes'])
# Load the model weights
classifier.load_state_dict(torch.load(model_weights_path, map_location="cpu"))
classifier.eval()
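# Load the alphabet saved with the model. The file holds a pickled Python object,
# so recent PyTorch versions need weights_only=False; only use this with files you trust.
alphabet = torch.load(alphabet_path, weights_only=False)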
batch_converter = alphabet.get_batch_converter()
# Move the classifier (including the ESM2 backbone) to the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
classifier = classifier.to(device)
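The batch converter created above takes a list of (name, sequence) pairs. The sketch below shows one way to prepare that list from raw sequence strings. The `make_batch` helper, the generated names, and the set of accepted residue letters (the 20 standard amino acids plus X, B, Z, U and O) are assumptions for illustration and should be adjusted to the data at hand.

```python
# Assumed set of accepted residue letters: 20 standard amino acids plus X, B, Z, U, O.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY" "XBZUO")

def make_batch(sequences):
    """Turn raw sequence strings into (name, sequence) pairs."""
    batch = []
    for i, raw in enumerate(sequences):
        # Drop whitespace and line breaks, and normalise to upper case.
        seq = "".join(raw.split()).upper()
        if not seq:
            raise ValueError(f"sequence {i} is empty")
        bad = sorted(set(seq) - ALLOWED)
        if bad:
            raise ValueError(f"sequence {i} contains unexpected characters: {bad}")
        batch.append((f"seq{i}", seq))
    return batch

batch = make_batch(["mkt ayiakqr", "GSHMLE"])
print(batch)
```

The resulting list can be passed to batch_converter, which returns the names, the sequence strings, and a token tensor. The token tensor, moved to the same device as the classifier, is the input that the classifier's forward method expects.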