Model Details
This is a fine-tuned ChemBERTa model trained for activity classification of potential NLRP3 inflammasome inhibitors.
ChemBERTaNLRP3 was fine-tuned on the NLRP3 inhibitor data known at the time of training.
Supporting training and data files can be found in this GitHub repository: https://github.com/VitaRin/ChemBERTaNLRP3
The original pre-trained ChemBERTa model from which this model was fine-tuned can be found here: https://huggingface.co/seyonec/ChemBERTa-zinc250k-v1
Example Use
This model can be further fine-tuned on more inhibitor data or used directly in a classification pipeline.
A simple example usage of this model for molecular classification tasks:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline
import pandas as pd
from datasets import Dataset

model_name = "VitaRin/ChemBERTaNLRP3"
# Path to a CSV file with a "text" column containing SMILES strings
smiles_data = ""

# Build a classification pipeline from the fine-tuned model and its tokenizer
pipeline = TextClassificationPipeline(
    model=AutoModelForSequenceClassification.from_pretrained(model_name),
    tokenizer=AutoTokenizer.from_pretrained(model_name),
    device=0,  # first GPU; use device=-1 to run on CPU
)

# Load the molecules and classify them
test_df = pd.read_csv(smiles_data)
test_dataset = Dataset.from_pandas(test_df)
molecules = list(test_dataset["text"])

result = pipeline(molecules)
print(result)
```
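The pipeline returns one dictionary per input, each with a `label` and a `score` key, in the same order as the input list. A minimal sketch of pairing those predictions with their SMILES strings follows. The helper name `pair_predictions` is hypothetical, and the example molecules, label names (`LABEL_0`, `LABEL_1`), and scores are placeholders for illustration only, not actual output of this model; the real label names depend on the model's configuration.

```python
def pair_predictions(molecules, results):
    """Pair each SMILES string with its predicted label and score.

    `results` is assumed to be a list of {"label": ..., "score": ...}
    dictionaries, one per molecule, in input order.
    """
    if len(molecules) != len(results):
        raise ValueError("expected exactly one prediction per molecule")
    return [
        {"smiles": smi, "label": res["label"], "score": res["score"]}
        for smi, res in zip(molecules, results)
    ]


# Placeholder inputs and predictions (NOT real model output)
example_molecules = ["CCO", "c1ccccc1"]
example_results = [
    {"label": "LABEL_1", "score": 0.91},
    {"label": "LABEL_0", "score": 0.77},
]

rows = pair_predictions(example_molecules, example_results)
for row in rows:
    print(f'{row["smiles"]}\t{row["label"]}\t{row["score"]:.2f}')
```

In practice, `molecules` and `result` from the example above could be passed in place of the placeholder lists, and the resulting rows written to a CSV file or loaded into a pandas DataFrame.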