This model is superseded by https://github.com/ORNL/affinity_pred
jglaser/protein-ligand-mlp-2
This is a sentence-transformers model: it maps pairs of protein sequences and chemical sequences (canonical SMILES) onto binding affinities (pIC50 values).
Each member of the ensemble was trained with a different random seed, so you can treat the members' predictions as independent samples when estimating the uncertainty of a prediction.
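As a minimal sketch of the ensemble idea above, the snippet below combines per-member predictions for one protein-ligand pair into a mean and a sample standard deviation. The function name `ensemble_estimate` and the numbers are hypothetical placeholders, not outputs of the actual models; in practice each value would come from one ensemble member.

```python
from statistics import mean, stdev


def ensemble_estimate(member_predictions):
    """Return (mean, sample standard deviation) of per-member pIC50 predictions.

    Hypothetical helper: each entry is assumed to be the prediction of one
    ensemble member for the same protein-ligand pair.
    """
    if len(member_predictions) < 2:
        raise ValueError("need predictions from at least two ensemble members")
    return mean(member_predictions), stdev(member_predictions)


# Made-up pIC50 predictions from five members, for illustration only.
example_predictions = [6.0, 6.5, 5.5, 6.25, 5.75]
center, spread = ensemble_estimate(example_predictions)
print(f"pIC50 = {center:.2f} +/- {spread:.2f}")
```

The sample standard deviation is used here because the members are treated as a small sample of possible training runs; other spread measures would work equally well.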
Usage (Sentence-Transformers)
This model is used through sentence-transformers. The install command below pulls the enable_mixed branch of a fork; the stock package install is left commented out:
#pip install -U sentence-transformers
pip install git+https://github.com/jglaser/sentence-transformers.git@enable_mixed
Then you can use the model like this:
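from sentence_transformers import SentenceTransformer
# One protein sequence paired with one ligand (canonical SMILES)
sentences = [{'protein': ["SEQVENCE"], 'ligand': ["c1ccccc1"]}]
model = SentenceTransformer('jglaser/protein-ligand-mlp-2')
# Per the description above, the output holds the predicted pIC50 for each pair
embeddings = model.encode(sentences)
print(embeddings)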
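Since the model description says the predictions are pIC50 values, it can help to convert them to concentrations. The sketch below uses the standard definition pIC50 = -log10(IC50 in mol/L); the helper names `pic50_to_ic50_nm` and `ic50_nm_to_pic50` are hypothetical and not part of sentence-transformers or this model.

```python
import math


def pic50_to_ic50_nm(pic50):
    """Convert a pIC50 value to an IC50 in nanomolar.

    Uses pIC50 = -log10(IC50 [mol/L]), so IC50 [nM] = 10 ** (9 - pIC50).
    """
    return 10.0 ** (9.0 - pic50)


def ic50_nm_to_pic50(ic50_nm):
    """Inverse conversion: IC50 in nanomolar back to pIC50."""
    return 9.0 - math.log10(ic50_nm)


# A pIC50 of 6 corresponds to 1 micromolar, i.e. 1000 nM.
print(pic50_to_ic50_nm(6.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50.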
Evaluation Results
Full Model Architecture
SentenceTransformer(
  (0): Asym(
    (protein-0): Transformer({'max_seq_length': 2048, 'do_lower_case': False}) with Transformer model: BertModel
    (protein-1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
    (protein-2): Dense({'in_features': 1024, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
    (ligand-0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
    (ligand-1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
    (ligand-2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  )
  (1): Dense({'in_features': 1792, 'out_features': 1000, 'bias': True, 'activation_function': 'torch.nn.modules.activation.GELU'})
  (2): Dense({'in_features': 1000, 'out_features': 1000, 'bias': True, 'activation_function': 'torch.nn.modules.activation.GELU'})
  (3): Dense({'in_features': 1000, 'out_features': 1000, 'bias': True, 'activation_function': 'torch.nn.modules.activation.GELU'})
  (4): Dense({'in_features': 1000, 'out_features': 1, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (5): Dense({'in_features': 1, 'out_features': 1, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
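The printout above suggests that the two branch outputs are joined before the dense head: 1024 protein features plus 768 ligand features give the 1792 inputs of layer (1). The following sketch only checks that arithmetic and counts the head's parameters from the listed layer sizes. That the join is a concatenation is an assumption based on the dimensions, and the names used are illustrative.

```python
# Output widths of the two Asym branches, read off the printout.
PROTEIN_DIM = 1024
LIGAND_DIM = 768

# (in_features, out_features) of Dense layers (1) through (5); all have bias.
HEAD_LAYERS = [(1792, 1000), (1000, 1000), (1000, 1000), (1000, 1), (1, 1)]


def dense_param_count(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


# Assumption: the branch outputs are concatenated, which matches 1792 inputs.
assert PROTEIN_DIM + LIGAND_DIM == HEAD_LAYERS[0][0]

# Each layer's output width must equal the next layer's input width.
for (_, out_f), (in_f, _) in zip(HEAD_LAYERS, HEAD_LAYERS[1:]):
    assert out_f == in_f

head_params = sum(dense_param_count(i, o) for i, o in HEAD_LAYERS)
print(head_params)
```

The count covers only the dense head, layers (1) through (5); the two BertModel encoders and the branch-level Dense layers are not included.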
Citing & Authors
- Andrew E Blanchard
- John Gounley
- Debsindhu Bhowmik
- Mayanka Chandra Shekar
- Isaac Lyngaas
- Shang Gao
- Junqi Yin
- Aristeidis Tsaris
- Feiyi Wang
- Jens Glaser
Find more information in our bioRxiv preprint