arxiv:2407.00111

Accurate Prediction of Ligand-Protein Interaction Affinities with Fine-Tuned Small Language Models

Published on Jun 27

Submitted by BFauber on Jul 2

Authors:

Ben Fauber

Abstract

We describe the accurate prediction of ligand-protein interaction (LPI) affinities, also known as drug-target interactions (DTI), with instruction fine-tuned pretrained generative small language models (SLMs). We achieved accurate predictions for a range of affinity values associated with ligand-protein interactions on out-of-sample data in a zero-shot setting. Only the SMILES string of the ligand and the amino acid sequence of the protein were used as the model inputs. Our results demonstrate a clear improvement over machine learning (ML) and free-energy perturbation (FEP+) based methods in accurately predicting a range of ligand-protein interaction affinities, which can be leveraged to further accelerate drug discovery campaigns against challenging therapeutic targets.
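
As a rough illustration of the input format described in the abstract, the Python sketch below assembles a text prompt from only a ligand SMILES string and a protein amino acid sequence. The abstract does not give the actual prompt wording, so the template text, the names `LigandProteinPair`, `build_prompt`, and `build_training_example`, and the representation of the affinity as a plain text label are all assumptions made for illustration, not the author's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LigandProteinPair:
    """The two model inputs named in the abstract."""
    smiles: str    # ligand as a SMILES string
    sequence: str  # protein as a one-letter amino acid sequence


# Hypothetical instruction template; the real prompt wording is not
# stated in the abstract.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Predict the binding affinity of the ligand for the protein.\n"
    "### Ligand SMILES:\n{smiles}\n"
    "### Protein sequence:\n{sequence}\n"
    "### Affinity:\n"
)


def build_prompt(pair: LigandProteinPair) -> str:
    """Format one ligand-protein pair as an inference prompt."""
    smiles = pair.smiles.strip()
    # Drop whitespace and line breaks (e.g. from FASTA wrapping), then upper-case.
    sequence = "".join(pair.sequence.split()).upper()
    if not smiles or not sequence:
        raise ValueError("both a SMILES string and a protein sequence are required")
    return PROMPT_TEMPLATE.format(smiles=smiles, sequence=sequence)


def build_training_example(pair: LigandProteinPair, affinity_label: str) -> str:
    """Append the target label to the prompt for instruction fine-tuning.

    Assumption: the label is a short text string; its real format is unknown.
    """
    return build_prompt(pair) + affinity_label.strip()


if __name__ == "__main__":
    # Aspirin SMILES paired with a made-up peptide fragment (illustrative only).
    example = LigandProteinPair(
        smiles="CC(=O)OC1=CC=CC=C1C(=O)O",
        sequence="MKTAYIAKQR",
    )
    print(build_prompt(example))
```

At inference time the model would be asked to complete the text after the final header; during fine-tuning, `build_training_example` supplies the same prompt with the known label appended.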

Community

BFauber

Paper author Paper submitter 3 days ago

This paper describes the use of fine-tuned small language models to accurately predict biological interactions, which is useful for prioritizing small molecules for progression in drug discovery campaigns. The generality of the method is shown to improve as the size of the fine-tuning data set increases. This work also highlights the importance, and potential business impact, of generative models in prioritizing activities and workstreams outside the traditional text-generation/NLP space.

nielsr

3 days ago

Hi @BFauber, congrats on this work! Are you planning on sharing any artifacts on the hub (e.g. you could upload models and link them to this paper page)? See the following resources:
