BARTReact

BARTReact is the model presented in "BARTReact: SELFIES-Driven Precision in Reaction Modeling" (https://doi.org/10.1016/j.fraope.2024.100106).
It predicts reaction products from reactants represented as SELFIES strings.

Model Details

Model Description

  • Model type: BART
  • Language(s) (NLP): SELFIES

Dataset

The reaction dataset, in SMILES format, is available at https://www.rhea-db.org/.
The SMILES strings were converted to SELFIES with the selfies package, available at https://github.com/aspuru-guzik-group/selfies.

How to Get Started with the Model

Use the code below to get started with the model.


from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("DanielFarfan/BARTReact")
model = AutoModelForSeq2SeqLM.from_pretrained("DanielFarfan/BARTReact")


sf_input = tokenizer("[C][C][Branch1][C][C][Branch2][Branch1][#Branch1][C][O][P][=Branch1][C]"\
                     "[=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1]"\
                     "[O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1]"\
                     "[=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O]"\
                     "[Branch1][C][O-1][O-1][C@@H1][Branch1][C][O][C][=Branch1][C][=O][N][C][C][C]"\
                     "[=Branch1][C][=O][N][C][C][S].[C][S][C][C][C][Branch1][C][O][Branch1][#Branch1]"\
                     "[C][C][=Branch1][C][=O][O-1][C][=Branch1][C][=O][O-1].[H+1]", return_tensors="pt")
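# Beam search: explore 5 beams and return the 3 highest-scoring candidates.
molecules = model.generate(input_ids=sf_input["input_ids"],
                           attention_mask=sf_input["attention_mask"],
                           max_length=400,
                           min_length=5,
                           num_return_sequences=3,  # raise this (up to num_beams) to get more candidates
                           num_beams=5)
# Decode each candidate and strip any spaces from the decoded SELFIES string.
sf_output = [
    tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True).replace(" ", "")
    for g in molecules
]
print(sf_output)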

['[C][C][=Branch1][C][=O][S][C][C][N][C][=Branch1][C][=O][C][C][N][C][=Branch1][C][=O][C@H1][Branch1][C][O][C][Branch1][C][C][Branch1][C][C][C][O][P][=Branch1][C][=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1][O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1][=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O][Branch1][C][O-1][O-1].[C][S][C][C][C][=Branch1][C][=O][C][=Branch1][C][=O][O-1].[H][O][H]',
 '[C][C][=Branch1][C][=O][S][C][C][N][C][=Branch1][C][=O][C][C][N][C][=Branch1][C][=O][C@H1][Branch1][C][O][C][Branch1][C][C][Branch1][C][C][C][O][P][=Branch1][C][=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1][O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1][=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O][Branch1][C][O-1][O-1].[C][S][C][C][=Branch1][C][=O][C][=Branch1][C][=O][O-1].[H][O][H]',
 '[C][C][Branch1][C][C][Branch2][Branch1][#Branch1][C][O][P][=Branch1][C][=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1][O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1][=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O][Branch1][C][O-1][O-1][C@@H1][Branch1][C][O][C][=Branch1][C][=O][N][C][C][C][=Branch1][C][=O][N][C][C][S][C][=Branch1][C][=O][C][C][C][=Branch1][C][=O][O-1].[C][S][C][C][C][=Branch1][C][=O][O-1].[H][O][H]']
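
Each predicted string holds several product molecules separated by ".". A small post-processing sketch, using only the standard library, splits a prediction into molecules and counts the SELFIES symbols in each. The helper names `split_products` and `count_symbols` are hypothetical and not part of the model or of the selfies package; the example string is a fragment taken from the first prediction above.

```python
import re
from typing import List

# Matches one SELFIES symbol, i.e. a bracketed token such as [C] or [=Branch1].
SELFIES_SYMBOL = re.compile(r"\[[^\]]*\]")


def split_products(prediction: str) -> List[str]:
    """Split a multi-molecule SELFIES string on '.' into one string per molecule."""
    return [part for part in prediction.split(".") if part]


def count_symbols(molecule: str) -> int:
    """Count the bracketed SELFIES symbols in a single molecule."""
    return len(SELFIES_SYMBOL.findall(molecule))


example = "[C][S][C][C][C][=Branch1][C][=O][C][=Branch1][C][=O][O-1].[H][O][H]"
for molecule in split_products(example):
    print(count_symbols(molecule), molecule)
```

Comparing the per-molecule lists across the returned candidates is one quick way to see where the beams disagree.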

Model Card Contact

Daniel Farfán: marcos.daniel.rodriguez@gmail.com
