BARTReact

BARTReact is the model presented in "BARTReact: SELFIES-Driven Precision in Reaction Modeling" (https://doi.org/10.1016/j.fraope.2024.100106).
It predicts reaction products from reactants represented as SELFIES strings.

Model Details

Model Description

  • Model type: BART
  • Language(s) (NLP): SELFIES

Dataset

The reaction dataset, in SMILES format, is available from Rhea (https://www.rhea-db.org/).
The SMILES strings were converted to SELFIES with the selfies package, available at https://github.com/aspuru-guzik-group/selfies.

How to Get Started with the Model

Use the code below to get started with the model.


from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("DanielFarfan/BARTReact")
model = AutoModelForSeq2SeqLM.from_pretrained("DanielFarfan/BARTReact")


sf_input = tokenizer("[C][C][Branch1][C][C][Branch2][Branch1][#Branch1][C][O][P][=Branch1][C]"\
                     "[=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1]"\
                     "[O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1]"\
                     "[=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O]"\
                     "[Branch1][C][O-1][O-1][C@@H1][Branch1][C][O][C][=Branch1][C][=O][N][C][C][C]"\
                     "[=Branch1][C][=O][N][C][C][S].[C][S][C][C][C][Branch1][C][O][Branch1][#Branch1]"\
                     "[C][C][=Branch1][C][=O][O-1][C][=Branch1][C][=O][O-1].[H+1]", return_tensors="pt")
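# Beam search: keep 5 beams and return the 3 highest-scoring candidates.
molecules = model.generate(input_ids=sf_input["input_ids"],
                           attention_mask=sf_input["attention_mask"],
                           max_length=400,
                           min_length=5,
                           num_return_sequences=3,  # increase (up to num_beams) for more candidates
                           num_beams=5)
# Decode token IDs to text and remove the spaces between SELFIES symbols.
sf_output = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True).replace(" ", "") for g in molecules]
print(sf_output)

Output:
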
['[C][C][=Branch1][C][=O][S][C][C][N][C][=Branch1][C][=O][C][C][N][C][=Branch1][C][=O][C@H1][Branch1][C][O][C][Branch1][C][C][Branch1][C][C][C][O][P][=Branch1][C][=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1][O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1][=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O][Branch1][C][O-1][O-1].[C][S][C][C][C][=Branch1][C][=O][C][=Branch1][C][=O][O-1].[H][O][H]',
 '[C][C][=Branch1][C][=O][S][C][C][N][C][=Branch1][C][=O][C][C][N][C][=Branch1][C][=O][C@H1][Branch1][C][O][C][Branch1][C][C][Branch1][C][C][C][O][P][=Branch1][C][=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1][O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1][=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O][Branch1][C][O-1][O-1].[C][S][C][C][=Branch1][C][=O][C][=Branch1][C][=O][O-1].[H][O][H]',
 '[C][C][Branch1][C][C][Branch2][Branch1][#Branch1][C][O][P][=Branch1][C][=O][Branch1][C][O-1][O][P][=Branch1][C][=O][Branch1][C][O-1][O][C][C@H1][O][C@@H1][Branch1][#C][N][C][=N][C][=C][Ring1][Branch1][N][=C][N][=C][Ring1][=Branch1][N][C@H1][Branch1][C][O][C@@H1][Ring1][S][O][P][=Branch1][C][=O][Branch1][C][O-1][O-1][C@@H1][Branch1][C][O][C][=Branch1][C][=O][N][C][C][C][=Branch1][C][=O][N][C][C][S][C][=Branch1][C][=O][C][C][C][=Branch1][C][=O][O-1].[C][S][C][C][C][=Branch1][C][=O][O-1].[H][O][H]']
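
Each returned string describes all predicted products of one candidate, with molecules separated by ".". A possible way to break a prediction into molecules and individual SELFIES symbols is sketched below, using only the standard library. The helper `split_prediction` and the regular expression are illustrative assumptions, not part of the model's code; the example string is a shortened fragment of the first prediction above.

```python
import re

# A SELFIES symbol is assumed to be any bracketed token, e.g. [C], [=O], [O-1].
SYMBOL = re.compile(r"\[[^\[\]]+\]")


def split_prediction(selfies_string):
    """Split a predicted product string into molecules, each a list of symbols."""
    return [SYMBOL.findall(part) for part in selfies_string.split(".") if part]


prediction = "[C][S][C][C][C][=Branch1][C][=O][C][=Branch1][C][=O][O-1].[H][O][H]"
products = split_prediction(prediction)
print(len(products))  # 2
print(products[1])    # ['[H]', '[O]', '[H]']
```

If the selfies package is installed, its `split_selfies` function offers similar symbol-level splitting for a single molecule.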

Model Card Contact

Daniel Farfán: marcos.daniel.rodriguez@gmail.com
