BERT_SMILES_LARGE

This model is an 83.5M-parameter RoBERTa model fine-tuned on a dataset of 1.1M SMILES (Simplified Molecular Input Line Entry System) strings for masked language modeling (MLM). It builds on BERT_SMILES, which was fine-tuned on only 50k SMILES.

Evaluation loss: 0.482

Example (morphine):

CN1CC[C@]23[C@@H]4[C@H]1CC5=C2C(=C(C=C5)O)O[C@H]3[C@H](C=C4)O
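As a sketch of how the MLM objective applies to SMILES, the snippet below masks one token of the morphine string and shows (commented out) how the result could be passed to a `fill-mask` pipeline. The repo id and the `<mask>` token are assumptions not stated in this card; substitute the model's actual Hub path and tokenizer mask token.

```python
import random
import re

# Assumed Hub repo id for this model; replace with the real path.
MODEL_ID = "BERT_SMILES_LARGE"

def mask_random_token(smiles: str, mask_token: str = "<mask>", seed: int = 0) -> str:
    """Replace one SMILES token with the mask token.

    Rough tokenization: bracket atoms ([C@H] etc.), two-letter halogens,
    then any single character (atoms, bonds, ring-closure digits, parentheses).
    """
    tokens = re.findall(r"\[[^\]]+\]|Br|Cl|.", smiles)
    i = random.Random(seed).randrange(len(tokens))
    tokens[i] = mask_token
    return "".join(tokens)

morphine = "CN1CC[C@]23[C@@H]4[C@H]1CC5=C2C(=C(C=C5)O)O[C@H]3[C@H](C=C4)O"
masked = mask_random_token(morphine)

# Predicting the masked token with the fine-tuned weights (requires download):
# from transformers import pipeline
# fill = pipeline("fill-mask", model=MODEL_ID)
# print(fill(masked)[0]["token_str"])
```

During pre-training the model sees many such corrupted strings and learns to recover the hidden atom or bond symbol from its chemical context.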

Intended uses & limitations

With further fine-tuning, this model can be used to predict physical or chemical properties of molecules.

Framework versions

  • Transformers 4.37.0.dev0
  • Pytorch 2.1.0+cu121
  • Tokenizers 0.15.0