---
tags:
- generated_from_trainer
model-index:
- name: SMILES_BERT
  results: []
widget:
- text: O=C([C@@H](c1ccc(cc1)O)N)[C@@H]1C(=O)N2[C@@H]1SC([C@@H]2C(=O)O)(C)C
pipeline_tag: fill-mask
---

# SMILES_BERT

A BERT model trained on 50,000 SMILES strings with a masked language modeling (MLM) objective.

Example: Amoxicillin
```
O=C([C@@H](c1ccc(cc1)O)N)N[C@@H]1C(=O)N2[C@@H]1SC([C@@H]2C(=O)O)(C)C
```

## Model description

This is a BERT model trained on 50k SMILES strings. The SMILES were sourced from BindingDB, so the compounds are ones reported to bind to certain proteins with some affinity. The model is meant to serve as a pretrained base that can be fine-tuned for downstream tasks where SMILES data is useful.

## Intended uses & limitations

The intended use is fine-tuning on tasks that take SMILES as input, such as predicting physical properties, chemical activity, or biological activity.

### Training results

Training loss: 0.9446

Further evaluation is needed.

### Framework versions

- Transformers 4.37.0.dev0
- Pytorch 2.1.0+cu121
- Tokenizers 0.15.0
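Since the metadata declares `pipeline_tag: fill-mask`, a masked-token query could be sketched roughly as follows. This is a minimal sketch under several assumptions: the helper names `mask_smiles` and `predict_masked` are hypothetical, the mask token is assumed to be BERT's default `[MASK]`, and the Hub repository id is not stated in this card, so it has to be passed in by the caller. The card also does not describe the tokenizer, so masking a single character may not correspond to exactly one token.

```python
AMOXICILLIN = "O=C([C@@H](c1ccc(cc1)O)N)N[C@@H]1C(=O)N2[C@@H]1SC([C@@H]2C(=O)O)(C)C"


def mask_smiles(smiles: str, start: int, length: int = 1,
                mask_token: str = "[MASK]") -> str:
    """Replace `length` characters beginning at `start` with the mask token.

    Hypothetical helper: `[MASK]` is assumed, not confirmed by the card.
    """
    if start < 0 or length < 1 or start + length > len(smiles):
        raise ValueError("mask span is outside the SMILES string")
    return smiles[:start] + mask_token + smiles[start + length:]


def predict_masked(masked_smiles: str, model_id: str, top_k: int = 5):
    """Return (token, score) pairs for the masked position.

    `model_id` is whatever Hub repository hosts this checkpoint; the card
    does not state it, so no default is provided here.
    """
    # Imported lazily so the masking helper works without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    return [(p["token_str"], p["score"]) for p in fill(masked_smiles, top_k=top_k)]


# Mask the amide nitrogen that links the side chain to the beta-lactam ring.
amide_n = AMOXICILLIN.index(")N[") + 1
masked = mask_smiles(AMOXICILLIN, amide_n)
print(masked)
```

Calling `predict_masked(masked, model_id)` with the actual repository id would then download the checkpoint and return the top candidates for the masked position; a well-trained model would be expected to rank `N` highly there, although the card reports no such evaluation.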