maojiashun committed c41e83d (parent: 914b1be)
Update README.md

README.md CHANGED
@@ -1,3 +1,74 @@
# TransAntivirus

**Transformer-based molecular generative model for antiviral drug design**

## Acknowledgements

We thank the authors of "C5T5: Controllable Generation of Organic Molecules with Transformers" and "IUPAC2Struct: Transformer-based artificial neural networks for the conversion between chemical notations" for releasing their code. The code in this repository is based on their source code releases (https://github.com/dhroth/c5t5 and https://github.com/sergsb/IUPAC2Struct). If you find this code useful, please consider citing their work.
## Requirements

```text
Python==3.6
pandas==1.1.5
numpy==1.19.2
pytorch==1.10.0
pytorch-mutex==1.0
torchaudio==0.10.0
torchtext==0.11.2
torchvision==0.11.1
RDKit==2020.03.3.0
transformers==4.18.0
```

RDKit: https://github.com/rdkit/rdkit
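
To confirm that an environment matches the pinned versions listed above, a small check along these lines may help. This is a sketch, not part of the repository: `parse_pins`, `installed_version`, and `find_mismatches` are hypothetical helper names, the pin list is only a subset, and the package-to-import-name mapping is an assumption (for example, the `pytorch` package is imported as `torch`).

```python
import importlib

# A subset of the pins from the requirements list, for illustration.
PINS = """
pandas==1.1.5
numpy==1.19.2
pytorch==1.10.0
transformers==4.18.0
"""

# Assumed mapping from package name to import name.
IMPORT_NAMES = {"pytorch": "torch"}


def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def installed_version(package):
    """Return the module's __version__, or None if it cannot be imported."""
    module_name = IMPORT_NAMES.get(package, package)
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for packages that do not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Drop local build tags such as '+cu113' before comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(parse_pins(PINS))` returns an empty dict when every listed package is importable at the pinned version; otherwise it maps each offending package to the expected and found versions.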
## Model & data

The generation stage requires trained model files. You can either use the files produced by the training step or download our pre-generated model files from Google Drive:

https://drive.google.com/drive/u/0/folders/1T2CuAo52Auryepr9UZOSB1g6U_i332UY
## Generation

Novel compound generation:

```bash
# traindata_new.csv columns:  smiles|aLogP|IUPACName
# finetunev1_new.csv columns: smiles|aLogP|IUPACName

# Run generation on the fine-tuning dataset; output is redirected to a text file.
python gen_t5_real.py \
    --dataset_dir ./download_pubchem/ \
    --vocab_fn ./vocab/iupac_spm.model \
    --dataset_filename ./finetunev1_new.csv \
    > gen_real_fine_tune_non.txt
```
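
The comments in the block above list three columns for each dataset file. Assuming `|` is the field delimiter and the first row is a header (neither is confirmed here; the `|` may simply separate column names in the comment), a quick sanity check before running generation could look like the sketch below. `check_dataset` is a hypothetical helper, not part of the repository, and the sample rows are illustrative only.

```python
import csv
import io

EXPECTED_COLUMNS = ["smiles", "aLogP", "IUPACName"]


def check_dataset(handle, delimiter="|"):
    """Validate the header and row shapes; return the number of data rows.

    Assumes a header row and a '|' delimiter (both are assumptions).
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {header!r}")
    count = 0
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(EXPECTED_COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(EXPECTED_COLUMNS)} fields, got {len(row)}"
            )
        float(row[1])  # aLogP should parse as a number; raises ValueError otherwise
        count += 1
    return count


# Tiny in-memory example; these rows are illustrative, not taken from the dataset.
sample = io.StringIO(
    "smiles|aLogP|IUPACName\n"
    "CCO|-0.14|ethanol\n"
    "c1ccccc1|1.69|benzene\n"
)
print(check_dataset(sample))  # prints 2
```

For a file on disk, the same function can be called as `check_dataset(open("finetunev1_new.csv", newline=""))`.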
## Model Metrics

### MOSES

Molecular Sets (MOSES) is a benchmarking platform that supports research on machine learning for drug discovery. It implements several popular molecular generation models and provides a set of metrics for evaluating the quality and diversity of generated molecules. MOSES aims to standardize research on molecular generation and to facilitate the sharing and comparison of new models.

https://github.com/molecularsets/moses
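
Uniqueness and novelty are two of the quality and diversity metrics that MOSES reports. The sketch below illustrates the idea behind them on plain strings. It is a simplification and not the MOSES implementation: MOSES works on valid, RDKit-canonicalized SMILES and its exact definitions differ in detail, and the function names here are hypothetical.

```python
def uniqueness(generated):
    """Fraction of generated strings that are distinct."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of distinct generated strings that are absent from the training set."""
    distinct = set(generated)
    if not distinct:
        return 0.0
    return len(distinct - set(training)) / len(distinct)


# Illustrative inputs only; real evaluations need canonical SMILES.
generated = ["CCO", "CCO", "CCN", "c1ccccc1"]
training = ["CCO", "CCC"]
print(uniqueness(generated))         # 3 distinct out of 4 -> 0.75
print(novelty(generated, training))  # 2 of the 3 distinct strings are new
```

Because this sketch compares raw strings, two different SMILES spellings of the same molecule would count as distinct; for reported results, the metrics from the MOSES package itself should be used.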
### QEPPI

QEPPI: Quantitative Estimate of Protein-Protein Interaction targeting drug-likeness.

https://github.com/ohuelab/QEPPI

* Kosugi T, Ohue M. Quantitative estimate index for early-stage screening of compounds targeting protein-protein interactions. International Journal of Molecular Sciences, 22(20): 10925, 2021. doi: 10.3390/ijms222010925

Another QEPPI publication (conference paper):

* Kosugi T, Ohue M. Quantitative estimate of protein-protein interaction targeting drug-likeness. In Proceedings of The 18th IEEE International Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2021), 2021. doi: 10.1109/CIBCB49929.2021.9562931

  © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
## License

Code is released under the GNU Affero General Public License.
## Cite

* Jiashun Mao, Jianmin Wang, Amir Zeb, Kwang-Hwi Cho, Haiyan Jin, Jongwan Kim, Onju Lee, Yunyun Wang, and Kyoung Tai No. "Transformer-Based Molecular Generative Model for Antiviral Drug Design." Journal of Chemical Information and Modeling, 2023. [DOI: 10.1021/acs.jcim.3c00536](https://doi.org/10.1021/acs.jcim.3c00536)
* Daniel Rothchild, Alex Tamkin, Julie Yu, Ujval Misra, and Joseph Gonzalez. "C5T5: Controllable Generation of Organic Molecules with Transformers." arXiv preprint arXiv:2108.10307, 2021.
* Jianmin Wang, Yanyi Chu, Jiashun Mao, Hyeon-Nae Jeon, Haiyan Jin, Amir Zeb, Yuil Jang, Kwang-Hwi Cho, Tao Song, and Kyoung Tai No. "De novo molecular design with deep molecular generative models for PPI inhibitors." Briefings in Bioinformatics, 2022, bbac285. https://doi.org/10.1093/bib/bbac285