vumichien committed on
Commit
1cdbe3e
1 Parent(s): c2cec9b

Update README.md

Files changed (1)
  1. README.md +18 -4
README.md CHANGED
@@ -4,15 +4,23 @@ library_name: keras
4
 
5
  ## Model description
6
 
7
- More information needed
8
 
9
  ## Intended uses & limitations
10
 
11
- More information needed
12
 
13
  ## Training and evaluation data
14
 
15
- More information needed
16
 
17
  ## Model Plot
18
 
@@ -21,4 +29,10 @@ More information needed
21
 
22
  ![Model Image](./model.png)
23
 
24
- </details>
4
 
5
  ## Model description
6
 
7
+ This repo contains the model and the notebook for the Keras example [Drug Molecule Generation with VAE](https://keras.io/examples/generative/molecule_generation/).
8
+
9
+ Full credits go to [Victor Basu](https://www.linkedin.com/in/victor-basu-520958147/)
10
+
11
+ Reproduced by [Vu Minh Chien](https://www.linkedin.com/in/vumichien/)
12
+
13
+ Motivation: Using a Variational Autoencoder to generate molecules for drug discovery. Automatic chemical design using a data-driven continuous representation of molecules generates new molecules via efficient exploration of open-ended spaces of chemical compounds. The model consists of three components: Encoder, Decoder, and Predictor. The Encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the Decoder converts these continuous vectors back to discrete molecule representations. The Predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations allow the use of gradient-based optimization to efficiently guide the search for optimized functional compounds.
14
+
15
+ ![intro](https://bit.ly/3CtPMzM)
16
 
17
  ## Intended uses & limitations
18
 
19
+ In this example, RDKit is used to transform SMILES strings into molecule objects, from which sets of atoms and bonds are then obtained. SMILES expresses the structure of a given molecule as an ASCII string. The string is a compact encoding that, for smaller molecules, is relatively human-readable. Encoding molecules as strings also facilitates database and web searches for a given molecule. RDKit parses a given SMILES string into a molecule object, which can then be used to compute a large number of molecular properties and features.
20
 
21
  ## Training and evaluation data
22
 
23
+ This tutorial uses a dataset from ZINC, a free database of commercially available compounds for virtual screening. The dataset provides molecules in SMILES representation along with their molecular properties, such as logP (octanol-water partition coefficient), SAS (synthetic accessibility score), and QED (Quantitative Estimate of Drug-likeness).
24
 
25
  ## Model Plot
26
 
 
29
 
30
  ![Model Image](./model.png)
31
 
32
+ </details>
33
+
34
+ ## Output samples
35
+
36
+ [Samples](./samples.png)
37
+
38
+ [Latent spaces](./latent_space_clusters.png)
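
The model description in this change notes that a continuous latent representation allows gradient-based optimization toward molecules with better predicted properties. The following is a minimal, framework-free sketch of that idea only. The quadratic `toy_predictor`, its `target` optimum, and the starting vector are all hypothetical stand-ins, not the trained Predictor or any latent code from this repo.

```python
# Toy illustration only: the "predictor" below is a hypothetical quadratic,
# not the trained property-prediction network from this repository.


def toy_predictor(z, target):
    """Hypothetical property score: maximal (0.0) when z equals target."""
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def toy_predictor_grad(z, target):
    """Analytic gradient of toy_predictor with respect to z."""
    return [-2.0 * (zi - ti) for zi, ti in zip(z, target)]


def optimize_latent(z0, target, learning_rate=0.1, steps=100):
    """Gradient ascent on the latent vector to raise the predicted score."""
    z = list(z0)
    for _ in range(steps):
        grad = toy_predictor_grad(z, target)
        z = [zi + learning_rate * gi for zi, gi in zip(z, grad)]
    return z


target = [1.0, -0.5, 2.0]  # assumed optimum of the toy property
z_start = [0.0, 0.0, 0.0]  # made-up latent code of a starting molecule
z_opt = optimize_latent(z_start, target)

print("score before:", toy_predictor(z_start, target))
print("score after: ", toy_predictor(z_opt, target))
```

In a real setting, the gradient would presumably come from automatic differentiation of the trained Predictor, and the optimized vector would be passed through the Decoder to obtain a candidate molecule.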