# Model documentation & parameters

**Algorithm Version**: Which model checkpoint to use (checkpoints differ in the dataset they were trained on).

**Scaffolds**: One or multiple scaffolds (or seed molecules), provided as '.'-separated SMILES. If empty, no scaffolds are used.
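As a minimal sketch of this input format: the hypothetical helper `split_scaffolds` below is not part of MoLeR or GT4SD, and it performs no chemical validation. It assumes each scaffold is a single connected fragment, because `.` in SMILES also separates disconnected components.

```python
from typing import List


def split_scaffolds(scaffolds: str) -> List[str]:
    """Split a '.'-separated SMILES string into individual scaffolds.

    Hypothetical helper for illustration only; it does not check that
    each piece is a valid SMILES string.
    """
    # An empty (or whitespace-only) field means "no scaffolds".
    if not scaffolds.strip():
        return []
    # Drop empty pieces caused by stray or trailing dots.
    return [part.strip() for part in scaffolds.split(".") if part.strip()]


print(split_scaffolds("c1ccccc1.CC(=O)N"))  # two scaffolds
print(split_scaffolds(""))                  # no scaffolds
```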

**Number of samples**: How many samples should be generated (between 1 and 50).

**Beam size**: Beam size used in beam search decoding (larger values are slower but tend to give better results).

**Seed**: The random seed used for initialization.
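The parameters listed above can be bundled and range-checked before a request is sent. The following is only a sketch: the class name, field names, default values, and the version string `"v0"` are all assumptions made for illustration, not the actual interface of MoLeR or GT4SD. Only the 1 to 50 range for the number of samples comes from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationParams:
    """Hypothetical container mirroring the documented parameters."""

    algorithm_version: str
    scaffolds: str = ""    # '.'-separated SMILES; empty means no scaffolds
    num_samples: int = 10  # assumed default; documented range is 1 to 50
    beam_size: int = 1     # assumed default; a beam needs at least one entry
    seed: int = 0          # assumed default

    def __post_init__(self) -> None:
        # Enforce the documented bound on the number of samples.
        if not 1 <= self.num_samples <= 50:
            raise ValueError("num_samples must be between 1 and 50")
        if self.beam_size < 1:
            raise ValueError("beam_size must be at least 1")


# "v0" is a placeholder version name, not a real checkpoint identifier.
params = GenerationParams(algorithm_version="v0", num_samples=5, beam_size=3, seed=42)
print(params)
```

Validating up front keeps an out-of-range request from reaching the comparatively slow decoding step.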


# Model card

**Model Details**: MoLeR is a graph-based molecular generative model that can be conditioned (primed) on scaffolds. The model decorates scaffolds with realistic structural motifs.

**Developers**: Krzysztof Maziarz and co-authors from Microsoft Research and Novartis (full reference at bottom).

**Distributors**: Developers' code wrapped and distributed by the GT4SD Team (2023) from IBM Research.

**Model date**: Released around March 2022.

**Model version**: Model provided by original authors, see [their GitHub repo](https://github.com/microsoft/molecule-generation).

**Model type**: An encoder-decoder-based GNN for molecular generation.

**Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**: Trained by the original authors with the default parameters provided [on GitHub](https://github.com/microsoft/molecule-generation).

**Paper or other resource for more information**: [Learning to Extend Molecular Scaffolds with Structural Motifs (ICLR 2022)](https://openreview.net/forum?id=ZTsoE8G3GG).

**License**: MIT

**Where to send questions or comments about the model**: Open an issue on the original authors' [GitHub repository](https://github.com/microsoft/molecule-generation).

**Intended Use. Use cases that were envisioned during development**: Chemical research, in particular drug discovery.

**Primary intended uses/users**: Researchers and computational chemists using the model for model comparison or research exploration purposes.

**Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.

**Factors**: Not applicable.

**Metrics**: Validation loss on decoding correct molecules. Evaluated on several downstream tasks.

**Datasets**: 1.5M drug-like molecules from the GuacaMol benchmark. Fine-tuning on 20 molecular optimization tasks from GuacaMol.

**Ethical Considerations**: Unclear; please consult the original authors in case of questions.

**Caveats and Recommendations**: Unclear; please consult the original authors in case of questions.

Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596).

## Citation

```bib
@inproceedings{maziarz2021learning,
  author={Krzysztof Maziarz and Henry Richard Jackson{-}Flux and Pashmina Cameron and
    Finton Sirockin and Nadine Schneider and Nikolaus Stiefl and Marwin H. S. Segler and Marc Brockschmidt},
  title     = {Learning to Extend Molecular Scaffolds with Structural Motifs},
  booktitle = {The Tenth International Conference on Learning Representations, {ICLR}},
  year      = {2022}
}
```