jannisborn committed on
Commit ccbdbb4 · unverified · 1 Parent(s): 779b3f2
Files changed (2)
  1. model_cards/article.md +33 -12
  2. model_cards/description.md +20 -24
model_cards/article.md CHANGED
@@ -1,21 +1,42 @@
- # MoLeR -- Documentation
+ ### Model card - MoLeR
 
- ## Parameters
+ **Model Details**: MoLeR is a graph-based molecular generative model that can be conditioned (primed) on scaffolds. The model decorates scaffolds with realistic structural motifs.
 
- ### Algorithm Version:
- Which model checkpoint to use (trained on different datasets).
+ **Developers**: Krzysztof Maziarz and co-authors from Microsoft Research and Novartis (full reference at bottom).
 
- ### Scaffolds
- One or multiple scaffolds (or seed molecules), provided as '.'-separated SMILES. If empty, no scaffolds are used.
+ **Distributors**: Developer's code wrapped and distributed by GT4SD Team (2023) from IBM Research.
 
- ### Number of samples:
- How many samples should be generated (between 1 and 50).
+ **Model date**: Released around March 2022.
 
- ### Beam size
- Beam size used in beam search decoding (the higher the slower but better).
+ **Model version**: Model provided by original authors, see [their GitHub repo](https://github.com/microsoft/molecule-generation).
 
- ### Seed
- The random seed used for initialization.
+ **Model type**: An encoder-decoder-based GNN for molecular generation.
+
+ **Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**: Trained by the original authors with the default parameters provided [on GitHub](https://github.com/microsoft/molecule-generation).
+
+ **Paper or other resource for more information**: Learning to Extend Molecular Scaffolds with Structural Motifs (ICLR 2022).
+
+ **License**: MIT
+
+ **Where to send questions or comments about the model**: Open an issue on original author's [GitHub repository](https://github.com/microsoft/molecule-generation).
+
+ **Intended Use. Use cases that were envisioned during development**: Chemical research, in particular drug discovery.
+
+ **Primary intended uses/users**: Researchers and computational chemists using the model for model comparison or research exploration purposes.
+
+ **Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.
+
+ **Factors**: Not applicable.
+
+ **Metrics**: Validation loss on decoding correct molecules. Evaluated on several downstream tasks.
+
+ **Datasets**: 1.5M drug-like molecules from GuacaMol benchmark. Finetuning on 20 molecular optimization tasks from GuacaMol.
+
+ **Ethical Considerations**: Unclear, please consult with original authors in case of questions.
+
+ **Caveats and Recommendations**: Unclear, please consult with original authors in case of questions.
+
+ Model card prototype inspired by [*Mitchell et al. (2019), Proceedings of the Conference on Fairness, Accountability, and Transparency*](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)
 
  ## Citation
 
model_cards/description.md CHANGED
@@ -1,27 +1,23 @@
 
  # MoLeR (MOlecule-LEvel Representation)
 
- <img src="https://raw.githubusercontent.com/GT4SD/gt4sd-core/main/docs/_static/gt4sd_logo.png" alt="logo" width="800">
-
- ### Model card
-
- *Model Details*: MoLeR is a graph-based molecular generative model that can be conditioned (primed) on scaffolds. The model decorates scaffolds with realistic structural motifs.
- *Developers*: Krzysztof Maziarz and co-authors from Microsoft Research and Novartis (full reference at bottom).
- *Distributors*: Developer's code wrapped and distributed by GT4SD Team (2023) from IBM Research.
- *Model date*: Released around March 2022.
- *Model version*: Model provided by original authors, see:
- *Model type*: An encoder-decoder-based GNN for molecular generation.
- *Information about training algorithms, parameters, fairness constraints or other applied approaches, and features*: Trained by the original authors with the default parameters provided [on GitHub](https://github.com/microsoft/molecule-generation).
- *Paper or other resource for more information*: Learning to Extend Molecular Scaffolds with Structural Motifs (ICLR 2022).
- *License*: MIT
- *Where to send questions or comments about the model*: Open an issue on original author's [GitHub repository](https://github.com/microsoft/molecule-generation).
- *Intended Use. Use cases that were envisioned during development*: Chemical research, in particular drug discovery.
- *Primary intended uses/users*: Researchers and computational chemists using the model for model comparison or research exploration purposes.
- *Out-of-scope use cases*: Production-level inference, producing molecules with harmful properties.
- *Factors*: Not applicable.
- *Metrics*: Validation loss on decoding correct molecules. Evaluated on several downstream tasks.
- *Datasets*: 1.5M drug-like molecules from GuacaMol benchmark. Finetuning on 20 molecular optimization tasks from GuacaMol.
- *Ethical Considerations*: Unclear, please consult with original authors in case of questions.
- *Caveats and Recommendations*: Unclear, please consult with original authors in case of questions.
-
- Model card prototype inspired by [*Mitchell et al. (2019), Proceedings of the Conference on Fairness, Accountability, and Transparency*](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)
+ <img align="right" src="https://raw.githubusercontent.com/GT4SD/gt4sd-core/main/docs/_static/gt4sd_logo.png" alt="logo" width="80" >
+ This model is provided and distributed by the **GT4SD** (Generative Toolkit for Scientific Discovery).
+
+
+ ## Model documentation & parameters
+
+ ### Algorithm Version:
+ Which model checkpoint to use (trained on different datasets).
+
+ ### Scaffolds
+ One or multiple scaffolds (or seed molecules), provided as '.'-separated SMILES. If empty, no scaffolds are used.
+
+ ### Number of samples:
+ How many samples should be generated (between 1 and 50).
+
+ ### Beam size
+ Beam size used in beam search decoding (the higher the slower but better).
+
+ ### Seed
+ The random seed used for initialization.
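
Review note: the parameter descriptions added to `model_cards/description.md` imply some input handling that a short sketch may make clearer. The helper names `split_scaffolds` and `validate_params` below are hypothetical and are not part of GT4SD or MoLeR. Only the '.'-separated SMILES format and the 1 to 50 range for the number of samples come from the text. The lower bound on the beam size is an assumption.

```python
from typing import List


def split_scaffolds(scaffolds: str) -> List[str]:
    """Split a '.'-separated SMILES string into individual scaffolds.

    An empty (or whitespace-only) string yields an empty list, which
    corresponds to "no scaffolds are used" in the description.
    """
    return [part.strip() for part in scaffolds.split(".") if part.strip()]


def validate_params(num_samples: int, beam_size: int) -> None:
    """Reject values outside the documented (or assumed) ranges."""
    # Documented range: between 1 and 50 samples.
    if not 1 <= num_samples <= 50:
        raise ValueError("num_samples must be between 1 and 50")
    # Assumption: the text gives no bound, but a beam needs at least one entry.
    if beam_size < 1:
        raise ValueError("beam_size must be at least 1 (assumed bound)")


if __name__ == "__main__":
    # Benzene and ethanol as two example scaffolds.
    print(split_scaffolds("c1ccccc1.CCO"))  # ['c1ccccc1', 'CCO']
    validate_params(num_samples=10, beam_size=1)
```

Splitting on '.' works here because '.' is the SMILES separator for disconnected components, so it cannot occur inside a single connected scaffold.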