Model documentation & parameters

Algorithm: Which model to use (GCPN or GraphAF).

Algorithm Version: Which model checkpoint to use (trained on different datasets).

Number of samples: How many samples should be generated (between 1 and 50).
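
The three parameters above can be checked before a generation request is submitted. The following is a minimal sketch, not the application's actual code: the `GenerationRequest` class and the constant names are hypothetical, and the version labels are copied from the model cards on this page, so the identifiers the app really expects may differ.

```python
from dataclasses import dataclass

# Values taken from the parameter descriptions and model cards on this page.
# The exact strings the application expects are an assumption.
ALGORITHMS = ("GCPN", "GraphAF")
ALGORITHM_VERSIONS = ("ZINC_250k", "QED", "pLogP")
MIN_SAMPLES, MAX_SAMPLES = 1, 50


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for the three user-facing parameters."""

    algorithm: str
    algorithm_version: str
    number_of_samples: int

    def __post_init__(self) -> None:
        # Reject values outside the documented choices and range.
        if self.algorithm not in ALGORITHMS:
            raise ValueError(f"algorithm must be one of {ALGORITHMS}")
        if self.algorithm_version not in ALGORITHM_VERSIONS:
            raise ValueError(
                f"algorithm_version must be one of {ALGORITHM_VERSIONS}"
            )
        if not MIN_SAMPLES <= self.number_of_samples <= MAX_SAMPLES:
            raise ValueError(
                f"number_of_samples must be between {MIN_SAMPLES} and {MAX_SAMPLES}"
            )


# A valid request: GCPN with the QED-optimized checkpoint, 10 samples.
request = GenerationRequest("GCPN", "QED", 10)
print(request)
```

Validating at construction time means an out-of-range sample count, such as 51, fails with a `ValueError` before any model is loaded.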

Model card -- GCPN

Model Details: GCPN (Graph Convolutional Policy Network) is a graph-based molecular generative model that can be optimized with reinforcement learning (RL) for goal-directed graph generation.

Developers: Jiaxuan You, Bowen Liu and co-authors from Stanford University.

Distributors: Code provided by TorchDrug developers, wrapped and distributed by GT4SD Team (2023) from IBM Research.

Model date: Published in 2018.

Model version: Models trained by the GT4SD team on the tasks provided by the TorchDrug repository (see their tutorial).

  • ZINC_250k: 250,000 drug-like molecules with at most 38 atoms each, taken from ZINC.
  • QED: ZINC dataset, with the model optimized via Proximal Policy Optimization (PPO) to generate molecules with high QED (quantitative estimate of drug-likeness) scores.
  • pLogP: ZINC dataset, with the model optimized via Proximal Policy Optimization (PPO) to generate molecules with high pLogP (penalized logP) scores.

Model type: A graph-based molecular generative model that can be optimized with RL for goal-directed graph generation.

Information about training algorithms, parameters, fairness constraints or other applied approaches, and features: Default parameters as provided in the TorchDrug tutorial.

Paper or other resource for more information: Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation (NeurIPS 2018).

License: TorchDrug: Apache-2.0 license.

Where to send questions or comments about the model: Open an issue on the TorchDrug repository or contact the original authors.

Intended Use. Use cases that were envisioned during development: Chemical research, in particular drug discovery.

Primary intended uses/users: Researchers and computational chemists using the model for model comparison or research exploration purposes.

Out-of-scope use cases: Production-level inference, producing molecules with harmful properties.

Factors: Not applicable.

Metrics: Validation loss on decoding correct molecules.

Datasets: 250,000 drug-like molecules from ZINC (with at most 38 atoms each).

Ethical Considerations: Unclear; please consult the original authors with any questions.

Caveats and Recommendations: Unclear; please consult the original authors with any questions.

Model card prototype inspired by Mitchell et al. (2019)

Citation

@article{you2018graph,
  title={Graph convolutional policy network for goal-directed molecular graph generation},
  author={You, Jiaxuan and Liu, Bowen and Ying, Zhitao and Pande, Vijay and Leskovec, Jure},
  journal={Advances in neural information processing systems},
  volume={31},
  year={2018}
}

Model card -- GraphAF

Model Details: GraphAF is a flow-based autoregressive molecular graph generative model that can be optimized with reinforcement learning (RL) for goal-directed graph generation.

Developers: Chence Shi, Minkai Xu and co-authors from Peking University, Shanghai Jiao Tong University and Mila.

Distributors: Code provided by TorchDrug developers, wrapped and distributed by GT4SD Team (2023) from IBM Research.

Model date: Published in 2020.

Model version: Models trained by the GT4SD team on the tasks provided by the TorchDrug repository (see their tutorial).

  • ZINC_250k: 250,000 drug-like molecules with at most 38 atoms each, taken from ZINC.
  • QED: ZINC dataset, with the model optimized via Proximal Policy Optimization (PPO) to generate molecules with high QED (quantitative estimate of drug-likeness) scores.
  • pLogP: ZINC dataset, with the model optimized via Proximal Policy Optimization (PPO) to generate molecules with high pLogP (penalized logP) scores.

Model type: A flow-based autoregressive graph molecular generative model that can be optimized with RL for goal-directed graph generation.

Information about training algorithms, parameters, fairness constraints or other applied approaches, and features: Default parameters as provided in the TorchDrug tutorial.

Paper or other resource for more information: GraphAF: a flow-based autoregressive model for molecular graph generation (ICLR 2020).

License: TorchDrug: Apache-2.0 license.

Where to send questions or comments about the model: Open an issue on the TorchDrug repository or contact the original authors.

Intended Use. Use cases that were envisioned during development: Chemical research, in particular drug discovery.

Primary intended uses/users: Researchers and computational chemists using the model for model comparison or research exploration purposes.

Out-of-scope use cases: Production-level inference, producing molecules with harmful properties.

Factors: Not applicable.

Metrics: Validation loss on decoding correct molecules.

Datasets: 250,000 drug-like molecules from ZINC (with at most 38 atoms each).

Ethical Considerations: Unclear; please consult the original authors with any questions.

Caveats and Recommendations: Unclear; please consult the original authors with any questions.

Model card prototype inspired by Mitchell et al. (2019)

Citation

@inproceedings{shi2020graphaf,
  author    = {Chence Shi and Minkai Xu and Zhaocheng Zhu and Weinan Zhang and Ming Zhang and Jian Tang},
  title     = {GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation},
  booktitle = {International Conference on Learning Representations, {ICLR} 2020},
  year      = {2020},
  url       = {https://openreview.net/forum?id=S1esMkHYPr}
}