---
title: README
emoji: ๐
colorFrom: green
colorTo: yellow
sdk: static
pinned: false
---
# MISATO - Machine learning dataset of protein-ligand complexes for structure-based drug discovery
## Where we are:
- Quantum Mechanics: 19443 ligands, curated and refined
- Molecular Dynamics: 16972 simulated protein-ligand structures, 10 ns each
- AI: PyTorch dataloaders, 2 baseline models for MD and QM
## Vision:
We are a drug discovery community project :hugs:
- highest possible accuracy for ligand molecules
- represent the systems' dynamics on reasonable timescales
- innovative AI models for drug discovery predictions
- let's build useful and fun spaces for everyone

Let's crack the **100+ ns** MD, **30000+ protein-ligand structures** and a whole new world of **AI models for drug discovery** together.
[Check out the paper!](https://www.biorxiv.org/content/10.1101/2023.05.24.542082v2)
![MISATO logo](logo.jpg?raw=true "MISATO")
## Community
Want to get hands-on with AI for drug discovery?
[Join our Discord server!](https://discord.gg/tGaut92VYB)
## Introduction
You can freely download the **MISATO-dataset** from [Zenodo](https://zenodo.org/record/7711953):
- MD (133 GiB)
- QM (0.3 GiB)
- electronic densities (6 GiB)
- MD restart and topology files (55 GiB)
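
Since the full download is large, it may help to check free disk space first. The following is a minimal sketch that only adds up the sizes listed above; the names `SIZES_GIB`, `total_size_gib`, and `has_enough_space` are hypothetical helpers for illustration and are not part of the MISATO code base.

```python
import shutil

# Component sizes in GiB, copied from the list above.
SIZES_GIB = {
    "MD": 133.0,
    "QM": 0.3,
    "electronic densities": 6.0,
    "MD restart and topology files": 55.0,
}


def total_size_gib(components=None):
    """Sum the listed sizes for the chosen components (all by default)."""
    names = SIZES_GIB.keys() if components is None else components
    return sum(SIZES_GIB[name] for name in names)


def has_enough_space(path=".", components=None):
    """Return True if the filesystem holding `path` has room for the download."""
    free_gib = shutil.disk_usage(path).free / 2**30  # bytes -> GiB
    return free_gib >= total_size_gib(components)


if __name__ == "__main__":
    print(f"Full dataset: {total_size_gib():.1f} GiB")
    print(f"MD + QM only: {total_size_gib(['MD', 'QM']):.1f} GiB")
    print("Enough free space here:", has_enough_space("."))
```

All four components together come to roughly 194.3 GiB, while MD and QM alone need about 133.3 GiB, so downloading only the parts you need can save a fair amount of space.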