Update README.md (#3)

- Update README.md (bbb4e2cb34572a3a182efc2468bf518e6b070c25)

Co-authored-by: Heloise Chomet <heloise-chomet@users.noreply.huggingface.co>

Files changed (1)

README.md +64 -1

@@ -1,3 +1,66 @@
 ## License summary
 1. The Licensed Models are **only** available under this License for Non-Commercial Purposes.
@@ -6,4 +69,4 @@
     1. any Commercial Purposes, unless agreed by Us under a separate licence;
     2. to train, improve or otherwise influence the functionality or performance of any other third-party derivative model that is commercial or intended for a Commercial Purpose and is similar to the Licensed Models;
     3. to create models distilled or derived from the Outputs of the Licensed Models, unless such models are for Non-Commercial Purposes and open-sourced under the same license as the Licensed Models; or
-    4. in violation of any applicable laws and regulations.

+# NequIP
+## Reference
+Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa,
+Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris Kozinsky.
+E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials.
+Nature Communications, 13(1), May 2022. ISSN: 2041-1723. URL: https://dx.doi.org/10.1038/s41467-022-29939-5.
+## How to Use
+For complete usage instructions, please refer to our [documentation](https://instadeep.github.io/mlip).
+## Model architecture
+| Parameter                 | Value                                         | Description                                 |
+|---------------------------|-----------------------------------------------|---------------------------------------------|
+| `num_layers`              | `5`                                           | Number of NequIP layers.                    |
+| `node_irreps`             | `64x0e + 64x0o + 32x1e + 32x1o + 4x2e + 4x2o` | O(3) representation space of node features. |
+| `l_max`                   | `2`                                           | Maximal degree of spherical harmonics.      |
+| `num_bessel`              | `8`                                           | Number of Bessel basis functions.           |
+| `radial_net_nonlinearity` | `swish`                                       | Activation function for radial MLP.         |
+| `radial_net_n_hidden`     | `64`                                          | Number of hidden features in radial MLP.    |
+| `radial_net_n_layers`     | `2`                                           | Number of layers in radial MLP.             |
+| `radial_envelope`         | `polynomial_envelope`                         | Radial envelope function.                   |
+| `scalar_mlp_std`          | `4`                                           | Standard deviation of weight initialisation.|
+| `atomic_energies`         | `None`                                        | Treatment of the atomic energies.           |
+| `avg_num_neighbors`       | `None`                                        | Mean number of neighbors.                   |
+For more information about NequIP hyperparameters,
+please refer to our [documentation](https://instadeep.github.io/mlip/api_reference/models/nequip.html#mlip.models.nequip.config.NequipConfig).
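As a rough illustration of what the `node_irreps` string in the table above implies, the sketch below computes the total node feature dimension, using the fact that an irrep of degree `l` has dimension `2l + 1` regardless of parity. The helper `irreps_dim` is hypothetical and written in plain Python for illustration; it is not part of the MLIP library.

```python
import re


def irreps_dim(irreps: str) -> int:
    """Total feature dimension of an irreps string such as '64x0e + 32x1o'.

    Hypothetical helper for illustration only (not an MLIP API).
    Each term 'MxLp' contributes M * (2L + 1) features; the parity p
    ('e' or 'o') does not affect the dimension.
    """
    total = 0
    for term in irreps.split("+"):
        match = re.fullmatch(r"\s*(\d+)x(\d+)([eo])\s*", term)
        if match is None:
            raise ValueError(f"Cannot parse irrep term: {term!r}")
        multiplicity = int(match.group(1))
        degree = int(match.group(2))
        total += multiplicity * (2 * degree + 1)
    return total


node_irreps = "64x0e + 64x0o + 32x1e + 32x1o + 4x2e + 4x2o"
print(irreps_dim(node_irreps))  # 360
```

With the values in the table, the scalars contribute 128 features, the `l = 1` irreps 192, and the `l = 2` irreps 40.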
+## Training
+Training is performed over 220 epochs, with an exponential moving average (EMA) decay rate of 0.99.
+The model employs a Huber loss function with scheduled weights for the energy and force components.
+Initially, the energy term is weighted at 40 and the force term at 1000.
+At epoch 115, these weights are swapped, so that the energy term is weighted at 1000 and the force term at 40.
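The loss-weight schedule described above can be sketched as a simple step function of the epoch. This is a minimal illustration, not the MLIP implementation: the function names, the zero-based epoch convention, the abrupt switch exactly at epoch 115, and the Huber threshold `delta = 1.0` are all assumptions made here.

```python
def loss_weights(epoch: int, switch_epoch: int = 115) -> tuple[float, float]:
    """Return (energy_weight, force_weight) for a given epoch.

    Assumption: the weights switch abruptly at `switch_epoch`.
    """
    if epoch < switch_epoch:
        return 40.0, 1000.0
    return 1000.0, 40.0


def huber(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic near zero, linear in the tails.

    Assumption: delta = 1.0; the README does not state the value used.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def weighted_loss(energy_residual: float, force_residual: float, epoch: int) -> float:
    """Combine the energy and force Huber terms with the scheduled weights."""
    energy_weight, force_weight = loss_weights(epoch)
    return energy_weight * huber(energy_residual) + force_weight * huber(force_residual)


print(loss_weights(0))    # (40.0, 1000.0)
print(loss_weights(115))  # (1000.0, 40.0)
```

Under this reading, early training emphasises forces and later training emphasises energies.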
+We use the default optimizer of MLIP v1.0.0 with the following settings:
+| Parameter                        | Value          | Description                                                     |
+|----------------------------------|----------------|-----------------------------------------------------------------|
+| `init_learning_rate`             | `0.002`        | Initial learning rate.                                          |
+| `peak_learning_rate`             | `0.002`        | Peak learning rate.                                             |
+| `final_learning_rate`            | `0.002`        | Final learning rate.                                            |
+| `weight_decay`                   | `0`            | Weight decay.                                                   |
+| `warmup_steps`                   | `4000`         | Number of optimizer warm-up steps.                              |
+| `transition_steps`               | `360000`       | Number of optimizer transition steps.                           |
+| `grad_norm`                      | `500`          | Gradient norm used for gradient clipping.                       |
+| `num_gradient_accumulation_steps`| `1`            | Steps to accumulate before taking an optimizer step.            |
+For more information about the optimizer,
+please refer to our [documentation](https://instadeep.github.io/mlip/api_reference/training/optimizer.html#mlip.training.optimizer_config.OptimizerConfig).
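For readers unfamiliar with the exponential moving average mentioned at the start of this section, the update with a decay rate of 0.99 can be sketched as below. This is the textbook EMA rule on a flat list of parameters; how often MLIP applies the update and whether it adds a bias correction are not stated in this README, so treat `ema_update` as a hypothetical helper.

```python
def ema_update(ema_params: list[float], params: list[float], decay: float = 0.99) -> list[float]:
    """One EMA step: ema <- decay * ema + (1 - decay) * params.

    Hypothetical helper for illustration; no bias correction is applied.
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


ema = [1.0, 0.0]
current = [2.0, 1.0]
ema = ema_update(ema, current)
print(ema)  # approximately [1.01, 0.01]
```

With a decay of 0.99, each update moves the averaged parameters 1% of the way towards the current parameters.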
+## Dataset
+| Parameter                   | Value | Description                                |
+|-----------------------------|-------|--------------------------------------------|
+| `graph_cutoff_angstrom`     | `5`   | Graph cutoff distance (in Å).              |
+| `max_n_node`                | `32`  | Maximum number of nodes allowed in a batch.|
+| `max_n_edge`                | `288` | Maximum number of edges allowed in a batch.|
+| `batch_size`                | `16`  | Number of graphs in a batch.               |
+This model was trained on the [SPICE2_curated dataset](https://huggingface.co/datasets/InstaDeepAI/SPICE2-curated).
+For more information about dataset configuration,
+please refer to our [documentation](https://instadeep.github.io/mlip/api_reference/data/dataset_configs.html#mlip.data.configs.GraphDatasetBuilderConfig).
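To make the role of `graph_cutoff_angstrom` concrete, the sketch below builds the edge list of a small, non-periodic system by connecting every pair of atoms within 5 Å. It is a brute-force illustration in plain Python; `build_edges` is a hypothetical helper, and MLIP's own graph construction (including any handling of periodic boundaries) may differ.

```python
import math


def build_edges(positions: list[tuple[float, float, float]], cutoff: float = 5.0) -> list[tuple[int, int]]:
    """Return directed edges (i, j) for all atom pairs within `cutoff` (in Å).

    Hypothetical O(N^2) helper for illustration; assumes no periodic
    boundary conditions and treats the cutoff as inclusive.
    """
    edges = []
    for i, p_i in enumerate(positions):
        for j, p_j in enumerate(positions):
            if i != j and math.dist(p_i, p_j) <= cutoff:
                edges.append((i, j))
    return edges


# Three atoms: the first two are 1.5 Å apart, the third is more than 5 Å from both.
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 6.0, 0.0)]
print(build_edges(positions))  # [(0, 1), (1, 0)]
```

Each undirected neighbor pair appears as two directed edges, which is the usual convention for message-passing models.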
 ## License summary
 1. The Licensed Models are **only** available under this License for Non-Commercial Purposes.
     1. any Commercial Purposes, unless agreed by Us under a separate licence;
     2. to train, improve or otherwise influence the functionality or performance of any other third-party derivative model that is commercial or intended for a Commercial Purpose and is similar to the Licensed Models;
     3. to create models distilled or derived from the Outputs of the Licensed Models, unless such models are for Non-Commercial Purposes and open-sourced under the same license as the Licensed Models; or
+    4. in violation of any applicable laws and regulations.