Update README.md
README.md
CHANGED
@@ -95,15 +95,15 @@ print(model(inp)) # tensor([[19.1666]], grad_fn=<MulBackward0>)
 ### Training Procedure
 
 <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-We used Open Reaction Database (ORD) dataset for model training.
-The command used for training is the following. For more information, please refer to the paper and GitHub repository.
+We used the [Open Reaction Database (ORD) dataset](https://drive.google.com/file/d/1fa2MyLdN1vcA7Rysk8kLQENE92YejS9B/view?usp=drive_link) for model training. In addition, we used the test split of the palladium-catalyzed Buchwald-Hartwig [C-N cross-coupling reactions dataset](https://yzhang.hpc.nyu.edu/T5Chem/index.html) to prevent data leakage.
+The following command was used for training. For more information about data preprocessing and training, please refer to the paper and the GitHub repository.
 
 ```python
 python train.py \
-  --train_data_path='
-  --valid_data_path='
-  --test_data_path='
-  --CN_test_data_path='
+  --train_data_path='../data/preprocessed_ord_train.csv' \
+  --valid_data_path='../data/preprocessed_ord_valid.csv' \
+  --test_data_path='../data/preprocessed_ord_test.csv' \
+  --CN_test_data_path='../data/C_N_yield/MFF_Test1/test.csv' \
   --epochs=100 \
   --batch_size=32 \
   --output_dir='./'
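
The updated README mentions data leakage but does not show how to check for it. One possible check is to see whether any entry in the C-N test CSV also appears in the training CSV. The sketch below is not the authors' preprocessing. The README does not describe the CSV schema, so the column name `REACTION` and the in-memory sample rows are hypothetical placeholders.

```python
import csv
import io


def column_values(handle, column):
    """Collect the distinct values of one column from an open CSV file."""
    return {row[column] for row in csv.DictReader(handle)}


def find_overlap(train_handle, test_handle, column):
    """Return the values of `column` that occur in both CSV files."""
    return column_values(train_handle, column) & column_values(test_handle, column)


# Hypothetical stand-ins for the train and C-N test CSV files;
# the real files and their column names may differ.
train_csv = io.StringIO("REACTION,YIELD\nA.B>>C,0.9\nD.E>>F,0.5\n")
cn_test_csv = io.StringIO("REACTION,YIELD\nD.E>>F,0.5\nG.H>>I,0.1\n")

leaked = find_overlap(train_csv, cn_test_csv, column="REACTION")
print(sorted(leaked))
```

With real data, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. An empty result indicates no exact-match overlap on the chosen column; it does not rule out near-duplicate reactions.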