Update README.md
README.md
@@ -24,4 +24,4 @@ We finetuned the `wte` and `wpe` layers of GPT-2 (while freezing the parameters
 - evaluation_strategy: "steps"
 - max_eval_samples: 5000
 ```
-**Training details**: total training steps: 688000, effective train batch size per step:
+**Training details**: total training steps: 688000, effective train batch size per step: 32, max tokens per batch: 1024)
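For reference, this is a minimal sketch of the embedding-only fine-tuning the README describes, assuming the Hugging Face `transformers` GPT-2 implementation (the commit contains no training code, so the parameter names and the trainability check below are assumptions, not the repository's actual script):

```python
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Freeze everything except the token embeddings (`wte`) and position
# embeddings (`wpe`); all other GPT-2 parameters stop receiving gradients.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(("transformer.wte", "transformer.wpe"))

# Sanity check: count how many parameters remain trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```

Taken at face value, the stated schedule works out to at most 688,000 steps × 32 sequences × 1024 tokens ≈ 22.5 billion training tokens; this is an upper bound, since sequences shorter than the 1024-token maximum contribute fewer tokens per batch.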