yongzx committed on
Commit 74a8c6d
1 Parent(s): 046f613

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -28,4 +28,5 @@ We finetuned the `wte` and `wpe` layers of GPT-2 (while freezing the parameters
 - max_eval_samples: 5000
 ```
 
-Setup: 8 RTX-3090 GPUs, trained for seven days (total training steps: 110500, effective train batch size: 64, tokens per batch: 1024)
+Setup: 8 RTX-3090 GPUs, trained for seven days (total training steps: 110500, effective train batch size: 64, tokens per batch: 1024)
+Final checkpoint: checkpoint-111500
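The hunk header mentions finetuning only the `wte` and `wpe` layers of GPT-2 while freezing the rest. A minimal sketch of that freezing pattern, using a tiny stand-in module (the class, its dimensions, and the layer names here are illustrative assumptions, mirroring Hugging Face GPT-2's `wte`/`wpe` naming, not the actual training code):

```python
import torch
from torch import nn

# Tiny stand-in for the GPT-2 setup described in the diff header: only the
# token embedding (wte) and position embedding (wpe) stay trainable.
# Dimensions are illustrative, not the real model's.
class TinyGPT2Like(nn.Module):
    def __init__(self, vocab=100, ctx=32, dim=16):
        super().__init__()
        self.wte = nn.Embedding(vocab, dim)   # token embeddings
        self.wpe = nn.Embedding(ctx, dim)     # position embeddings
        self.block = nn.Linear(dim, dim)      # stand-in for transformer blocks

model = TinyGPT2Like()
for name, param in model.named_parameters():
    # Freeze everything except the two embedding tables.
    param.requires_grad = name.startswith(("wte", "wpe"))

trainable = sorted(n for n, p in model.named_parameters() if p.requires_grad)
print(trainable)  # ['wpe.weight', 'wte.weight']
```

With real GPT-2 the same loop would key on the full parameter names (e.g. `transformer.wte.weight`), leaving the optimizer to update only the embeddings.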
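The Setup line implies an overall training token budget; a quick back-of-the-envelope check using only the numbers stated in the diff (110500 steps, effective batch size 64, 1024 tokens per sequence):

```python
# Token budget for the run described above: 110500 optimizer steps,
# effective batch of 64 sequences, 1024 tokens per sequence.
steps = 110_500
batch_size = 64
seq_len = 1024

tokens_per_step = batch_size * seq_len   # 65,536 tokens per optimizer step
total_tokens = steps * tokens_per_step   # 7,241,728,000 (~7.24B tokens)
print(tokens_per_step, total_tokens)
```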