heegyu committed on
Commit 42a6c43
1 Parent(s): 179e694

Update README.md
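
The hyperparameter list touched by this commit can be read as a small training configuration. Below is a minimal sketch in plain Python, not the author's training code. It assumes the "cosine lr decay" is the usual half-cosine from 1e-4 down to 1e-5 over the whole run, with no warmup as the README states. The names `cosine_lr`, `total_steps`, and `TRAIN_CONFIG` are hypothetical, and the README does not give the total number of steps.

```python
import math

PEAK_LR = 1e-4   # README: decay starts here
FINAL_LR = 1e-5  # README: decay ends here


def cosine_lr(step: int, total_steps: int) -> float:
    """Half-cosine decay from PEAK_LR to FINAL_LR with no warmup.

    total_steps is a hypothetical parameter; the README does not state it.
    """
    # Clamp so steps outside [0, total_steps] stay at the endpoints.
    progress = min(max(step, 0), total_steps) / total_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine


# Remaining values copied from the README list; the key names are assumptions.
TRAIN_CONFIG = {
    "optimizer": "AdamW",
    "weight_decay": 0.01,
    "b1": 0.9,
    "b2": 0.99,
    "grad_clip": 1.0,      # README does not say whether this is norm or value clipping
    "warmup_steps": 0,     # "no warmup"
    "dtype": "bfloat16",   # "BF16"
    "batch_size": 128,
    "max_seq_len": 2048,
}
```

With a hypothetical `total_steps = 1000`, `cosine_lr(0, 1000)` gives the peak rate, `cosine_lr(500, 1000)` the midpoint between the two rates, and `cosine_lr(1000, 1000)` the final rate.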

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -8,6 +8,9 @@ Hyperparameters
  - 1e-4 -> 1e-5 with cosine lr decay
  - batch size 128
  - max sequence length 2048
+ - AdamW(weigth decay=0.01, b1=0.9, b2=0.99, grad_clip=1.0)
+ - no warmup
+ - BF16
 
  ```
  # Load model directly