---
license: mit
---

Data: a mix of c4 and codeparrot, about 1:1 by sample count but 1:4 by token count. The mix is heavily biased toward code (Python, Go, Java, JavaScript, C, C++).
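
As a rough illustration of what those two ratios imply, the sketch below reads both as c4 : codeparrot (an assumption based on the order in which the datasets are named) and derives the share of tokens that come from code and the implied relative sample length. The variable names are hypothetical and not taken from any training script.

```python
from fractions import Fraction

# Ratios as stated in the card, read as c4 : codeparrot (assumed ordering).
c4_samples, code_samples = 1, 1
c4_tokens, code_tokens = 1, 4

# Fraction of all training tokens that come from codeparrot.
code_token_share = Fraction(code_tokens, c4_tokens + code_tokens)

# Average codeparrot sample length relative to an average c4 sample.
relative_sample_length = (
    Fraction(code_tokens, code_samples) / Fraction(c4_tokens, c4_samples)
)

print(f"code share of tokens: {float(code_token_share):.0%}")
print(f"codeparrot sample length vs c4: {float(relative_sample_length):.0f}x")
```

Under this reading, roughly four out of five training tokens are code, which is why the mix is described as biased toward code.
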
Params: 
- batch size 64 * 2048 * 8 = 1048576 tokens
- lr set automatically by the EleutherAI (EAI) sae codebase
- auxk_alpha 0.03
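
The batch size arithmetic above can be checked with a short sketch. The card does not say what each factor stands for, so the names below (sequences per device, context length, number of devices or accumulation steps) are assumptions chosen only for readability.

```python
# Factors as listed in the card; the meaning of each name is an assumption.
sequences_per_device = 64   # assumed: sequences per device per step
context_length = 2048       # assumed: tokens per sequence
num_devices_or_accum = 8    # assumed: devices or gradient accumulation steps

tokens_per_step = sequences_per_device * context_length * num_devices_or_accum

# 64 * 2048 * 8 = 2**6 * 2**11 * 2**3 = 2**20
assert tokens_per_step == 2**20

print(f"{tokens_per_step:,} tokens per step")
```

Whatever the factors represent, their product is exactly 2^20 tokens per optimizer step.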