Update 2nd_epoch/notes.txt
bf6512b verified
From 3rd_ckpt, the model was trained using 20 million tokens, not 30 million tokens.