Pretraining 33B 8K

#1
by windprak - opened

Could you provide any info on how the pretraining on 8K context was done? For example, a link to the repository that was used to pretrain the additional 100 steps at 8K context?
