Pretraining 33B 8K
#1, opened by windprak
Could you provide any info on how the pretraining on 8K context was done? For example, a link to the repository that was used to pretrain the additional 100 steps at 8K context?