Questions regarding Stack v2 and StarCoder v2

#111
by aditya2211 - opened

Hi BigCoders,

I have a few questions about Stack v2 and StarCoder v2:
(a) When can we expect the remaining Stack v2 data (documentation etc.) to be released?
(b) For StarCoder v2 pretraining, what policy was used for packing and chunking? Were documents chunked into multiple segments during pretraining? If so, was any overlap maintained between consecutive chunks?
