Great! Others with 3B/7B/1.5B planned?

#1
by inputout - opened

Thank you, a draft collection is great! I have been missing exactly that so far, especially for these large models.
In my first test it didn't give a speed advantage (that always depends a bit on the system), so it would be interesting to have other sizes such as 3B and 7B. Are any planned?
In my observations with other models (72B, 32B), the best draft model sizes were those that combined good prediction quality (not too small, like 0.5B) with being small enough not to take too many resources (not too large, like >=14B). Mostly these were 3B and 7B; 1.5B was still partly OK.
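
The trade-off described above (acceptance quality versus draft cost) can be roughly sketched with the standard speculative decoding speedup estimate. This is a simplified model, not a measurement from this thread: it assumes each drafted token is accepted independently with probability `alpha`, and the function name, `gamma`, and `cost_ratio` are illustrative names chosen here. Top-1 accuracy is related to, but not the same as, the acceptance rate, so treat any numbers plugged in as hypothetical.

```python
def expected_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Estimate speedup of speculative decoding over plain decoding.

    alpha      -- assumed probability that a drafted token is accepted (i.i.d.)
    gamma      -- number of tokens the draft model proposes per verification step
    cost_ratio -- time of one draft forward pass / time of one target forward pass
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    if gamma < 1 or cost_ratio < 0.0:
        raise ValueError("gamma must be >= 1 and cost_ratio must be >= 0")

    # Expected tokens produced per verification step (geometric series,
    # including the one token the target model always contributes).
    tokens_per_step = (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)

    # Cost of one step in units of a target forward pass:
    # gamma draft passes plus one target verification pass.
    cost_per_step = gamma * cost_ratio + 1.0

    return tokens_per_step / cost_per_step


if __name__ == "__main__":
    # Hypothetical (alpha, cost_ratio) pairs, for illustration only.
    for label, alpha, cost_ratio in [
        ("tiny draft, low acceptance", 0.33, 0.01),
        ("mid-size draft, good acceptance", 0.80, 0.10),
        ("large draft, good acceptance", 0.80, 0.30),
    ]:
        print(f"{label}: {expected_speedup(alpha, 4, cost_ratio):.2f}x")
```

Under this model a very small draft is cheap but gains little if few of its tokens are accepted, while a large draft loses its gain to its own cost, which is consistent with mid-size drafts doing best. A value below 1.0 means the draft makes generation slower.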

inputout changed discussion title from Great! Others with 3B/7B planned? to Great! Others with 3B/7B/1.5B planned?

Yeah, I'm just trying to fine-tune the 0.5B first and will then test larger sizes. The 0.5B started off with only around 33% top-1 accuracy, so it will be interesting to see where it ends up after fine-tuning on around 500M-1B tokens, as a point of comparison.
