Great! Others with 3B/7B/1.5B planned?

by inputout - opened 1 day ago

Thank you, a draft collection is great! I have been missing exactly that so far, especially for these large models.
In my first test it didn't give a speed advantage (this always depends somewhat on the system), so it would be interesting to have other sizes such as 3B and 7B.
In my observation with other models (72B, 32B), the best draft model sizes were the ones that combined good prediction quality (not too small, like 0.5B) with being small enough not to take too many resources (unlike >=14B). Mostly these were 3B and 7B; 1.5B was still partly OK.
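
A rough way to reason about this trade-off is the standard expected-speedup estimate for speculative decoding. The sketch below is only a back-of-the-envelope model: it assumes each drafted token is accepted independently with the same probability `alpha`, and that one draft forward pass costs a fixed fraction `cost_ratio` of one target forward pass. The function name and the example numbers are hypothetical, not measurements from this thread.

```python
def expected_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Estimate the wall-time speedup of speculative decoding.

    alpha:      assumed per-token acceptance probability of the draft model
    gamma:      number of tokens drafted before each verification step
    cost_ratio: assumed (draft forward time) / (target forward time)
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    # Expected tokens produced per verification step (accepted drafts + 1).
    expected_tokens = (1 - alpha ** (gamma + 1)) / (1 - alpha)
    # Cost of one step in units of target forward passes:
    # gamma draft passes plus one target pass.
    cost_per_step = gamma * cost_ratio + 1
    return expected_tokens / cost_per_step


if __name__ == "__main__":
    # Hypothetical (alpha, cost_ratio) pairs for illustration only.
    candidates = {
        "small draft": (0.35, 0.02),
        "medium draft": (0.70, 0.10),
        "large draft": (0.85, 0.40),
    }
    for name, (alpha, cost_ratio) in candidates.items():
        print(name, round(expected_speedup(alpha, gamma=4, cost_ratio=cost_ratio), 2))
```

A value below 1.0 means the draft model is estimated to slow generation down, which is consistent with a draft that is either too inaccurate or too expensive relative to the target.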

inputout changed discussion title from Great! Others with 3B/7B planned? to Great! Others with 3B/7B/1.5B planned? 1 day ago

jukofyork

Owner about 10 hours ago

edited about 10 hours ago

Yeah, I'm fine-tuning the 0.5B first and will then test larger sizes. The 0.5B started off with only around 33% top-1 accuracy, so it will be interesting to see where it ends up after fine-tuning on around 500M-1B tokens, as a point of comparison.
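
The thread does not say how the top-1 accuracy figure was measured. One plausible definition, sketched here as an assumption, is the fraction of positions where the draft model's most likely next token matches the token the target model actually produced. The function name and inputs are hypothetical; token IDs would come from whatever tokenizer both models share.

```python
from typing import Sequence


def top1_accuracy(draft_top1: Sequence[int], target_tokens: Sequence[int]) -> float:
    """Fraction of positions where the draft's argmax token equals the target's token.

    draft_top1:    draft model's most likely token ID at each position
    target_tokens: token ID chosen by the target model at the same positions
    """
    if len(draft_top1) != len(target_tokens):
        raise ValueError("sequences must have the same length")
    if len(target_tokens) == 0:
        return 0.0
    matches = sum(d == t for d, t in zip(draft_top1, target_tokens))
    return matches / len(target_tokens)
```

Computing this on the same held-out text before and after fine-tuning would give a like-for-like comparison between draft sizes.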
