What is the difference between this model and the regular distilled 14 billion parameter model?

#1
by kyars - opened

Am I correct in saying that this model is simply more efficient to fine-tune than the standard distilled version?
