Any news about v0.3?

#7
by xxx777xxxASD - opened

.

Owner
•
edited Apr 26

@xxx777xxxASD
Thank you for the interest!
V0.3 is currently in the works! If you want me to drop a third experiment a bit early, I can. It's still not where I want it, but it is significantly better on perplexity. Locally I have a version of this model that gets a perplexity of 25 in 4-bit vs. 75 with v0.2. (That was on a smaller dataset for a pre-training test run.) I'm still making the dataset significantly larger and training for a lot longer to get the results I want.

@Vezora are you still working on v0.3? If not, could you drop a recent checkpoint of it so that others could evaluate and work on it?
