Best prose in a model I've ever seen.

by Dsol58 - opened Oct 19, 2024

Oct 19, 2024

I have been following LLMs since the Llama 2 days, and it's amazing to see how far they've come. I've always been a big fan of your models, and I prefer to use larger models around 70B and 123B. I have to say that this version has demonstrated some of the best prose at and above its weight class, even at Min P 0.1. The only drawbacks of this model are its size and metaknowledge. One perk of Llama models has always been their ability to draw from random outside sources even without character prompting, which would be great if it weren't for the GPT-isms. If it were ever possible to replicate this with 22B or 123B, I'd be stoked to see it.
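For anyone unfamiliar with the Min P setting mentioned above, here is a minimal sketch of what the sampler does, assuming the usual definition: keep only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalize. The function name `min_p_filter` and the toy logits are made up for illustration and are not taken from any particular backend.

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Return renormalized probabilities after Min P filtering (illustrative sketch)."""
    # Softmax with the max subtracted for numerical stability.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Tokens below min_p * (top token probability) are dropped.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving tokens so they sum to 1.
    norm = sum(kept)
    return [p / norm for p in kept]

# Toy example: the third token is far less likely than the top one and gets cut.
filtered = min_p_filter([2.0, 1.0, -3.0], min_p=0.1)
```

Because the cutoff scales with the top token's probability, the filter is strict when the model is confident and permissive when the distribution is flat.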

TheDrummer

Owner Oct 19, 2024

I have UnslopSmall v1 on my page, but I'm not sure if it's as good. Nemo and Small have different archs.

Dsol58

Oct 19, 2024

Haven't put my finger on it, but yes, UnslopSmall v1 leaves much to be desired. If I get the chance, I could try to compare them.

SerialKicked

Oct 19, 2024

edited Oct 19, 2024

I have UnslopSmall v1 on my page, but I'm not sure if it's as good. Nemo and Small have different archs.

I tried the official Cydonia versions, the 2k, the 2l, and UnslopSmall. To me, Cydonia 2k felt the best; subsequent versions are very noticeably weaker (all at Q6 precision, for reference).
Edit: to be fair, I haven't tested 2m/unslop as thoroughly as the others yet, so at this point it's more a feeling than a fact.

I also feel like not turning the Metharme keywords into tokens is really hurting the model. If you look at it, the Metharme format is very similar in structure to Mistral's (with a system token on top).
[inst]user input[/inst]model output (/s token)
<|user|>user input<|model|>model output (/s token)
Not leveraging that fact during fine-tuning seems like such a waste. If I know anything about Mistral's various architectures, it's that they are in love with very consistent formatting.
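To make the structural similarity concrete, here is a rough sketch that builds one turn in each format and shows that swapping the markers turns one into the other. It follows the simplified shapes written above, with Mistral's markers in their usual uppercase `[INST]` form and `</s>` as the end-of-sequence token. Exact whitespace and BOS/EOS handling differ between Mistral tokenizer versions, so treat this as an illustration only. All function names here are hypothetical.

```python
def mistral_turn(user: str, model: str) -> str:
    """One user/model exchange in a simplified Mistral-style layout."""
    return f"[INST]{user}[/INST]{model}</s>"

def metharme_turn(user: str, model: str, system: str = "") -> str:
    """One user/model exchange in a simplified Metharme-style layout."""
    # Metharme adds an optional system segment in front of the exchange.
    prefix = f"<|system|>{system}" if system else ""
    return f"{prefix}<|user|>{user}<|model|>{model}</s>"

def to_mistral_markers(metharme_text: str) -> str:
    """Swap Metharme role markers for Mistral ones (no system segment handling)."""
    return metharme_text.replace("<|user|>", "[INST]").replace("<|model|>", "[/INST]")

# Without a system segment, the two layouts differ only in their marker strings.
same_shape = to_mistral_markers(metharme_turn("hi", "hello")) == mistral_turn("hi", "hello")
```

Whether each marker ends up as a single token or several pieces depends on the tokenizer's vocabulary, which is the point about turning the Metharme keywords into tokens.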

Joseph717171

Nov 12, 2024

edited Nov 12, 2024

Your SMOL model does "the things" extremely well. I don't know what dark magic you have been using, but you have a way with words and training AI, which makes them just somehow soooo much more "helpful"... 😳😏

Joseph717171

Nov 12, 2024

@TheDrummer Any chance you guys can work your magic on SuperNova-Lite? 😋
