
Hey, got an interesting problem here.

#6
by MateoTeo - opened

First of all - amazing model! It generates more interesting stories than the 70b+ models that I have tried. But yeah, I love to use v0.2 more as this one is a bit dry :)
So, the problem: generation freaks out after about 4k tokens, with the max context set to 8k. It starts as intended, only to break soon after, sometimes mid-word or right after a period with no space following it, and then generates a new scene. The same characters and story, but a totally random scene with them.

A little example: John and Sara walked on the streets, admiring this cozy night.Sara surfs on the giant wave under the bright sun while John films it with a drone from above... (blah-blah-blah, continues with this new generation.)
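For anyone trying to reproduce this, a break like the one in the example can be located mechanically. Below is a minimal sketch, assuming the break shows up as a period immediately followed by an uppercase letter with no space in between; it will not catch the mid-word breaks. The names `SEAM` and `find_seams` are hypothetical and not part of any tool mentioned in this thread.

```python
import re

# Assumption: a "seam" is a lowercase letter, then a period, then an uppercase
# letter with no space in between, as in "night.Sara" from the example above.
SEAM = re.compile(r"(?<=[a-z])\.(?=[A-Z])")


def find_seams(text: str) -> list[int]:
    """Return the character offsets of periods directly followed by an uppercase letter."""
    return [m.start() for m in SEAM.finditer(text)]


sample = (
    "John and Sara walked on the streets, admiring this cozy night."
    "Sara surfs on the giant wave under the bright sun."
)

# Print each seam offset with a little surrounding text for context.
for pos in find_seams(sample):
    print(pos, repr(sample[max(0, pos - 20):pos + 20]))
```

Running this over a saved generation and comparing the offsets against the token count at that point would show whether the breaks cluster around the 4k mark.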

I tried different settings, quants (from gguf q2_k to q5_k_m, plus quants made by different people), different software to run it (only LM Studio, only Koboldcpp, Silly Tavern as a front end), and loading it all into RAM, but the problem is still here and consistent. v0.4, v0.3, v0.2, that Smaug-34b fine-tune... maybe this is a yi-34b problem in general, but I didn't find any such info online. Also, I didn't touch the rope settings, and all other models work fine for me.

Well... something like that. Sorry if this problem is well-known.

I'm re-tuning this model as v0.5 right now, should be available in the next few days. The new tune should be much less dry in theory because I re-added the entire subset of cinematika data to it (one of the main differences between v0.2 and v0.4 was a big reduction in this dataset). For long context, I'm also using the updated base yi-34b-200k model which 01-ai claims has much better long-context functionality. We shall see.

In the meantime, I would suggest trying out https://huggingface.co/jondurbin/airoboros-34b-3.2

I loved v0.2, thanks for working on a closer yet better version for 0.5, looking forward to it. The cinematika dataset is amazing, by the way, I can see it being the secret sauce.
