One to watch out for

#1
by Nitral-AI - opened
The Chaotic Neutrals org
β€’
edited Apr 23

Apparently, this one is doing pretty well over at chai verse: In a very similar manner and score to nyanade-stunna-maid.
@Lewdiculous

Alright!

So far I'm struggling with L3s initial issues with inconsistent response formatting (since I'm very OCD about that) and just not playing nice with the whole...

Stat 1: Stat...
Stat 2: Stat...
Stat 3: Stat...

...thing in characters that have it. It's very bad with that specially in the L3s.

But I'm coping with each new model. Won't lose hope, when they work they work super well.

I'll check it out next time.

I also think I'm having issues with tokenizer, using ChatML, v0.4 quant'ed with default settings was not stopping as expected, well, se stopped, but it continues to try to generate more and more without writing anything, as if the "stop" queue was never interpreted properly.

The Chaotic Neutrals org

Alright!

So far I'm struggling with L3s initial issues with inconsistent response formatting (since I'm very OCD about that) and just not playing nice with the whole...

Stat 1: Stat...
Stat 2: Stat...
Stat 3: Stat...

...thing in characters that have it. It's very bad with that specially in the L3s.

But I'm coping with each new model. Won't lose hope, when they work they work super well.

I'll check it out next time.

I also think I'm having issues with tokenizer, using ChatML, v0.4 quant'ed with default settings was not stopping as expected, well, se stopped, but it continues to try to generate more and more without writing anything, as if the "stop" queue was never interpreted properly.

None of my models use chatml for now. Just l3 template fam.

I was hallucinating because of Dolphin. Oops. Still, formatting OCD continues with either. Rip me.

Ah yeah, something might come out of this:

https://huggingface.co/mattshumer/Llama-3-8B-16K - base.

The Chaotic Neutrals org
β€’
edited Apr 23

I was hallucinating because of Dolphin. Oops. Still, formatting OCD continues with either. Rip me.

Ah yeah, something might come out of this:

https://huggingface.co/mattshumer/Llama-3-8B-16K - base.

Base and instruct needled to 32k 95% of the time, seems kind of unnecessary.

this was our worst needle result:

image.png

Sign up or log in to comment