2.2.1 is not better, this is the best version

#4
by Madd0g - opened

2.2 is a magical model, absolutely hands down the best 7B I've tried (I'm using the GGUF Q4 version through ollama).

It's amazingly steerable. I'm giving it GPT-4-level requests, just dumping my train of thought into prompts instead of proper instructions, and in the end it understands and performs very well on my demanding technical tasks.

In particular, it's MUCH better than 2.1 and 2.2.1 at just stopping (I'm doing a lot of zero-shot prompts, so there aren't many examples to anchor it). In my tests 2.2.1 gives good answers too, but then keeps on completing things it shouldn't, like content from the examples or instructions. Sometimes 2.2 prints an errant code block at the end or adds an extra sentence, but those problems are minimal compared to the disasters the exact same conditions evoke from 2.2.1.

I spent all day yesterday working on just 1-2 prompts, and I could tell, as I added more guidance and instructions, that it started understanding what I was trying to get it to do. When I gave the same challenges to the other versions, 2.2.1 achieved (almost) comparable results, but then consistently created crap at the end or repeated the instructions back to me, like it just couldn't stop at the right time. It sounds like a little thing, but the difference is very noticeable.

Also, multi-turn chat in 2.2 is absolutely from another world: it simply works for very long contexts and shifting technical challenges (not something I was specifically testing, but I was still very impressed).

I want to see 2.2 on the leaderboard; it's such a shame it's the only one missing. I really believe in it.
I don't know what's different about this version, but maybe a little overfitting is good?

Whatever you did, please do more of it :)

Cognitive Computations org

Thanks for the update.
This one was trained for 3 epochs; I think that's the only difference.
