On some prompts, medium is worse than mini & small?

by urtuuuu - opened May 23, 2024

May 23, 2024

•

edited May 23, 2024

"Today I own 3 cars but last year I sold 2 cars. How many cars do I own today?"
How is it possible that the 'medium' version often fails at this question, while even the 'mini' version gets it right (and 'small' too)?
It almost always gives the wrong answer, 1, while the other two say 3.
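Since the failure is described as happening "often" rather than always, one way to put a number on it is to sample the same prompt several times and tally the answers. The snippet below is only a rough sketch: `generate` is a hypothetical callable standing in for whatever backend is used (hosted endpoint or local GGUF), and taking the last integer in the reply as "the answer" is a crude heuristic, not a real parser.

```python
import re
from collections import Counter
from typing import Callable, Optional

PROMPT = ("Today I own 3 cars but last year I sold 2 cars. "
          "How many cars do I own today?")

def extract_answer(reply: str) -> Optional[str]:
    """Return the last integer mentioned in the reply, or None if there is none.

    Assumption: the model states its final answer last. This is a heuristic.
    """
    numbers = re.findall(r"\d+", reply)
    return numbers[-1] if numbers else None

def tally_answers(generate: Callable[[str], str], runs: int = 20) -> Counter:
    """Call the (hypothetical) generate function `runs` times and count answers."""
    counts: Counter = Counter()
    for _ in range(runs):
        counts[extract_answer(generate(PROMPT))] += 1
    return counts
```

Running this once per model with the same sampling settings would give something like "3: 17, 1: 3" to compare, instead of an impression from a handful of tries.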

bartowski

Owner May 24, 2024

are you using quants for the others as well? does small have quant support?

That's strange either way

urtuuuu

May 24, 2024

are you using quants for the others as well? does small have quant support?

https://ai.azure.com/explore/models?selectedCollection=phi
Here are all the models; you can test each one on the right, under "Try it out". I also tested Q4_K_M GGUFs for mini and medium locally, and I get the same results.

urtuuuu changed discussion status to closed May 27, 2024

bartowski

Owner May 27, 2024

If it's full weights for all of them and they're still different outputs that's super strange!

Didn't mean to ignore this, just got lost lol

urtuuuu

May 27, 2024

•

edited May 27, 2024

Didn't mean to ignore this, just got lost lol

No, I'm just not sure myself anymore. At first I was sure about what's in my first message, but now it seems to answer the question correctly... most of the time.
And btw, sorry for another question, but I just can't figure out why phi models, like this one, only generate text up to around 2500 of the 4096 context tokens and then either stop or generate nonsense (in instruct mode). I think koboldcpp says something like "EOS token triggered!". Same in LM Studio or oobabooga.

urtuuuu changed discussion status to open May 27, 2024

bartowski

Owner May 28, 2024

That does seem curious. If you have a prompt that triggers it reliably, let me know, but I'll try to see if I can reproduce it too. If it's happening on multiple platforms, that does seem odd.

I assume this doesn't apply to any hosted full weight versions?

urtuuuu

May 28, 2024

•

edited May 28, 2024

No special prompt. Just tell it to write some stories or something so it reaches ~2500 context length.
I've been experiencing this since the first day phi-3 came out, and I have no idea why. It seems like I'm the only one, because nobody talks about it. Only phi models do this.

ishanparihar

Jul 15, 2024

@urtuuuu It is happening to me too. Messed-up garbage after 2500 tokens. Using Q5_K_M. Going to try changing quants.
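To make reports like this easier to compare, it can help to record why a generation ended and how many tokens were in use at that point. The helper below is a minimal sketch that assumes the backend returns an OpenAI-style completion dict with `choices[0]["finish_reason"]` and a `usage` block; those field names are an assumption here and should be checked against whatever tool is actually in use. `describe_stop` and the `n_ctx` default of 4096 are illustrative only.

```python
def describe_stop(result: dict, n_ctx: int = 4096) -> str:
    """Summarize why a completion ended and how much context was used.

    Assumes an OpenAI-style result dict (field names are an assumption):
      result["choices"][0]["finish_reason"] is "stop" or "length"
      result["usage"] has "prompt_tokens" and "completion_tokens"
    """
    reason = result["choices"][0].get("finish_reason")
    usage = result["usage"]
    used = usage["prompt_tokens"] + usage["completion_tokens"]
    if reason == "length":
        return f"hit the token limit at {used}/{n_ctx} tokens"
    if reason == "stop":
        return f"model stopped (EOS or stop string) at {used}/{n_ctx} tokens"
    return f"unknown finish_reason {reason!r} at {used}/{n_ctx} tokens"
```

If the summary consistently shows a "stop" near 2500 tokens on a 4096 context, that points at an early EOS rather than the context filling up, which is the distinction the "EOS token triggered!" message hints at.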
