Probably The Best Multishot Model Series So Far

#1
by HDiffusion - opened

It seems to understand the context far better than any other model that I've tried, even several 33B. Thanks for the conversion.


Great to hear. Are you finding that it won't stop and keeps answering itself? I found that I had to configure a stopping string in the UI, as it doesn't seem to have any stopping token implemented.


You mean, it talks too much? 😂


Hey man, can you please share the code for how you loaded the model in a Jupyter notebook? I'm still learning this.
Thank you
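
For reference, a minimal sketch of loading a converted checkpoint like this in a notebook with `transformers`. The repo id below is a placeholder (swap in the actual model path), and the fp16 / `device_map` settings assume a single GPU with `accelerate` installed:

```python
# Minimal sketch: load the model in a Jupyter notebook with transformers.
# "your-username/model-name" is a placeholder -- replace it with the actual
# repo id or local path of this conversion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/model-name"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 so a 13B model fits in less VRAM
    device_map="auto",          # requires the `accelerate` package
)

prompt = "### Human: Explain what a context window is.\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```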


Yeah, I've been using stopping strings by default since the original Vicuna, so I didn't notice.
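
For anyone running it outside the UI, a stop-on-string criterion can be wired up in plain `transformers` as well. This sketch continues from the loading snippet above (it reuses `model`, `tokenizer`, and `prompt`), and the `"### Human:"` stop string is an assumption, so use whatever marker your prompt format starts a new turn with:

```python
# Sketch: stop generation on a string, since the model doesn't emit a
# reliable stopping token. Assumes `model`, `tokenizer`, and `prompt`
# from the loading snippet above.
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnString(StoppingCriteria):
    def __init__(self, stop_string, tokenizer, prompt_length):
        self.stop_string = stop_string
        self.tokenizer = tokenizer
        self.prompt_length = prompt_length

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the freshly generated tokens and stop once the
        # stop string appears in them.
        new_text = self.tokenizer.decode(input_ids[0][self.prompt_length:])
        return self.stop_string in new_text

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
stop = StopOnString("### Human:", tokenizer, inputs["input_ids"].shape[1])  # assumed turn marker
output = model.generate(
    **inputs,
    max_new_tokens=400,
    stopping_criteria=StoppingCriteriaList([stop]),
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```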

Yes, the model is good, but it hallucinates and loses context too often at scale.
I haven't seen that happen at all with other 13B models.

What I mean is: the model starts generating text about a "Golang in-memory database" and then switches to "machine learning" as if it had been talking about ML all along.
One more thing: all the llama-based models I've tried were not really aware of Google as a search engine and preferred Bing. This one is fine with Google.

Same gripe here; it seems to have an attractor that pulls the context toward ML. I asked it to write interferometry code and it switched to MNIST halfway through. Great speed and a solid first couple thousand tokens, though.
