What is this model exactly?

#1
by vdruts - opened

I did it because it was requested here: https://www.reddit.com/r/LocalLLaMA/comments/14cabi9/any_help_converting_an_interesting_bin_model_to_4/

The guy who requested it said:

It's a pair of pythia models (neox) that were trained using tens of thousands of dollars worth of compute with a massive corpus of storytelling and roleplaying interactive fiction style content. The larger of the two is basically comparable to CLIO (NovelAI's new 3b bot), and has similar qualities.

(The "larger of the two" is the one I've done, 6.9B. I didn't do the smaller one.)

I haven't verified that info for myself.

It's not an instruct model; it's a writing model. It has a lot of built-in tricks similar to what NovelAI uses, such as training with ATTG tags, like this:

[ Author: Haven Ross; Title: The Biker's Buddy; Tags: friendship, slice of life, funny; Genre: thriller ]
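As a rough sketch of assembling that ATTG header programmatically (the field names and ordering are taken from the example above; the helper name and the optionality of each field are my own assumptions, not anything documented for this model):

```python
def attg_header(author=None, title=None, tags=None, genre=None):
    """Build an ATTG-style bracket header from optional fields.

    Field names and the 'Field: value; ...' layout mirror the example
    above; whether the model cares about exact spacing is an assumption.
    """
    parts = []
    if author:
        parts.append(f"Author: {author}")
    if title:
        parts.append(f"Title: {title}")
    if tags:
        parts.append("Tags: " + ", ".join(tags))
    if genre:
        parts.append(f"Genre: {genre}")
    return "[ " + "; ".join(parts) + " ]"

print(attg_header(author="Haven Ross", title="The Biker's Buddy",
                  tags=["friendship", "slice of life", "funny"],
                  genre="thriller"))
# → [ Author: Haven Ross; Title: The Biker's Buddy; Tags: friendship, slice of life, funny; Genre: thriller ]
```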

Or try something like...

[Style: text adventure]
The Deepest Dungeon, a Text Adventure

Then set ">" as a stop token and use ">your action here" (without the quotation marks).

That will control the action and let you delve the dungeon.
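The loop above can be sketched in plain Python: build the prompt in the ">action" format, then cut the model's continuation at the ">" stop token so it doesn't write your next action for you. The generation call itself is omitted; the function names are my own and the truncation logic is the point (many inference frontends apply an equivalent stop-sequence cut for you):

```python
def build_prompt(history: str, action: str) -> str:
    """Append the player's action in the '>action' format described above."""
    return f"{history}\n>{action}\n"

def truncate_at_stop(generated: str, stop: str = ">") -> str:
    """Cut the continuation at the stop token, if present."""
    idx = generated.find(stop)
    return generated[:idx] if idx != -1 else generated

prompt = build_prompt(
    "[Style: text adventure]\nThe Deepest Dungeon, a Text Adventure",
    "open the door",
)
# Pretend 'raw' came back from the model; it starts writing the next turn.
raw = "The door creaks open onto a torchlit stairway.\n>go down"
print(truncate_at_stop(raw))
# → The door creaks open onto a torchlit stairway.
```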

It also works as an effective chat model because of its training, as long as you have good character definitions.

It's an uncensored model, specifically trained on a wide range of concepts.

Great, thank you for the details. I will put this in the READMEs.

If it's actually like Clio, its context size should be 8192 tokens.

It's not Clio, it's "like" Clio. This is based on a Pythia deduped model, trained to 4096 context as I said. It does NOT have an 8192-token context. Clio is built on an entirely new foundational model; this is built on an existing Pythia deduped model.
Different, but similar.
