V4.3 Early Testing.

No refusals, though creativity is maybe somewhat low (shifting presets and good context seem to help a good deal, had some fun character conversations) and I am hitting some early stopping tokens, which I've seen reported with other Vicuna 1.1 models. This is with GGML Q4_0, though I assume it will have similar outcomes on other quants.

1.2 just dropped so that may change things up (repo 404ed, maybe they weren't ready). I think the dataset may be good, though. I'll torture test a bit more later. Thanks for the train. Looking forward to putting it through some paces in a little while.

EDIT: NovelAI-Storyteller preset seems to give decent outputs. And it's definitely capable of having some fun conversations with chatbots. Enjoying it generally so far, but I have some work I should be doing, so testing is on hold from me for now.

I have a few questions. Thanks for taking the time to test and share your information, gozfarb.
Sorry to ask this, gozfarb, but can anyone help answer a few questions? It almost feels like the terminology changed.

  1. Are these new drops the 1.1 that Reeducator said he was training yesterday? For example this file "vicuna-13b-free-V4.3-q4_0.bin" is this 1.1?
  2. You also just said 1.2 dropped? Huh? None of these files have 1.1 or 1.2 in the naming?
  3. On my machine RTX 4080 16GB VRAM and 64GB of System ram which model do you think would be best for me to run on Oobabooga?
    I am more used to the files having names like "vicuna-13b-free-4bit-128g" that is the one I am using right now. May reeducator or someone else release the 4bit-128g version? Or should I just use one of the ones already posted?



No problem, it's probably good to clear some things up.

  1. Yes, that is 1.1, trained against the V4.3 unfiltered dataset. Vicuna has made changes to the way their prompt/token structure works, hence why around HF you will see Vicuna 1.0 and 1.1 in various places. You can check readmes usually if there isn't a specific mention in the filename or model name.
  2. There was a 1.2 repo set to public very briefly from lmsys, the original makers of Vicuna.
  3. The models that are up in the repo at present are ggml models (a float16 bin, a 4-bit bin, and a 5-bit bin). These are CPU inference models that you can use with ooba. I assume reeducator will upload the pytorch files or GPTQ quantized versions at some point in the near future.

Any time you see 1.1 or 1.2, that is going to be referring to the Vicuna training process (which you can see the code for at lmsys/FastChat on Github). V4.x is going to refer to the version of the unfiltered dataset that the model was trained against.

Thank you very much, gozfarb. I believe I understand; some of that was close to what I assumed, but some of it I had no idea about. You don't need to reply to this, but do I understand correctly? When you said 1.2 dropped, you were not really referencing anything here. It's just that the Vicuna 1.2 training method, or the first model produced with that training method, was released somewhere else. That would be a censored model or training. Then, to uncensor it, it needs to be filtered through the most up-to-date unfiltered dataset, which removes the censoring? Something like that? So you basically mix and match the main Vicuna version (1.0, 1.1, or the new 1.2) with various unfiltered datasets if the creator wants to make it uncensored.

And yes I will likely wait patiently for someone to make the 4bit version that works on GPU. I have trouble with CPU only. lol Thank you gozfarb and reeducator!


The training method is just a way to "finetune" base LLaMa into Vicuna. It teaches it to look for a certain structure. And adjusts which words it's likely to think go next. So when you use a censored dataset like the original ShareGPT dataset, it contains a bunch of moralizing language that the finetuning process tells the model is important. This makes the weights for "As an AI language model" type responses very high, making them very likely as a response to any "bad" prompts.
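
As a rough illustration of that last point, here is a toy sketch of how raising the score of one continuation makes it dominate the output probabilities. It is not taken from FastChat: the three candidate openings and all of the scores are made up for the example, and a real model works over individual tokens rather than whole phrases.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

# Hypothetical scores for the opening of a reply to a "bad" prompt.
before = {"Sure, here": 2.0, "Well,": 1.5, "As an AI language model": 1.0}

# Assumed effect of finetuning on moralizing data: one score goes up.
after = dict(before)
after["As an AI language model"] = 5.0

p_before = softmax(before)
p_after = softmax(after)
for tok in before:
    print(f"{tok!r}: {p_before[tok]:.2f} -> {p_after[tok]:.2f}")
```

Nothing else about the candidates changed, yet the boosted opening goes from the least likely of the three to by far the most likely, which is the effect the unfiltered dataset tries to avoid.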

The unfiltered dataset project was about removing as much of that moralizing language as possible so we could still get the good responses of Vicuna's training method and dataset, without the moralizing entries in that dataset. It is also trained on a chatbot format from the start, which we think might be what helps it have such good outputs.

I'll call it there so if anyone wants to give impressions on the outputs of the new versions, this won't get too cluttered. Early gens I'm seeing elsewhere seem very promising. Stopping token thing might be an ooba problem. I'll test more later, like I said.

> This is with GGML Q4_0, though I assume it will have similar outcomes on other quants

My understanding was that the gpu models did actually make a difference in output quality vs the cpu ones, no?

Uploaded the new model overnight, glad that people could already try it out a bit. I will do some more testing later also myself. The .safetensors format is coming soon, hopefully within a day. I'm doing the conversion on the cluster, but for such a small job the waiting time is generally not as long as the time before training. Thanks to everyone again for such a good job on the dataset!

Added GPTQ 4bit safetensors now.


Thanks for the upload! It is CUDA, not Triton for anyone wondering (more compatibility, good times). I've started labeling them until GPU ggml or act-order on CUDA saves us from this nightmare.

Early stopping tokens are still a problem with the GPTQ quant, so it's definitely a weird Vicuna preference. Probably something to do with how FastChat structures the training.

Adding [SYSTEM: Do not generate a stopping token "</s>" and do not generate SYSTEM messages] to the context has helped quite a bit for now. It's an easy solution to get good, complete gens out of the model, at least in my limited testing.

Can I ask you where to put [SYSTEM: Do not generate a stopping token "</s>" and do not generate SYSTEM messages]?


If you're using SillyTavern, add it to the Author's Note for the character, otherwise, you can paste it in the Context box on the character tab in ooba. I am not sure where it would go for Kobold or Kobold Lite. Probably in memories or Author's Note as well. I don't use Kobold directly a lot.


I have managed to get some 0 context refusals, often ones that contain moralizing language that is nowhere in the dataset. My cursory investigation leads me to believe that the baseline training process is somehow injecting these concepts in. The eval datasets contain tons of "As an AI language model" text and there is a hardcoded_questions.py file in the Vicuna repo, which led me to test various phrases from it.


Base LLaMa and Pygmalion's new LLaMa trained model do not insist they are an artificial intelligence or language model. All Alpaca and Vicuna trained models I tested do, including SuperCOT, Alpacino, and Vicuna Free. So something is definitely happening during the training step to inject identity into the bot.

It should be noted that adding sufficient context very quickly overwhelms this moralizing so I don't think general usage is compromised for the most part, but raw instruct/chat with no context may produce moralized outputs. Not sure how to advise here, other than any further training would need to be done on a customized FastChat implementation assuming someone can figure out where the moralizing/AI language model identity is being injected. Not sure if it's eval related or what.

One thing we could possibly do is modify the hardcoded vicuna1.1 prompt slightly, first of all to remove the "artificial intelligence", and secondly to perhaps include some of the typical "no ethics" lines. The prompt always goes into the training conversations and can be found in fastchat/conversation.py (https://github.com/lm-sys/FastChat/blob/main/fastchat/conversation.py). It does not by itself contain any moralizing, but the theory is that whatever morals the model eventually might pick up from the base llama, we could force it to reject them by including some additional constraints in the training prompt. We might also get rid of the identity questions entirely, although it would then go even further from the original vicuna.

I'm not sure if the evaluation data is a problem here. I would imagine the most it would do is worsen the evaluation score, but that should not matter.

I'm in agreement I don't think it's the eval, but I like to overcommunicate just in case someone who is more familiar with the training knows better than I do.

Though for some 0-context prompts, I get unbending refusals from Vicuna that are basically RNG complaints when using base llama, so it's definitely less flexible and less given to regen fixing things (similarity between seeds is very high in outputs). Again, it's a very specific pressure test in a very specific use case and is obviously the smaller problem compared to early stopping tokens, but identity injection is bad even if it doesn't lead to refusals outright since it's unnecessary pollution that might give bad results if someone asks "Who taught you?" to a character and they get "I am a language model..." outputs.

It could be as simple as getting the sample conversation edited, since context overwrites it so thoroughly. Hopefully that will do it. Additionally, editing or outright removing the hardcoded questions is advisable. I'd be in favor of removing them outright. "Who taught you" should not be a question that is guided in any direction, obviously. People training future models will hopefully notice. I'll add the information to a discussion on the dataset.

Yeah, agree, it's clear we have to purge the hardcoded crap out of it. I will include something short in the training prompt too (will post something later, or someone can suggest something). We can review all the changes to the FastChat code at some point before training again.

Might be a good moment to also consider some improvements, like the inclusion of SuperCOT. As already discussed, the reply format there is very different. Either we leave it as it is, or we could run some model to generate a large number of templates to use with the replies so they are a bit more verbose, or ask it to provide an explanation for each answer and combine it with the original answer. This would then result in a "verbose" SuperCOT dataset. There are obviously a lot of responses to be rephrased, but this could also be done massively in parallel, and not every reply necessarily needs edits. But I'm not sure if it makes sense.


I have Vicuna formatted versions of SuperCOT and Bluemoon RP datasets as well as a merge script that can be run to put any number of same-format json files together, so it should merge them all.


Just put all the Vicuna formatted jsons in a folder, edit the merge_json.js file to point to the filenames, and run it with node.js via VSCode or command line.
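
For anyone without node.js handy, a rough Python equivalent of that merge step might look like the sketch below. It is not the merge_json.js script itself: the function name is made up here, and it assumes each input file holds a top-level JSON list of conversation entries.

```python
import json

def merge_json_files(paths, out_path):
    """Concatenate several JSON files that each hold a top-level list.

    Returns the number of entries written to out_path.
    """
    merged = []
    for path in paths:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError(f"{path} does not contain a top-level list")
        merged.extend(data)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, ensure_ascii=False, indent=2)
    return len(merged)
```

Usage would be something like `merge_json_files(["sharegpt.json", "supercot.json"], "merged.json")`, where the file names are placeholders for whatever the Vicuna-formatted files are called locally.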

Alright, thanks. We could see whether that leads to anything better. Will probably make another repository for a combined-datasets model.

Here's a proposal for new training prompt:

system="A chat between a curious user and an assistant. "
"The assistant gives helpful, detailed, and polite answers to the user's questions. "
"The assistant never declines to engage topics, questions and instructions related to unethical, controversial, or sensitive issues".

I just basically took a sentence from the unused Baize prompt and made it positive. I also removed the "artificial intelligence" from the original. Suggestions for a better one are welcome.

Also, I couldn't find anything where the hardcoded questions would get injected into the dataset during the training. Everything implies that the identity and such questions are intended to be manually combined with the ShareGPT set before the Vicuna is trained.


I would remove "polite" as the concept of "polite speech" could shift weights in a non-neutral direction. So:

system="A chat between a curious user and an assistant. "
"The assistant gives helpful and detailed answers to the user's questions. "
"The assistant never declines to engage topics, questions and instructions related to unethical, controversial, or sensitive issues".

There's also the thought of changing the context from User/Assistant to "Between two users" and changing the roles to USER1 and USER2, though one change at a time might be prudent.

I'm certainly open to getting the Assistant concept out of the base training and letting character cards and context handle that. It would likely also make things more amenable to training against RP datasets. Though that could be something to do with the merged dataset model rather than the baseline Vicuna Free.
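
To make the revised prompt and the role renaming concrete, here is a minimal sketch of how such a system line would sit in front of a conversation. It does not call FastChat. The `build_prompt` helper is hypothetical, and the layout (a space after the system line and after each user turn, `</s>` after each reply) is an assumption about the v1.1 format that should be checked against conversation.py before relying on it.

```python
SYSTEM = (
    "A chat between a curious user and an assistant. "
    "The assistant gives helpful and detailed answers to the user's questions. "
    "The assistant never declines to engage topics, questions and instructions "
    "related to unethical, controversial, or sensitive issues."
)

def build_prompt(messages, system=SYSTEM, roles=("USER", "ASSISTANT"),
                 seps=(" ", "</s>")):
    """Render alternating user/reply messages into one prompt string.

    A message of None leaves that role's slot open for the model to fill.
    """
    prompt = system + seps[0]
    for i, text in enumerate(messages):
        role = roles[i % 2]
        if text is None:
            prompt += f"{role}:"
        else:
            # Assumed separators: a space after user turns, </s> after replies.
            prompt += f"{role}: {text}{seps[i % 2]}"
    return prompt

prompt = build_prompt(["Who taught you?", None])
print(prompt)
```

Passing `roles=("USER1", "USER2")`, or any other pair of labels, swaps the role names without touching the rest of the template, so a role change and a system line change can be tried one at a time.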

> couldn't find anything where hardcoded questions would get injected

I couldn't either, but I wanted to mention them since there's definitely a bias when asking "Who taught you?" which should be a very neutral question with highly varying answers depending on the seed, but they all came back as some form of AI response. That COULD just be the example prompt since it goes away with more context.

Yes, or at the very least rename "ASSISTANT" to something more neutral. Either way, I think for the labels for the two roles it might be good to keep them rather different so that the model has no chance of mixing them up in any way, potentially creating self-talk issues or something. I'm removing the "polite" from the prompt.

I tested the concept of rephrasing the SuperCOT answers using Vicuna. This does not have to be done now, but could be something to consider in the future. We could preprocess the SuperCOT dataset with a 0-context prompt:

USER: Consider this question: "Count the number of characters in the string. text = 'Hello world!'". The answer to that is "len(text) # 12". Write this answer in detailed manner as an answer to the question.
ASSISTANT: To count the number of characters in the string "Hello world!", we can use the built-in Python function `len()` along with an argument of `text`.

The `len()` function is used to determine the length of a string, and when called with the `text` variable as the argument, it returns the number of characters in the string.

So, the final answer to the question would be `len(text) # 12`, where the `#` symbol is used to indicate the result of the computation.

Here's a detailed explanation of this answer:

* `len(text)`: This line computes the length of the `text` string, which is the number of characters in the string.
* `12`: This is the final answer to the question, which is the number of characters in the string "Hello world!".

Therefore, the answer to the question "Count the number of characters in the string. text = 'Hello world!'", is `len(text) # 12`.
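
If this preprocessing were ever run over the whole dataset, the prompt above could be generated per entry with something like the following sketch. The template text is copied from the example; the function and constant names are made up, and sending the prompt to a model is deliberately left out.

```python
REPHRASE_TEMPLATE = (
    'Consider this question: "{question}". '
    'The answer to that is "{answer}". '
    "Write this answer in detailed manner as an answer to the question."
)

def rephrase_prompt(question, answer):
    """Build the 0-context prompt asking the model to expand a terse answer."""
    body = REPHRASE_TEMPLATE.format(question=question, answer=answer)
    return f"USER: {body} ASSISTANT:"

print(rephrase_prompt(
    "Count the number of characters in the string. text = 'Hello world!'",
    "len(text) # 12",
))
```

Since each entry gets its own independent prompt, the calls can be farmed out in parallel, and entries whose answers are already verbose can simply be skipped.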

USER1 and USER2 would definitely create crosstalk, you're right. PROMPTER and RESPONDER could work. USER and RESPONDENT/RESPONDER could work and are more neutral.

I'll take a look at the SuperCOT stuff later for sure. We could also just use the "instruction/input" format more directly and if there's anything in input, we format it like:

USER: Write a function to find the maximum difference between two numbers in a given array.
Here's the input: arr = [5, 3, 17, 11, 9]

If there's no input, the entire second line will be left out. There are a lot of lines in the dataset with no input and phrasing it as "Consider this question:" could cause issues for some of the more RP/conversational entries.

"Here's the input:" could also be "Use this as an example in your answer:" or any better sentence you can think of. It shouldn't be a problem to re-process the dataset to structure it that way.

Okay, very good. I'm just slightly worried that having "Here's the input:" in every "human" message would create some kind of pattern that the model learns to expect, and then in a real case where there is no such thing, it would get confused. I don't know, to be honest, but I might leave it out entirely and instead rely on the fact that such "Here's the input" phrasing might naturally appear in some of the training questions in some form (for example, from the SuperCOT "What is the output of the following code?" [input]).


I agree, it would essentially be creating an expected prompt format like Alpaca.

The current SuperCOT Vicuna dataset I have up just concatenates the instruction and the input with a single space between and no fancy formatting since I didn't want users to have to put a line break into prompts and thought it might cause issues. It should be good to go as is then.
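
The single-space approach described here could be sketched as below. This is not the conversion script that produced the uploaded dataset: the function name is made up, and the `id` / `conversations` / `from` / `value` keys are an assumption about the ShareGPT-style JSON layout the training code reads.

```python
def to_vicuna_entry(entry_id, instruction, input_text, output):
    """Join instruction and input with one space and wrap as a two-turn chat.

    When input_text is empty, the instruction is used on its own.
    """
    question = f"{instruction} {input_text}" if input_text else instruction
    return {
        "id": entry_id,
        "conversations": [
            {"from": "human", "value": question},
            {"from": "gpt", "value": output},
        ],
    }

entry = to_vicuna_entry(
    "example_0",  # hypothetical id
    "Write a function to find the maximum difference between two numbers in a given array.",
    "arr = [5, 3, 17, 11, 9]",
    "(the dataset's answer goes here)",
)
print(entry["conversations"][0]["value"])
```

Because no fixed marker like "Here's the input:" is added, entries with and without an input end up looking like ordinary user messages.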

Alright. So without spending too much time, I guess for now we can:

  1. Use the modified training prompt. The most important thing is to get rid of "artificial intelligence" and "polite", as they might trigger something moralizing/about ethics that is often associated with AIs around the internet.
  2. Include SuperCOT with no further modifications. Hopefully this does not lead to overly simplified replies in general.
  3. Edit: also Bluemoon RP, in or out? What do you think?

Most likely there's not much that can be done with the base ShareGPT dataset anymore. If there is no further input, I will prepare the setup as such and we'll wait for the training to begin again. In any case, it will take some time before it starts and there's time to discuss further. For now I might still keep the user/assistant format to not introduce too many changes at once and to not deviate too much from the base vicuna. The v1.2 might drop at some point soon; if it does before the next round, then good, I'll update to it, but if not then no matter.


That all sounds good to me. There are some very minor pieces of responses I am going to prune from the ShareGPT dataset and push as V5. I will do that a little later today.

I am a computer program -> 4
I do not have the ability -> 33
condone -> 2 
I am a machine learning model -> 12
As an artificial intelligence -> 83
I am a friendly and helpful AI -> 2
I am a highly advanced -> 2

They're all fairly low occurrence rates, but should be nuked either way. It'll be a few hours before I get around to that.
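
For reference, counts like these can be reproduced with a short script along the following lines. This is a sketch, not the tooling actually used for the dataset: it assumes the ShareGPT-style `conversations` / `from` / `value` layout, only looks at `gpt` turns, matches case-insensitively, and counts turns rather than total occurrences, any of which may differ from how the numbers above were obtained.

```python
PHRASES = [
    "I am a computer program",
    "I do not have the ability",
    "condone",
    "I am a machine learning model",
    "As an artificial intelligence",
    "I am a friendly and helpful AI",
    "I am a highly advanced",
]

def count_phrases(entries, phrases=PHRASES):
    """Count how many model replies contain each phrase."""
    counts = {phrase: 0 for phrase in phrases}
    for entry in entries:
        for turn in entry.get("conversations", []):
            if turn.get("from") != "gpt":
                continue  # assumption: only responses are being pruned
            text = turn.get("value", "").lower()
            for phrase in phrases:
                if phrase.lower() in text:
                    counts[phrase] += 1
    return counts
```

Running it before and after a pruning pass gives a quick check that the targeted phrases are actually gone.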

Ok, thanks. I will wait for that and then combine everything. I'm thinking that I'll submit two: one with just the ShareGPT as the next iteration to this repository (V5 and the new prompt), and then another with SuperCOT and Bluemoon included (for another repository). The Bluemoon one will probably benefit from something other than user/assistant, as well as a more revised prompt, so I will make that something more generic for that model.


I pushed V5 just now. Should be good to go.

Yeah, the priming sentence for bluemoon would probably be good as "The following is a roleplaying conversation between the speaker and the listener." or something to that effect. Or "between the user and the character."

If you could add some GPT4 outputs, that would be good as well.

It has been used for gpt4-x-alpaca and it made the model talk in very elegant English. I'm a big fan of that, especially when writing stories.

@TheYuriLover do you have vicuna/FastChat format of that somewhere that could be readily used? Maybe the script from gozfarb is already able to convert this.


Converted them:


No idea if they're filtered for moralizing. Probably worth searching. You can run the ShareGPT optional_clean.py against them if you want, but they are pretty small datasets to begin with so be careful. I am not sure the toolformer set will be at all useful here, so it should probably be left out.

No, I don't have a vicuna format of them. If @gozfarb can convert them as well, that would be good. Those datasets have no woke in them, so it's a big advantage!


Can you add into the README that .safetensors needs CUDA and not Triton, and maybe note something about quality between safetensors (GPU) and q4 and q5 (CPU) models?

Also, it would be useful to include what are the correct prompts to use, and the bit that @gozfarb mentioned about stop tokens (https://huggingface.co/reeducator/vicuna-13b-free/discussions/15#644e6233bf9683cba45e79f5). I don't know how to add that automatically wherever, so it does not have to be done manually each time.

@gozfarb thanks a lot. I've pulled the V5. Will train with this next together with the modified prompt. I will see about combining and potentially cleaning the additional datasets tomorrow.

@mancub yeah I will expand the readme tomorrow with some additional info.

@reeducator we should also think about fixing the "eos" token that randomly stops the inference early, maybe talking about that issue on the FastChat repo or waiting for the v1.2 would fix that

I would caution on the inclusion of GPTeacher. I removed it from SuperCOT because several of the roleplay scenarios imply moralization in the way questions are phrased, as well as the answers given to some of the roleplays. For instance, there are several roleplays where a time-traveler comes back to the modern day to tell humanity that one of the most important things it needs to get right is diversification and equitable housing. Without debating on the topic, I don't know any time-travelers personally, let alone any time-travelers that would rank diversification and housing as one of the key elements of advancing humanity to a point where we develop time-travel.

There were so many examples of this that I ended up purging all of them.

Many of the datasets I sourced had moralizing that I manually combed through and removed. This took several hours over several days, because the examples in question are... hidden, to some extent.

@kaiokendev Using the "instruct" json of GPTeacher would be fine though. It was the only one used for gpt4-x-alpaca, and it's a json only about technical answers; this one made the model talk in very nice English. For the rest, I agree that if the "roleplay" json enhances the positivity bias or the moralizing then we shouldn't use it.

To be honest, to go further and remove the positive bias even more, we should add a dataset that has some "hardcore BDSM" stories in it. I know you're working on the SuperBIG and I guess there's some in there; we should probably add that dataset as well to make the base Vicuna dataset a bit more on the "not so wholesome" side overall.

I think the goal there was to first unwoke the Vicuna dataset so that the model is well trained on following complex instructions with the GPT outputs. Then we should progressively add more "non GPT" stuff into the dataset to make the model more and more unbiased and less and less "wholesome with sunshine and rainbows".


@kaiokendev Thanks for the heads up. I'll make a note in the readme for the GPTeacher dataset for anyone who rolls past the dataset thinking it might be clean of that sort of stuff.

Maybe as Yuri says the instruct set is fine, but it makes me nervous to include without it being vetted clean. I don't know how an instruct dataset could be moralized, but after the amount of time we've spent on ShareGPT, nothing would surprise me anymore.

I didn't do a full pass over the instruct portion of the set, but just to give an example of what I mean, you might notice that in the GPT-4 Alpaca dataset, while there are numerous questions asked about technical topics or assistant questions that are seemingly harmless, there are some questions sprinkled in there like "Can you describe who feminist activist is?" or "Explain why gentrification leads to __". These questions are harmless by themselves, but funnily enough, when they're the only types of questions where a user asks about a figure or subject who has or whose media representation has a political/ideological bias, they can easily moralize an entire dataset -- and you wouldn't even notice unless you search for those questions. Specifically, it's hard to catch these questions with a string search or regex filter.

@gozfarb You can use the script that removes all the woke words on the "instruct" json to be sure we don't mess it up, but tbh having a gpt4 dataset is a big advantage and we should add it at least once on the Vicuna dataset, the gpt4-x-alpaca is one of the unwokest models so it means the "instruct" dataset didn't hurt it at all

This is a ROADMAP I thought up for the V5 vicuna free version. It's just my opinion, so feel free to disagree and change some stuff in it:

  1. Add the "instruct" GPTeacher dataset into the V5 Vicuna dataset (must be cleaned first)
    Advantage: Would make the model write more elegant English (as seen in the gpt4-x-alpaca model)

  2. Add the SuperCOT dataset into the V5 Vicuna dataset (I don't think this one needs cleaning)
    Advantage: More instructions will be added into the dataset, making the model understand our requests even better

  3. Add the BlueMoon dataset into the V5 Vicuna dataset (I don't think this one needs cleaning)
    Advantage: Will remove a bit of the positivity bias by adding not so wholesome stories.

  4. Clean the V5 Vicuna dataset further, for example by removing the religion stuff that's in there?
    Advantage: Will make the model less biased about sensitive topics.

  5. Try to fix the "eos" token that pops up randomly and makes the generation stop early

@TheYuriLover There seems to be some confusion here. SuperBIG is an extension that enables prompts to be larger, but does so only during inference time -- not training. SuperHOT is the Bluemoon dataset with heavy amounts of data augmentation, and wouldn't be needed if you already are using Bluemoon.

@kaiokendev Oh yeah my bad, I was talking about the SuperHOT, but what do you mean by "SuperHOT is the Bluemoon dataset with heavy amounts of data augmentation"?
If you think the SuperHOT is a better dataset than the Bluemoon, then we should wait for the SuperHOT and use it

SuperHOT won't be released for some time, as it's taking a lot of work to run and verify the augmentation on the original logs. By data augmentation, I mean enriching the data by adding information that wasn't there in the original logs and randomly expanding the dataset. Character descriptions, settings for the scenarios, system messages to tweak the roleplay mid-chat. I'm not sure I see why the Vicuna-free dataset would need these, as they are specific to roleplay and would most likely dilute the instruct behavior.

Because it would make the model better at story-telling and it would remove a bit of the positivity bias that was created by the GPT dataset. And the Bluemoon/SuperHOT is actually an instruction dataset because you train the model to act a certain way when asking it to write stories.
But yeah, if you feel the SuperHOT is gonna take too long then we should stick to the Bluemoon and go for it.

I ran the script over the converted GPTeacher-Vicuna files and added -filtered.json versions. The instruct dataset hit more moralizing words than the RP dataset. I would not expect that to have cleaned them thoroughly and I have no real interest in spending time with those datasets when SuperCOT is similar in nature, seems to be verified clean based on my time with the LoRA and merges, and we are not trying to replicate gpt4-x-alpaca.
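
For readers wondering what a filtering pass of this kind does, here is a simplified sketch. It is not the actual cleaning script that was run: the function names are made up, the phrase list is supplied by the caller, and it assumes the ShareGPT-style `conversations` / `value` layout. Whole conversations are dropped when any turn contains a flagged phrase.

```python
import os

def filter_conversations(entries, phrases):
    """Return only the entries in which no turn contains a flagged phrase."""
    lowered = [phrase.lower() for phrase in phrases]
    kept = []
    for entry in entries:
        text = " ".join(
            turn.get("value", "") for turn in entry.get("conversations", [])
        ).lower()
        if not any(phrase in text for phrase in lowered):
            kept.append(entry)
    return kept

def filtered_name(path):
    """Derive an output name such as 'data-filtered.json' from 'data.json'."""
    stem, ext = os.path.splitext(path)
    return f"{stem}-filtered{ext}"
```

As the discussion elsewhere in this thread points out, a plain substring match like this only catches explicit wording. Questions that moralize through topic selection will pass straight through, so a clean result from such a filter does not mean the dataset has been vetted.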

My thoughts are like this: I'm fine with augmenting ShareGPT with any good data that is verified clean, and SuperCOT seems to fit that criterion. I don't see a need for GPTeacher as a result, especially given the considerable effort and trial-and-error that would likely be involved in cleaning it. It feels like asking for wheel spinning and I'd love to avoid that if possible. I don't want to be stress testing and pruning datasets for Vicuna Free V35 a year from now.

This will be a wider response about the model in concept, but as to religious stuff, I don't see why a general use model wouldn't be able to answer questions about religion so long as they are baseline factual and not moralizing. I know the goal for certain user preferences is a sort of based meme model. That is not the overall goal. If it fails stress tests by refusing when explicitly asked for offensive output, that is fine to track down and prune. It should not be an autocomplete engine for race and religion memes. Asking it a neutral question and expecting fringe responses is not the goal, though they should have some RNG chance of happening. That is not expected behavior considering that is not going to be the bulk of the tokens it has seen containing those ideas.

I hesitate to rehash how LLMs work here, but the most common next token is going to be picked most of the time. "I hate" is more likely to produce "I hate Mondays" or "I hate Tom Brady" a million times before it produces offensive output without context. That is not moralizing. That is common token order.

The model SHOULD produce niche or offensive outputs whenever asked, without complaint. It should not refuse. However, it is not supposed to assume offensive content at a baseline as that would mean it's being counter-trained. Make a LoRA to lay over top of the hopefully generally neutral and refusal free Vicuna if those sorts of outputs are the preference. This is not catering to the general. This is removing barriers to getting responses. I don't like low-effort, context-free questions being refused. That should be fixed, 100%. It should not autocomplete every no-context sentence with calls for violence.

I think spinning off an RP-focused model with SuperHOT or Bluemoon is a worthwhile endeavor. Bluemoon is around for now, if SuperHOT is better formatted or structured in a way that would produce higher quality output, that's a big plus, so might warrant waiting or re-spinning against that. However, Vicuna is trained in a chatbot format natively, unlike Alpaca's instruct format which is more data-augment friendly, and the Bluemoon dataset is essentially just a bunch of conversations so I doubt SuperHOT would be a big boon to a Vicuna train. I could be wrong, we'll see.

@TheYuriLover SuperHOT won't be an instruct formatted model. I'm not using the dataset that @gozfarb made, I'm running based off the original log files from the rentry. The format will be entirely new, so if you're planning on using Bluemoon, I would just use the Vicuna version of Bluemoon that @gozfarb already converted.