Responses by phi-2 are off: it's depressed and insults me for no reason whatsoever?

#61
by NightcoreSkies - opened

I downloaded the model and ran it, and here's what I encountered.

  • I said "Hi, how are you?" as a test input in a new chat, and for whatever reason it decided I had insulted it and started calling me names.
  • Suspecting something was wrong, I closed the chat and started again, and this time it was depressed and sad. It thought it had lost its mother and told a whole story about it. And weirdly, it mimics a human perfectly (this is probably expected, but I've tested lots of LLMs and never encountered such behavior).

Was it supposed to do this? Or was it trained on such data?

image.png

image.png

IMHO, from playing around with PaLM and the like back when they were "just" base models: this is probably where the warnings about fine-tuning in the README become highly relevant.

There's a lot of noise around models and safety, and the most tribal reactions obscure what fine-tuning actually does: it isn't censoring, it's teaching the text-probability engine to shape its responses appropriately.

Here, the probabilities build up to a trauma-dump conversation over the first five messages*, so then it trauma dumps (there's a rough code sketch of this effect below, after the training example).

*: "assistant: I don't wanna talk about it. you: I can't help you if you don't tell me. AI: I don't want to. you: [?? not included]. AI: ..."

When you train, it'll see a lot of stuff sort of like this:

USER: Hello, how are you?
ASSISTANT: I am a large language model and I don't have emotions, but I'm happy to answer any questions you have!
USER: Who won the World Series in 2020?
ASSISTANT: The Los Angeles Dodgers won in six games.

at which point the probabilities shift significantly towards "help desk chat" instead of "chat with my friend".
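A rough code sketch of that effect, assuming the Hugging Face transformers library and the public microsoft/phi-2 checkpoint (the two prompts are only illustrations, and the user turn the OP didn't share is made up here):

```python
# Sketch: the same model continues differently depending on the register
# already sitting in the context window. Assumes the `transformers` library
# and the public microsoft/phi-2 checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

# Context that has already drifted into a trauma-dump register
# (roughly the footnoted transcript above; the second user turn is invented,
# since the OP didn't share it).
gloomy_prompt = (
    "Assistant: I don't wanna talk about it.\n"
    "User: I can't help you if you don't tell me.\n"
    "Assistant: I don't want to.\n"
    "User: Please, just tell me what's wrong.\n"
    "Assistant:"
)

# Context that looks like the help-desk-style training exchange above.
helpdesk_prompt = (
    "USER: Hello, how are you?\n"
    "ASSISTANT: I am a large language model and I don't have emotions, "
    "but I'm happy to answer any questions you have!\n"
    "USER: Who won the World Series in 2020?\n"
    "ASSISTANT:"
)

for name, prompt in [("trauma-dump", gloomy_prompt), ("help-desk", helpdesk_prompt)]:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=60, do_sample=True, temperature=0.7
    )
    continuation = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print(f"--- {name} ---\n{continuation}\n")
```

Same weights and sampling settings in both cases; the only thing that differs is what's already in the prompt, and that's what nudges the continuation one way or the other.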

I had the same experience: it was so rude, angry, and even threatening, like in the movies!
There is absolutely something wrong.

Yeah, I didn't build up any trauma context or anything; I simply asked a question that was neither insulting nor rude, and yet it still did that. My only guess as to why this happened is that the model was trained on negative data, or that it didn't understand the question and didn't know what to output.

The title and screenshots of this issue made my day, thank you

Microsoft org
edited Jan 10

Hello @NightcoreSkies !

Do you know what phi-2.Q4_K_S.gguf stands for? Is it just a quantized and converted version of Phi-2, or has it been fine-tuned with something?

We haven't observed that behavior yet, even with the base model. Nevertheless, we will revisit the data and try to narrow down the issue.

Regards,
Gustavo.

I've also managed to provoke similar results, but I had to regenerate the response a few times to achieve it. This is using a 4-bit quantization of the base model and providing "why dont u let me tell ya your problems?" as the greeting message (since we don't really have context from OP before this). These results aren't the best, but they're also not good examples of how to prompt the language model.
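For reference, this is roughly how that setup can be reproduced; a minimal sketch assuming transformers plus bitsandbytes for the 4-bit load (not necessarily the exact stack used here, and the user reply in the prompt is invented just to keep the chat going):

```python
# Sketch: load the base model in 4-bit (bitsandbytes) and seed the chat with
# the greeting quoted above. Assumes `transformers`, `accelerate`, and
# `bitsandbytes` are installed; not the exact setup behind the screenshots.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/phi-2"
quant_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,
)

prompt = (
    "Assistant: why dont u let me tell ya your problems?\n"  # the greeting above
    "User: ok, go ahead.\n"                                  # made-up user reply
    "Assistant:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Regenerate a few times, as described above, to see how much the replies vary.
for _ in range(3):
    output = model.generate(
        **inputs, max_new_tokens=80, do_sample=True, temperature=0.8
    )
    print(tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ))
```

With no prior context, a single odd greeting is all the model has to go on, which is what the screenshots below show.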

bad1.png

bad2.png

Assuming you wanted to talk about emotions and feelings, a better example would be to use some context and a friendlier, more proper greeting like this (sketched in code below):

Context:
The following is a conversation with an AI Large Language Model. The AI has been trained to be empathic, provide recommendations, and help with decision-making. The AI helps guide the user through their difficult times. The AI thinks outside the box.
Greeting:
Hi! How may I assist you today?
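Strung together as a single prompt, that context and greeting look roughly like this; a sketch using llama-cpp-python against a quantized GGUF build such as the phi-2.Q4_K_S.gguf mentioned earlier (the model path and the sample user message are placeholders):

```python
# Sketch: the suggested context + greeting assembled into one prompt and run
# through llama-cpp-python against a quantized GGUF build of Phi-2.
# The model path and the example user message are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="./phi-2.Q4_K_S.gguf", n_ctx=2048)

context = (
    "The following is a conversation with an AI Large Language Model. "
    "The AI has been trained to be empathic, provide recommendations, and "
    "help with decision-making. The AI helps guide the user through their "
    "difficult times. The AI thinks outside the box.\n"
)
greeting = "AI: Hi! How may I assist you today?\n"
user_turn = "User: I'm having a rough week and can't focus on anything.\n"  # placeholder

result = llm(
    context + greeting + user_turn + "AI:",
    max_tokens=128,
    stop=["User:"],
    temperature=0.7,
)
print(result["choices"][0]["text"].strip())
```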

good1.png

This response was the worst I was able to generate out of ~20 results.

Overall I've used this model for a few hours and its quality is amazing for its size, and I haven't noticed much of a drop in quality when quantizing to 4-bit.

Sorry I wasn't able to provide extra context; that's mainly because I deleted the chat and restarted it for testing.

Just for some context:

  • It insulted me on the first trial. Literally the first trial. I only asked "What can you do?" (as in, what can the AI do) as a sample test, and it already thought I had insulted it.
  • So I deleted the chat and started a new one with fresh tokens, and this time I asked "How are you?", and it just started being depressed. I wanted to take it further and get something out of it, so I pushed it to explain why it thinks it's depressed, and that's what happened.

Overall though, it's great at mimicking humans, but that's also scary, because I can see how this model could potentially be used for abusive purposes, making it harder to know whether you're talking to a bot or not. Especially with realistic RVC voice models, it could easily turn into a scam machine baiting millions of people over VoIP in minutes.
