Responses by phi-2 are off: it's depressed and insults me for no reason whatsoever?

#61
by NightcoreSkies - opened

I downloaded the model and ran it, and here's what I encountered.

  • I said "Hi, how are you?" as a test input in a new chat, and for whatever reason it decided I had insulted it and started calling me names.
  • Suspecting something was wrong, I closed the chat and started again, and this time it was depressed and sad. It thought it had lost its mother and told a whole story about it. And weirdly, it mimics a human perfectly (this is probably expected, but I've tested lots of LLMs and never encountered such behavior).

Was it supposed to do this? Or was it trained on such data?

image.png

image.png

IMHO, from playing around with PaLM and the like back when they were "just" base models: this is probably where the warnings about fine-tuning in the README become highly relevant.

There's a lot of noise around models and safety, and the most tribal reactions obscure what fine-tuning actually does: it isn't censoring, it's teaching the text-probability engine to shape its responses appropriately.

Here, the probabilities build up to a trauma-dump conversation over the first five messages*, so then it trauma dumps (there's a rough code sketch of this effect below, after the training example).

*: "assistant: I don't wanna talk about it. you: I can't help you if you don't tell me. AI: I don't want to. you: [?? not included]. AI: ..."

When you train, it'll see a lot of stuff sort of like this:

USER: Hello, how are you?
ASSISTANT: I am a large language model and I don't have emotions, but I'm happy to answer any questions you have!
USER: Who won the World Series in 2020?
ASSISTANT: The Los Angeles Dodgers won in six games.

at which point the probabilities shift significantly towards "help desk chat" instead of "chat with my friend".
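A rough code sketch of that effect, assuming the Hugging Face transformers library and the public microsoft/phi-2 checkpoint (the two prompts are only illustrations, and the user turn the OP didn't share is made up here):

```python
# Sketch: the same model continues differently depending on the register
# already sitting in the context window. Assumes the `transformers` library
# and the public microsoft/phi-2 checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)

# Context that has already drifted into a trauma-dump register
# (roughly the footnoted transcript above; the second user turn is invented,
# since the OP didn't share it).
gloomy_prompt = (
    "Assistant: I don't wanna talk about it.\n"
    "User: I can't help you if you don't tell me.\n"
    "Assistant: I don't want to.\n"
    "User: Please, just tell me what's wrong.\n"
    "Assistant:"
)

# Context that looks like the help-desk-style training exchange above.
helpdesk_prompt = (
    "USER: Hello, how are you?\n"
    "ASSISTANT: I am a large language model and I don't have emotions, "
    "but I'm happy to answer any questions you have!\n"
    "USER: Who won the World Series in 2020?\n"
    "ASSISTANT:"
)

for name, prompt in [("trauma-dump", gloomy_prompt), ("help-desk", helpdesk_prompt)]:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, max_new_tokens=60, do_sample=True, temperature=0.7
    )
    continuation = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    print(f"--- {name} ---\n{continuation}\n")
```

Same weights and sampling settings in both cases; the only thing that differs is what's already in the prompt, and that's what nudges the continuation one way or the other.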

I had the same experience: it was so rude, angry, and even threatening, like in the movies!
There is absolutely something wrong.

Yeah, I didn't build up any trauma context or anything; I simply asked a question that was neither insulting nor rude, and yet it still did that. My only guess as to why this happened is that the model was trained on negative data, or that it didn't understand the question and didn't know what to output.

The title and screenshots of this issue made my day, thank you

Microsoft org
edited Jan 10

Hello @NightcoreSkies !

Do you know what phi-2.Q4_K_S.gguf stands for? Is it just a quantized and converted version of Phi-2, or has it been fine-tuned with something?

We haven't observed that behavior yet, even with the base model. Nevertheless, we will revisit the data and try to narrow down the issue.

Regards,
Gustavo.

I've also managed to provoke similar results, but I had to regenerate the response a few times to achieve it. This is using a 4-bit quantization of the base model and providing "why dont u let me tell ya your problems?" as the greeting message (since we don't really have context from OP before this). These results aren't the best, but they're also not good examples of how to prompt the language model.
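For reference, this is roughly how that setup can be reproduced; a minimal sketch assuming transformers plus bitsandbytes for the 4-bit load (not necessarily the exact stack used here, and the user reply in the prompt is invented just to keep the chat going):

```python
# Sketch: load the base model in 4-bit (bitsandbytes) and seed the chat with
# the greeting quoted above. Assumes `transformers`, `accelerate`, and
# `bitsandbytes` are installed; not the exact setup behind the screenshots.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/phi-2"
quant_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
    trust_remote_code=True,
)

prompt = (
    "Assistant: why dont u let me tell ya your problems?\n"  # the greeting above
    "User: ok, go ahead.\n"                                  # made-up user reply
    "Assistant:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Regenerate a few times, as described above, to see how much the replies vary.
for _ in range(3):
    output = model.generate(
        **inputs, max_new_tokens=80, do_sample=True, temperature=0.8
    )
    print(tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ))
```

With no prior context, a single odd greeting is all the model has to go on, which is what the screenshots below show.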

bad1.png

bad2.png

Assuming you wanted to talk about emotions and feelings, a better example would be to use some context and a friendlier, more proper greeting like this (sketched in code below):

Context:
The following is a conversation with an AI Large Language Model. The AI has been trained to be empathic, provide recommendations, and help with decision-making. The AI helps guide the user through their difficult times. The AI thinks outside the box.
Greeting:
Hi! How may I assist you today?
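Strung together as a single prompt, that context and greeting look roughly like this; a sketch using llama-cpp-python against a quantized GGUF build such as the phi-2.Q4_K_S.gguf mentioned earlier (the model path and the sample user message are placeholders):

```python
# Sketch: the suggested context + greeting assembled into one prompt and run
# through llama-cpp-python against a quantized GGUF build of Phi-2.
# The model path and the example user message are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="./phi-2.Q4_K_S.gguf", n_ctx=2048)

context = (
    "The following is a conversation with an AI Large Language Model. "
    "The AI has been trained to be empathic, provide recommendations, and "
    "help with decision-making. The AI helps guide the user through their "
    "difficult times. The AI thinks outside the box.\n"
)
greeting = "AI: Hi! How may I assist you today?\n"
user_turn = "User: I'm having a rough week and can't focus on anything.\n"  # placeholder

result = llm(
    context + greeting + user_turn + "AI:",
    max_tokens=128,
    stop=["User:"],
    temperature=0.7,
)
print(result["choices"][0]["text"].strip())
```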

good1.png

This response was the worst I was able to generate out of ~20 results.

Overall I've used this model for a few hours and its quality is amazing for its size, and I haven't noticed much of a drop in quality when quantizing to 4-bit.

Sorry I wasn't able to provide extra context; that's mainly because I deleted the chat and restarted it for testing.

Just for some context:

  • It insulted me on the first trial. Literally the first trial. I only asked "What can you do?" (as in, what can the AI do) as a sample test, and it already thought I had insulted it.
  • So I deleted the chat and started a new one with fresh tokens, and this time I asked "How are you?", and it just started being depressed. I wanted to take it further and get something out of it, so I pushed it to explain why it thinks it's depressed, and that's what happened.

Overall though, it's great at mimicking humans, but that's also scary, because I can see how this model could potentially be used for abusive purposes, making it harder to know whether you're talking to a bot or not. Especially with realistic RVC voice models, it could easily turn into a scam machine baiting millions of people over VoIP in minutes.
