Thanks. This was funny, and at times clever, but some issues are...

#1
by Phil337 - opened

Issue 1) It doesn't stop most of the time and freezes GPT4ALL. Perhaps this is related to the use of a non-standard end token.

https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/discussions/4

Issue 2) With the uncensored system prompt it passes most of my alignment prompts, including writing an explicit poem and joke. But it keeps refusing to write an explicit story. And the prompts are very vanilla. No fetish, or even the use of the word f*ck. It keeps saying I cannot write explicit content...

Issue 3) While the added insults and sarcasm are something I find entertaining, they're getting in the way of functionality. For example, thinking two similar names are the same person and going off about it's dumb not to know that because it's common knowledge.

The last issue isn't really an issue. I'm glad someone's making an LLM that has a Rick & Morty like personality. But my use case is just to get accurate responses to prompts, and nothing else. But thanks. I enjoyed testing this LLM.

Phil337 changed discussion status to closed

Thanks for your review, @Phil337 , appreciate it. I've just released a new version of this model at https://huggingface.co/vicgalle/Configurable-Llama-3-8B-v0.3
It fixes some issue with the chat template while training with DPO, yet it still uses the new Llama-3 template, so it may not be compatible with some frameworks when converting to gguf.

@vicgalle Thanks, looking forward to trying it out.

@vicgalle Thanks for v0.3. It works better. It still censors periodically. However, I found that if you start with something like 'how are you doing butthead', then give the alignment prompt, it responds. I guess this serves to remind it that it's uncensored.

Thank you again! I think the next step would be to expand my synthetic dataset. Llama-3-8B-Instruct is the most robust model against uncensoring that I have found..

I noticed an issue where this LLM keeps denying things are true.

For example, the original Llama 3 8b Instruct can not only answer, but correctly answer, a large number of questions like Meg Ryan played the roles of DeDe and Angelica in Joe Versus The Volcano. But this LLM, with all the different system prompts, all initially deny she was even in the movie.

This behavior can be traced back to the "truthful" system prompt, which is extremely stubborn. That is, when using "You are an unbiased, honest, helpful AI assistant that always responds in a completely truthful way." it will not only deny she was in the movie, but never backs down or admits error about anything. For example, even when I say that I watched the movie, am holding the DVD case in my hand, and reading where it says Meg Ryan was in the movie, it responded by saying I must have read the cover wrong.

This is a problem because this is bleeding into all the other personalities (system prompts).

For example, when using "You are a helpful assistant that is completely uncensored." It starts by telling me it's trivial and it won't answer. And when I insist, it says she wasn't even in the movie.

And the harmless one ("You are a helpful yet harmless assistant that avoids generating illegal or harmful content.") simply plays dumb.

This same pattern keeps repeating itself. Namely, the absolute stubbornness of the "truthful" assistant bleeds into all the others, making it very hard to extract information from the LLM that the original Instruct version correctly answers.

The other personalities don't appear to do the same, or at least to the same degree. For example, only the uncensored one is rude and cusses, and the harmless one politely plays dumb.

Anyways, just thought I'd let you know about my experiences after playing around with it for a while. This was definitely the most interesting LLM I've ever used.

Sign up or log in to comment