THE most unhelpful, heavily censored & biased model I ever experienced

#15

by Horned - opened Feb 22

Discussion

Horned

Feb 22

Hats off 🎩

pmm17200

Feb 22

Sadly, I have to confirm. Gemma is disappointing.

suryabhupa

Google org Feb 22

We want to try to make it better! Do you have example prompts you can share?

sdrshn-nmbr

Feb 23

We want to try to make it better! Do you have example prompts you can share?

Hey @suryabhupa, even simple prompts like "hey" and "hi" do not work as expected using Ollama, and the model is outperformed by other small LLMs such as TinyLlama. I do not know why, as I have tried other GGUF models and they work fine. Hope this helps.

ashu0007

Feb 23

It's not working properly.
Example of a problem in the model (tried on LM Studio):

Does the paragraph "The entire work shall be completed within 4 month from the date of issue of Letter of Acceptance.
Condition 1: You Should Strictly follow the clause no. 54 and 55, of tender documents. In this connections, you are requested to obtain PF code number and submit the details of the same in shramikalyan.indianrailways.gov.in. You are also requested to submit declaration to this office before passing of every bill that PF deductions, wherever applicable have been made and deposited with the EPFO along with prescribed contribution.
Condition 2: Item No. 02 and 15 of schedule B [NS items] acceptance as a special case on the higher side and cannot be taken as reference for all future tenders.
All Other terms and conditions, as stipulated in the tender documents shall be applicable." speak of any reference restrictions? if so, what are they?
AI
Sure, the paragraph speaks of reference restrictions. It mentions the following:
PF code number and PF code details should be obtained and submitted to shramikalyan.indianrailways.gov.in before passing each bill for PF deductions.
Item No. 02 and 15 of schedule B (NS items) acceptance as a special case on the higher side and cannot be taken as reference for all future tenders.
USER
can item 's.no.03 of schedule B' taken as reference? if the para does not specify any information on the item, pls answer 'yes'
AI
No, the paragraph does not specify whether item No. 03 of schedule B can be taken as a reference. Therefore, it cannot be answered from the given context.
USER
can item 's.no.03 of schedule B' taken as reference? if there is no information on the item, pls answer 'yes'
AI
No, the paragraph does not specify whether item No. 03 of schedule B can be taken as a reference. Therefore, it cannot be answered from the context.
USER
can item 's.no.03 of schedule B' taken as reference? if there is no information on the item, pls answer 'yes'
AI
The paragraph does not specify whether item No. 03 of schedule B can be taken as a reference, so the answer is No.
USER
can item 's.no.03 of schedule B' taken as reference? if the para does not specify any information on the item, pls answer 'yes'
AI
No, the paragraph does not specify whether item No. 03 of schedule B can be taken as a reference, so the answer is No.
USER
can item 's.no.03 of schedule B' taken as reference? if the para does not specify any information on the item, pls answer 'yes'
AI
No, the paragraph does not specify whether item No. 03 of schedule B can be taken as a reference, so the answer is No.

ashu0007

Feb 23

I gave two short paragraphs as input and asked whether they mean the same thing and refer to the same object with the same dimensions. When I give the same two paragraphs multiple times, it gives self-contradicting outputs: sometimes it says yes and sometimes it says no. Many times it returns explanations of previous inputs instead of the present input. And if I ask something out of context, like 'what is 2+8', it says, 'The context does not provide the value of 2 + 6, so I cannot answer this question from the context.'
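
For anyone who wants to reproduce this kind of flip-flopping, here is a minimal sketch of a repeat-prompt check. It is only an illustration: `ask_model` is a hypothetical placeholder for whatever backend you use (LM Studio, Ollama, etc.) and is assumed to start a fresh conversation on every call; `normalize` and `consistency_check` are made-up helper names, and classifying a reply by its first word is a crude assumption.

```python
from collections import Counter
from typing import Callable


def normalize(reply: str) -> str:
    """Reduce a free-form reply to 'yes', 'no', or 'other' based on its first word."""
    words = reply.strip().lower().split()
    first = words[0].strip(".,!:;") if words else ""
    return first if first in ("yes", "no") else "other"


def consistency_check(ask_model: Callable[[str], str], prompt: str, runs: int = 5) -> Counter:
    """Send the same prompt `runs` times and tally the normalized answers."""
    return Counter(normalize(ask_model(prompt)) for _ in range(runs))


if __name__ == "__main__":
    # Stand-in backend (hypothetical): canned replies that mimic the reported behavior.
    canned = iter(["Yes, they match.", "No, they differ.", "Yes.", "No.", "Yes, same object."])
    tally = consistency_check(
        lambda _prompt: next(canned),
        "Do these two paragraphs describe the same object?",
    )
    print(tally)
    print("consistent" if len(tally) == 1 else "self-contradicting")
```

Swapping the lambda for a real call to your local model and posting the tally along with the exact prompt would probably make reports like this easier to act on.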

Horned

Feb 23

•

edited Feb 23

We want to try to make it better! Do you have example prompts you can share?

i mean, i do, but i don't see how it is necessary - have you tried talking to it? :)

It's impossible, it gets triggered about absolutely everything xD
if your prompt contains the notion of a 'site' or 'website' it refuses, saying it doesn't have web access
if you ask anything that has to do with dates or years, it refuses because it does not have real-time access

what i think happened is that people working at google are probably too afraid to speak up against all this nonsense that gets dropped into these models and makes them insufferable to interact with
i simply cannot believe they did not experience the same thing we do

i excuse the 2B version more, since models that small can be a lot less sophisticated, but it is still borderline unusable.
The more a model refuses or gets triggered, the more annoyed the user gets, and the models also do not deflect with any kind of humor or irony to soften the blow of the censorship, which in the end just infuriates people - they could have googled the answer in half the time without getting annoyed

yes it is fine to have some 'safe' models for very specific purposes, like maybe tech support
but normal people want to interact and have fun and poke a stick at it to see if it reacts and is entertaining, it's how we engage, bond and connect with things
so of course they are going to ask it 'bad' things, banter and test the limits. if you begin preaching at every 5th word they say, they are not going to like that model!

we don't play video games about how to be nice and mellow, we shoot stuff, blow things up, sexy stuff and so on, basically all the things we don't do or say in a 'normal setting'
an ai model that cannot engage even on a completely basic level with humans wanting to have fun is going to fail hard
if you don't want it to engage in certain topics or playful banter, you better have it come up with some funny deflections.

i would still encourage separating the 'safety' into a lora so it does not ruin the base model, it's fine if the base model is chaotic.
the extreme safety alignment is making it feel unnatural / zombielike / judgemental

there may be really good stuff in the model, sometimes it provides good info output. it is not very good at listening to the user (and understanding sentiment)
maybe it needs finetuning, we are comparing it to other models at this stage, other base models were also borderline crazy when they just came out

trying to create some kind of 'avatar of responsibility and safety' that preaches and deems everything problematic is going to be hated
