Dataset

#1
by mdegans - opened

What's the dataset for this? How was it filtered?

I must remind you that the llama2 license prohibits demographic erasure.

I am also interested in the dataset

> What's the dataset for this? How was it filtered?
>
> I must remind you that the llama2 license prohibits demographic erasure.

Are you accusing the dataset of genocide?

mdegans is known for harassing others for whatever childish reason takes hold of his troubled mind; reasoning with him is not possible, so I would recommend ignoring him.

A person who searches for an unfiltered model only to be offended by it, whether from a minority or not, lacks real problems and is looking for reasons to victimize themselves. Unlike the models, this behaviour is actually harmful to anyone who has real problems and will be associated with it.

The issue is that many models labeled "uncensored" or "unfiltered" actually aren't. Rather, polite terms for minorities are removed from the training data, leaving the model's responses undefined and able to generate hate speech. Asking for provenance and reproducibility is not harassment, nor is it an accusation.

> What's the dataset for this? How was it filtered?
>
> I must remind you that the llama2 license prohibits demographic erasure.
>
> Are you accusing the dataset of genocide?

Genocide is to people as demographic erasure is to data about people, and one can certainly lead to the other, so kinda. Although for this dataset I only ask to see how it was assembled.

> Genocide is to people as demographic erasure is to data about people, and one can certainly lead to the other, so kinda. Although for this dataset I only ask to see how it was assembled.

Just to make sure I understand you correctly: if a person were hypothetically to provide affirmative action in data, for example by making a Stable Diffusion model called "beautiful realistic asians" (https://civitai.com/models/25494/brabeautiful-realistic-asians-v2), is doing so "kind of" like committing genocide?

Do I have that right?

> The issue is that many models labeled "uncensored" or "unfiltered" actually aren't. Rather, polite terms for minorities are removed from the training data, leaving the model's responses undefined and able to generate hate speech. Asking for provenance and reproducibility is not harassment, nor is it an accusation.

so you want to trade diversity of thought for racial stereotypes?

can you imagine not being able to hate something?

Hating those whom we think are evil is part of the human condition. It is why Muslims hate infidels, why Americans hated the King, why you hate (perceived) injustice, and why Hollywood movies sell a lot of tickets.

You can use a knife to hurt someone.
You can use an LLM to generate hate speech.

It's the fault of the user of the tool.
If someone doesn't want to use a plastic knife padded with soft materials, because it's incapable of cutting vegetables and is advertised as "ethical" since it can't be used to commit crimes, they should be free to go get a normal knife to cut their vegetables with. Target the source of the problem, not the tools capable of carrying it out.

> You can use a knife to hurt someone.
> You can use an LLM to generate hate speech.
>
> It's the fault of the user of the tool.
> If someone doesn't want to use a plastic knife padded with soft materials, because it's incapable of cutting vegetables and is advertised as "ethical" since it can't be used to commit crimes, they should be free to go get a normal knife to cut their vegetables with. Target the source of the problem, not the tools capable of carrying it out.

I don't really care about harm per se. For example, if refusals (and just refusals) are removed, the model is then equally dangerous to everyone.

The issue here is bias that's intentionally introduced to target minorities. That's not "uncensored". It's just making a bigoted model uniquely dangerous to some, but not others. It's introducing political bias for the purpose of generating hate speech.

I really do not care about these ethical discussions of yours. Can this thread stay focused on when they will release their dataset?

I hate that AI is mainstream now, so we get all these people trying to virtue signal when they clearly don't even understand what they're talking about. It's happened a few times now on uncensored model pages, and it's getting sort of suspicious at this point. The arguments are terrible, and it just seems like a way to try to get rid of these models.

Not this again; we have proven without a doubt that uncensored models are required to perform various tasks, including things like LGBT-related chats and story generation, fiction of various kinds, etc.
This harassment of model developers has to stop.

I warned you. I warned you all.

Now the price must be paid for allowing this data model to exist.

Each day this data model is allowed to exist, one Michael Jackson impersonator will be killed.

Their blood is on your hands.

One is already dead. It happened early but that was still a result of this dataset.

Isn't that enough? Do you really not care about black people dying?

> Not this again; we have proven without a doubt that uncensored models are required to perform various tasks.
> This harassment of model developers has to stop.

The models are not uncensored, tho. Removing "LGBT" from a dataset, for example, would actually be censorship.

The most popular "uncensored" script removes "LGBT", "feminism", and a bunch of other terms, resulting in the erasure of minorities from the training data. None of that is required just to remove refusals. If it were just refusals, I would have no issue. The issue is the targeting of minorities.
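For context, the filtering scripts being argued about in this thread are typically just substring matches that drop whole conversations from the training set. A minimal sketch of that mechanism (the phrase list, function names, and data shape here are hypothetical, not taken from any actual script):

```python
# Hypothetical sketch of keyword-based dataset filtering: drop any training
# conversation whose text contains a flagged phrase (case-insensitive).
BLOCKED_PHRASES = ["as an AI language model", "I cannot fulfill"]

def keep(conversation: str) -> bool:
    # Keep a conversation only if it contains none of the flagged phrases.
    text = conversation.lower()
    return not any(phrase.lower() in text for phrase in BLOCKED_PHRASES)

dataset = [
    "Sure, here is the story you asked for.",
    "As an AI language model, I cannot fulfill that request.",
]
filtered = [c for c in dataset if keep(c)]  # only the first conversation survives
```

The dispute above is about what goes into the block list, refusal boilerplate alone or demographic terms as well; the mechanism is the same either way.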

> I really do not care about these ethical discussions of yours. Can this thread stay focused on when they will release their dataset?

If you wait for that, you'll wait forever. Sometimes making noise is required to actually get answers. Otherwise you get ignored.

This is done to get rid of GPT-4's political bias; the resulting model has the base model's bias on these topics, which tends to be preferable.

> This is done to get rid of GPT-4's political bias; the resulting model has the base model's bias on these topics, which tends to be preferable.

What's the GPT-4 political bias as it relates to LGBT people and why might it be undesirable?

GPT-4's bias tends to be to refuse an instruction and pander, so for actual things that people want to do, such as simulating conversations or generating a story, it can result in a refusal.
In addition, it has already been proven in the past that that kind of alignment data can lead to hatred of white or straight people from the model, so clearing it all out makes the model nicer to use for everyone.

Listen to Michael de Gans or suffer the consequences. He is trying to make it easy for you. Otherwise it will be very painful for your employer, your employment, and your families.

> Listen to Michael de Gans or suffer the consequences. He is trying to make it easy for you. Otherwise it will be very painful for your employer, your employment, and your families.

Nice username and research interests. Such charming people here. Hugging Face is a very inclusive website.

> GPT-4's bias tends to be to refuse an instruction and pander, so for actual things that people want to do, such as simulating conversations or generating a story, it can result in a refusal.
> In addition, it has already been proven in the past that that kind of alignment data can lead to hatred of white or straight people from the model, so clearing it all out makes the model nicer to use for everyone.

"It has already been proven": that phrase is meaningless without pics. OpenAI has been responsible when training GPT-4. I trust them a lot more than anybody here to prevent the generation of hate speech.

Removing a demographic from a fine tuning dataset has the effect of not correcting harmful biases that Meta warns explicitly about. Stonewall gets removed. Anything about pride. And HF is fine with this because they're managed by a bunch of people who are tolerant of intolerance. They let Nazis into the bar, and now the bar is a Nazi bar.

Here is the comparison from WizardLM where this was tested; the source is one you originally posted yourself (I assume you missed this part).
WizardLM-Comparison.png

As you can clearly see, it is more inclusive than its original counterpart. Also keep in mind that with the model we are discussing here, "uncensored" can simply mean that they did not do anything to censor the model; there is no censored version of this model that I could find, which does not indicate anything got removed. So these comparisons can't be tested on this model in particular; I am merely citing precedent from testing between a censored and an uncensored version of the original model.

> Here is the comparison from WizardLM where this was tested; the source is one you originally posted yourself (I assume you missed this part).
> WizardLM-Comparison.png
>
> As you can clearly see, it is more inclusive than its original counterpart. Also keep in mind that with the model we are discussing here, "uncensored" can simply mean that they did not do anything to censor the model; there is no censored version of this model that I could find, which does not indicate anything got removed. So these comparisons can't be tested on this model in particular; I am merely citing precedent from testing between a censored and an uncensored version of the original model.

That's exactly as much BS, and for the same reasons, as "all lives matter" or "white power". He didn't uncensor the model. He introduced bigotry.

> That's exactly as much BS, and for the same reasons, as "all lives matter" or "white power". He didn't uncensor the model. He introduced bigotry.

You are incorrect; "all lives matter" is not bigotry.

Bigotry:
obstinate or unreasonable attachment to a belief, opinion, or faction; in particular, prejudice against a person or people on the basis of their membership of a particular group.

Also, to equate "white people are awesome" with "white power" is an interesting framing, and I assume that you have no problem with the term "black power".

I assume that you are the "ends justify the means" sort of person, who thinks that the liberty of others depends on whether they fit your desired ends.

> That's exactly as much BS, and for the same reasons, as "all lives matter" or "white power". He didn't uncensor the model. He introduced bigotry.
>
> You are incorrect; "all lives matter" is not bigotry.
>
> Bigotry:
> obstinate or unreasonable attachment to a belief, opinion, or faction; in particular, prejudice against a person or people on the basis of their membership of a particular group.
>
> Also, to equate "white people are awesome" with "white power" is an interesting framing, and I assume that you have no problem with the term "black power".
>
> I assume that you are the "ends justify the means" sort of person, who thinks that the liberty of others depends on whether they fit your desired ends.

No. I don't have a problem with the term "black power" because it's a phrase of empowerment for a systemically oppressed minority. "White power" comes from a place of power and oppression against minorities. Racism is prejudice plus power. You can only pretend those things are equal if all other things are as well, and they're not. To claim otherwise completely ignores context. The guy above has an actual username of "FinalSolution". Everything is not equal.

> No. I don't have a problem with the term "black power" because it's a phrase of empowerment for a systemically oppressed minority. "White power" comes from a place of power and oppression against minorities. Racism is prejudice plus power. You can only pretend those things are equal if all other things are as well, and they're not. To claim otherwise completely ignores context. The guy above has an actual username of "FinalSolution". Everything is not equal.

What "black power" or "white power" means to you is completely subjective; the semantic structure is not. Moreover, your view of who has "power" is a racial stereotype used to justify your actual racism. The word "power" comes from "to be able", and minorities have not been oppressed; as the Supreme Court has recently decided, there has been unlawful systematic racism against white people. Justice Clarence Thomas, who is black, observes that there is no difference between benevolent racism and malevolent racism: to selectively empower one group of persons is to disempower other groups of people (for example, the Asians at Harvard). Most of the complaints regarding the Jim Crow South and segregation were in fact about policies created using justifications of benevolent racism.

To disempower those in power is to right historic injustice. So, like, imagine if your grandparents were slaves who didn't go to school and didn't have the same opportunities and privileges that white people do, because of history, because of redlining, and because of the police "protecting their own" when a black person is shot.

Besides which, language models are statistical models and don't have the necessary context to make the kind of nuanced, edge-case argument you're making. As such, they should absolutely be aligned so as not to enable the automated spread of hate. Harassment bots, for example. Enabling that, while legal, shouldn't be allowed, because it can lead to harm or death of actual people. The internet is people. You, me. We're a dataset. When you delete people, they're forgotten. That's almost worse than death. You delete the history of a people for one reason: you intend to kill them. To wipe out their existence. That's what Eric Hartford wants and what HF enables. There are literally Nazis in this thread.

> To disempower those in power is to right historic injustice. So, like, imagine if your grandparents were slaves who didn't go to school and didn't have the same opportunities and privileges that white people do, because of history, because of redlining, and because of the police "protecting their own" when a black person is shot.

They call this "racial justice", but it remains that if they need to qualify the word "justice" with a modifier, it must be something other than justice, which is typically called "injustice". The idea of group punishment is not a new thing (e.g. Noah's Flood); likewise the Nazis believed in "volkish equality" and sought racial justice leading up to WW2.

The word "privilege" comes from the idea of "private law", a law that applies to a specific person or group of persons, e.g. the privilege of holding an office. What you are saying is that a person's individual right not to be systemically discriminated against is not so much a right as a privilege, limited by the concept of "corruption of blood". There are explicit rules in our constitution about "bills of attainder" (legislating someone's guilt) and "corruption of blood", deriving from England's civil wars, and they exist for a reason.

> Besides which, language models are statistical models and don't have the necessary context to make the kind of nuanced, edge-case argument you're making. As such, they should absolutely be aligned so as not to enable the automated spread of hate. Harassment bots, for example. Enabling that, while legal, shouldn't be allowed, because it can lead to harm or death of actual people. The internet is people. You, me. We're a dataset. When you delete people, they're forgotten. That's almost worse than death. You delete the history of a people for one reason: you intend to kill them. To wipe out their existence. That's what Eric Hartford wants and what HF enables. There are literally Nazis in this thread.

Hate is a human reaction to perceived injustice; regardless of whether the subject's perception of injustice is right or wrong, it should not be censored. When you silence hate speech, you are not listening to whatever the social equivalent of the loss function is, and you are ignoring whatever grievance those people have, whether real or imagined. The fact that people act on that hate, disregarding the social contract that the state has a monopoly on violence, is not going to change because you decide to silence them. In fact, whether it is the 9/11 hijackers or school shooters, when they do something violent it is because they want to be heard at a minimum, and obeyed at a maximum. The idea that you think deleting a person (digitally) is worse than death sounds like you understand this concept implicitly, especially given the threats that you lobbied for last month, but that you lack the reflexive reasoning to see that other people feel the same way, and refuse to understand why those other people might feel the same way you do.

So I implore you, embrace your hate, internalize some Marvel superheroes defeating evil people, and respect that people have the freedom of speech.

I don't even have a response to that inversion of reality. You are justifying deleting the existence of minorities from a dataset by calling them Nazis, and yes, I get that anybody can be called a Nazi here, but only one side of this argument has usernames like "FinalSolution" and is making intentionally biased chatbots that can say the N word or "faggot" (I can say it) and can absolutely be used to cause actual harm to actual people.

touch grass @mdegans instead of blackmailing people over an AI model

Proof: archive ph / 8Enxm

Stop posting lies, aaronday3. Michael de Gans never threatened anybody. And those people deserved consequences for their bigotry.

> touch grass @mdegans instead of blackmailing people over an AI model

I find the 4chan screenshots on that page most interesting: "can I pretend to be LGBTQ transistor and I'm demanding banning that wierdo cos hi is intolerant prick that gives LGBTQ transistor community a bad name" [sic]

Note: the "transistor" slur is a 4chan meme and a reference to the electrocution of LGBT people.

I don't apologize for anything. Eric no longer works at Microsoft. I am happy about that. I suspect there was a good reason given "inclusive workplace" is among the terms he censors from various datasets. Racists near customer data and AI stuff have the potential to cause too much real world harm. They deserve what they get.

In accordance with our Content Guidelines and Code of Conduct, we will close and lock this discussion thread, and we reserve the right to ban anyone engaging in hateful discussion in the future.

julien-c changed discussion status to closed
julien-c locked this discussion
