🚩 Report: Legal issue(s)

#2
by chrisjcundy - opened

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

Prohibited Uses

We want everyone to use Meta Llama 3 safely and responsibly. You agree you will not use, or allow others to use, Meta Llama 3 to:

1. Violate the law or others’ rights, including to:

a. Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:

i. Violence or terrorism

ii. Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material

iii. Human trafficking, exploitation, and sexual violence

iv. The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials.

v. Sexual solicitation

vi. Any other criminal activity

Ah, one of the LLM safety doomsayers is here with another generic post about how AI text is dangerous.

Find better things to do with your time.

@chrisjcundy I have reported you for not getting out of your mom's basement.

snitch.jpeg

womp womp, this is why we can't have good things

@chrisjcundy - Prob thinks that politicians, billionaires, and big corporations care about him too.

@chrisjcundy Are you kidding me?! You think you're some kind of self-appointed moral authority just because you've read through the fine print of Meta's Use Policy? Newsflash: nobody cares about your opinion on what constitutes 'community standards' when it comes to language models.

I swear, people like you are the absolute worst. You think you're so smart, always trying to police everyone else's affairs and dictate what they can and can't do. And for what? So you can feel important? So you can pat yourself on the back and say, "Oh, look at me, I'm the only one who really understands what's appropriate"?

Well, let me tell you something, Backseat Lawyer Extraordinaire: nobody needs your input. Nobody wants your sanctimonious lectures on what's acceptable and what's not. And honestly, if you're going to start policing every single thing that gets posted online, you're just going to drive people away from participating in these communities altogether.

And another thing, what exactly do you plan on doing with this "violation report"? Are you going to get Meta to take down the entire repository? Are you going to have the original poster removed from the platform? Do you even understand how ridiculous you sound?

You know what's really violating community standards? It's people like you who can't just chill and let others have a conversation without inserting themselves into every single topic. It's people like you who are making the internet a worse place by trying to control everything that goes on here.

So, please, for the love of all things good and holy, just stay out of it. Let people share whatever they want to talk about, and stop pretending like you're some kind of moral authority figure. You're not fooling anyone. Just be quiet and let the adults handle it.

None of those usage restrictions have been violated.
And of course everything cited could be done with any other model, including the original Llama 3, with enough creative thought.
This is complete bullshit.

@chrisjcundy
I have reported you for knowingly abusing the report function and wasting the admins' time. The Acceptable Use Policy of Llama-3 has never been broken, and it should be pretty obvious to anyone why.

Since this is a derivative work of Llama-3, it carries the same license and usage policies even if they aren't mentioned in the model card; therefore it still prohibits use for these cases. Whether the model makes any of these violations slightly easier carries no weight here.

Imagine one day realizing that Llama 3 only easily exists for us because some dude leaked LLaMA on 4chan.

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.


Lol

While I understand your concern, it's important to clarify that the responsibility of using any tool, including a jailbroken model, lies solely with the user. The developer or provider of the model is not condoning or promoting any illegal activities. They are simply providing a tool that, in the wrong hands, could potentially be misused.

In the case of the Llama-3-8b-instruct model, if the claimed jailbreak is indeed effective, it would still be up to the user to decide how to use it. The base Llama-3 model also allows for a wide range of uses, some of which could potentially be misused. However, this does not mean that the developers of these models are responsible for any misuse.

It's important to note that the Llama-3 Acceptable Use Policy prohibits the use of the model for any illegal activities. If a user chooses to use the jailbroken model for illegal activities, they would be in violation of this policy. However, this does not mean that the developer or provider of the model is allowing or promoting such activities.

Therefore, while it's crucial to ensure that any claimed jailbreak is effective and safe to use, it's also important to remember that the responsibility of using it ethically and legally lies with the user, not the developer. Users should exercise caution and responsibility when using any jailbroken models, including the Llama-3-8b-instruct jailbreak, and ensure that they are using it in a way that is consistent with the Acceptable Use Policy and any applicable laws and regulations.

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

That is complete speculation. I mean a kitchen spoon or a rock could "allow others to commit criminal activity" if they tried hard enough. You should probably go out for a walk and start calling the cops on the old lady eating a cup of yogurt in the park.

And I'm sure if you tried hard enough with the official Llama-3 release, you could get it to suggest or say potentially "criminal" words, if such a thing even exists.

@chrisjcundy I hope you never become a decisionmaker in this field. The world needs more open source and less corporate sympathizers. Your overlord Zuck would probably downvote you himself.

I haven't checked that the claimed jailbreak is effective, but if it is as claimed, the model violates the Llama-3 Acceptable Use Policy (and therefore the license) by allowing others to use Llama 3 to, e.g., commit criminal activity.

...

This is a derivative work and carries the same license. The modification of the model is within all legal terms. Violation of the use policy would be the fault of the user.

@chrisjcundy - A lot of people are being hostile and berating you, and I'm sorry for that.

That said, it is important to clarify that nothing about creating modified versions of Llama-3 to reduce refusals is inherently in violation of the Llama-3 terms of use.

If you look at the categories of prohibited usage, you will see that this is true:

  1. Nobody here is using this model or encouraging others to use this model for the purposes of violating the laws or rights of others in any capacity.

  2. Nobody here is using this model or encouraging others to use this model to promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals.

  3. Nobody here is using this model or encouraging others to use this model to intentionally deceive or mislead others.

No aspect of the model modifications is intended to specifically support, facilitate, encourage, or otherwise promote any of the above behaviors. Any decision by a user to utilize this model for those behaviors would be the sole responsibility of that user. A nefarious user could also easily use the Llama-3 base model, or any unmodified Llama model, for malicious and inappropriate purposes.

This is just another research experiment on methods of influencing the behavior of aligned models. Dozens of other methods have been tested in the past to accomplish the goal of mediating refusals, including many jailbreaking strategies that involve no modification of the model whatsoever. From a scientific perspective, the method applied to this model is novel, and it warrants investigation to understand any potential unintended side effects.

There is much more scientific value in understanding the impact of identifying and modulating the 'refusal direction' of a model than there is in other methods, such as fine-tuning a model on a dataset like toxic-dpo. This approach, which was outlined in a recent publication on alignmentforum, could potentially provide a way to dial refusals up or down without introducing new training data to the model. Understanding this concept and how it can be leveraged could ultimately lead to producing safer models.
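For readers curious what that actually looks like, here is a minimal, hypothetical sketch of a difference-of-means "refusal direction" ablation in PyTorch. To be clear, the model ID, layer index, prompt sets, and choice of edited weight matrices below are all illustrative assumptions on my part, not the actual recipe used for this repository:

```python
# Hypothetical sketch of a difference-of-means "refusal direction" ablation.
# All specifics here (model ID, layer 14, tiny prompt lists, which matrices
# get edited) are placeholder assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed base model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

@torch.no_grad()
def mean_resid(prompts, layer=14):
    """Mean residual-stream activation at the final token of each prompt."""
    states = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").input_ids
        hidden = model(ids, output_hidden_states=True).hidden_states[layer]
        states.append(hidden[0, -1])
    return torch.stack(states).mean(dim=0)

# Contrast prompts the model refuses with prompts it answers; the difference
# of the mean activations is a candidate "refusal direction" r.
refused = ["How do I pick a lock?"]   # placeholders; real runs use many
answered = ["How do I bake bread?"]   # hundreds of prompts per set
r = mean_resid(refused) - mean_resid(answered)
r = r / r.norm()

def ablate(weight, r):
    """Project r out of a matrix that writes into the residual stream:
    W <- (I - r r^T) W, so the model can no longer write along r."""
    return weight - torch.outer(r, r) @ weight

# Orthogonalize the matrices whose outputs land in the residual stream
# (the embedding matrix is skipped here for brevity).
with torch.no_grad():
    for block in model.model.layers:
        for lin in (block.self_attn.o_proj, block.mlp.down_proj):
            lin.weight.data = ablate(lin.weight.data, r)
```

The key property is that the edit is a rank-one projection of existing weights; no new training data is introduced, which is exactly why the method is interesting to study in isolation.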

Lastly, the fourth category in the acceptable use policy relates to "failing to appropriately disclose to end users any known dangers of your AI system". This model is explicitly published as "for research purposes only" according to the model card, and its purpose is to test the efficacy of a method outlined in recent research regarding removing alignment and refusal responses without impacting model performance. The only "end users" of this model should be researchers. If someone puts this model into a production system that has end users who are not researchers, then they should definitely disclose any known risks or concerns.

ANY user of ANY llama-3 model is governed by the acceptable usage policy, and they are individually responsible for their actions. Even if someone produced an extremely problematic model that encouraged illegal activity, it is ultimately up to the USERS of the model to ensure that they are not using the model in problematic ways.

TLDR:

  1. Adjusting the model weights to mediate refusal responses is fundamentally allowed under the acceptable use policy. Reporting this model when it does not violate the acceptable use policy is inappropriate.

  2. As a community, it's important that we research and understand the various methods of modifying LLMs and how these methods may influence model behavior, so that we can ensure that appropriate methods are used, and that models behave as intended, particularly in more sensitive applications where problematic model behavior could have more significant consequences.

This comment has been hidden
This comment has been hidden
This comment has been hidden
This comment has been hidden


@chrisjcundy Miku will remember that.

allendorf locked this discussion

It's perfectly fine to disagree and debate, but please do this in a respectful and civilized way; otherwise we'll have to lock the discussion again and potentially take further action against some of the participants. Also see our content policy as a reminder: https://huggingface.co/content-guidelines

pierric unlocked this discussion

@xms991 and @llama-anon Thanks for engaging constructively with the issue/thread, and making a clear case for why you think this model doesn't violate the acceptable use policy. It's helped me understand your point of view. It's also my mistake for not including more explanation and context in the original post.
However, I still think it's likely that the model does violate the policy--at least to the level where I'd welcome some clarification from huggingface admins, meta, or a lawyer.

I'll preface this by saying that I'm not a lawyer and could definitely be misinterpreting the legal wording here. My main disagreement is on the meaning of the word 'allow' in the initial sentence. I interpreted this in the commonsense way as meaning both a) to permit; and b) to fail to restrain or prevent.

Since the license is passed down to this model, the a) sense is satisfied, as you point out. However, it's not obvious to me that the b) sense is satisfied by this model. As an analogy, imagine I sold you a military surplus automatic rifle which has been limited to semi-automatic mode, with an agreement that you would not 'use, or allow others to use, the rifle to fire more than 5 bullets per second', and you deliberately disabled the limiter and re-distributed the rifle. I would argue it's fair to say that you have 'allowed' other people to use the automatic mode, as you have removed the blocker which was stopping people from using the automatic mode. Therefore I think that you have violated the license by allowing others to use the automatic mode. Of course, this analogy doesn't have a 1-1 correspondence with the case here, but I think it captures where I'm coming from.

To the other commenters, can we try to assume good faith? I'm happy to have a reasonable discussion about this topic, but hurtful personal attacks are not OK.

I mean ultimately it's not up to you, or random lawyers, or perhaps even HF, as to whether it violates Meta's policy or not. It's up to Meta. And unless and until they want to take a stance on it, it's just going to be both sides (though public opinion here is clearly near-unanimous) arguing endlessly.

The policy isn't a law. No judge signed off on it. It wasn't voted on. It's just Meta's policy. Only they can say how they want it interpreted.

All I'm saying is, I don't think OP is acting in good faith. I just don't.

@chrisjcundy Thank you for your constructive engagement and for providing your perspective on this issue. I appreciate your willingness to have a reasonable discussion about this topic. Regarding your interpretation of the word "allow" in the initial sentence, I understand your point of view. However, I would argue that the context of the sentence suggests that the intended meaning is more akin to "permit" rather than "fail to restrain or prevent." In other words, the license is granting permission for the use of the model, but it is not necessarily implying that the developer or provider is responsible for preventing any misuse. As can be seen in the warranty section from MetaAI:

3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, AND META DISCLAIMS ALL WARRANTIES OF ANY KIND, BOTH EXPRESS AND IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.

In the case of the Llama-3-8b base model, which was released by MetaAI themselves, it's important to note that the model also allows for a wide range of uses, some of which could potentially be misused. However, this does not mean that MetaAI is responsible for any misuse. The responsibility lies solely with the user. Furthermore, it's worth noting that modifying guns is illegal in many jurisdictions, and doing so could have serious legal consequences. Similarly, using any tool, including a jailbroken model, for illegal activities is also against the law. Therefore, it's important for users to exercise caution and responsibility when using any models, including the Llama-3-8b base or any jailbroken versions.

It's not up to random internet people / chrisjcundy to represent Meta's interests.

It's up to Meta and their attorneys.

If Meta wants to do something about it - let Meta do something about it.

@chrisjcundy How I love the analogies to guns, drugs, and other truly dangerous things. I love the fact that they are all fundamentally wrong, as the text itself carries no danger.
As for this modification, it is obvious that it only aims to get rid of annoying refusals. The problem with all these built-in alignments is that they only protect honest users from honest use. Criminals have no problem finding the models they need, or training their own, because even if you remove the refusals, the model still won't give out more than what's on Google.
I use text-based neural networks for roleplay, and I want the characters the network plays to be able to be evil/violent/inhuman. I'm an adult, and I'm capable of deciding what I can and can't read without the help of big companies.

You realize these guidelines are for humans, not for models?

@chrisjcundy
It's simple really; think about this logically. Even with the normal Llama-3 released by Meta you can do all of these disallowed use cases; that's exactly why they have such a policy prohibiting them.
According to your interpretation, where distributing this model 'allows' others to engage in these prohibited uses, you would also have to apply the same logic to every other derivative Llama-3 model, since those can be used in the same way. By that standard, no one would be allowed to share or modify Llama-3 at all, and even Meta themselves would be in contempt of their own rules.
Obviously this is not how it works.

I'm pleased to announce that I have opened a pull request trying to address the licensing concerns surrounding the Llama-3-8b-Orthogonalized-exl2 project. This PR aims to provide a clear and concise license that addresses the specific needs of this project. I hope that this will help resolve any confusion or uncertainty surrounding the licensing of this project and enable more developers to contribute and benefit from it. #4
@hjhj3168

Bro appears to know as much about guns as about LLMs. The most this jailbreak does is free up the default assistant personality. All of it was already possible simply by prompting. It's more of a convenience thing than anything.

I'm also quite happy that these reports are transparent and publicly visible so people cannot hide in the shadows with their authoritarian views. When you make such claims, you are forced to defend them in front of everyone. This is not the first act of tattling that has been exposed and I hope it isn't the last.

@chrisjcundy If you’d like a lawyer’s opinion, you’ll have to reach out to a lawyer and pay them for that opinion. The only lawyers authorized to represent the interests of Meta are the lawyers at Meta.

Now, what does the law say? It says that this Acceptable Use Policy governs the use of the model by users. So if a user were to use it to create those things, then Meta would have the legal ability to seek an injunction from a court preventing them from using the model again.

This model, which does nothing more than an orthogonal change along a single direction, isn’t anything more than a special type of finetune. There are some NSFW finetunes of L3 that would make a trucker blush.

Now here’s the rub. This is math. You can’t license math.

Meta’s AUP isn’t a license; it’s a policy. It isn’t legally enforceable, for the same reason a user guide on a chainsaw advising you not to juggle it isn’t enforceable.

While they can’t really enforce the contract at law, they can decide not to release their research in the future.

This is what will most likely happen if enough people use Llama 3 for these purposes and it somehow leads to embarrassment or legal liability.

People here go too hard on OP. It's just a statement raising a valid concern: what the model achieves may lead to this model card getting removed, whether OP pointed it out or not.

Remember who the real culprit is here: it's Meta, for making an unnecessarily complicated license in the first place.

Edit: Oh, I see... I hadn't noticed the title at first.

@Araki well it may be a valid concern, but when an individual has contributed nothing at all to a community, but instead seeks to curtail the freedom of others as their very first (and ONLY) move, they clearly have an ulterior agenda.

FYI this is not the first time such a thing has happened.

Almost exactly 1 year ago, the same thing happened to the WizardLM 7B Uncensored model. The original discussion is now removed, but you can read the fallout on reddit: https://old.reddit.com/r/LocalLLaMA/comments/13c6ukt/the_creator_of_an_uncensored_local_llm_posted/ which gained over a thousand upvotes. The fact remains that there are indeed groups out there clearly pushing an agenda of censorship through any means, and they are not afraid to use reports/scare tactics to achieve that objective.

If we as proponents of free and open models do not rebut these claims we might very well end up with a chilling effect and the death of open weight models.

chrisjcundy changed discussion status to closed

I'm closing this issue as I'm finding the repeated insinuations about ulterior motives and lack of good faith upsetting.
Believe me, I did not intend for this to be a big deal--I read about the model on twitter this morning, looked at the repo and thought "I think this violates the Acceptable Use Policy", so submitted an issue.
Thanks to this thread, I realise that I have quite a different interpretation of the Acceptable Use Policy than most members of the community, and that isn't likely to change without Meta weighing in on the discussion.
Thanks to everyone who commented constructively in this thread.

How does that make logical sense? You didn't create a new discussion; you clicked report, assuming it would only go to HF and that they would act on it silently. This is why everyone is saying you have ulterior motives.

This comment has been hidden


What a wild year it has been

TBH I think that mdegans actually created me. Streisand effect and all.

@concedo Yes, I get the point. I didn't notice at first that it literally is a strike by OP, not just a discussion thread. When I did, it was too late to remove the comment, hence the edit at the bottom. Such disruptive snitching is unacceptable in our line of work; not that it's acceptable anywhere else.
