Can the model be used for commercial purposes?

#11
by AayushShah - opened

This is the first time I am integrating Hugging Face models for commercial use, and I am not sure whether it is allowed. I am aware that these models are open source, but if there is some dark side to this, could anyone please enlighten me?

Thank you 🤗

The training data lists Alpaca, which was created using OpenAI's APIs. Depending on your lawyer's view, this may or may not be acceptable. OpenAI's TOS prohibit using generated data in a competing product. Your lawyer will need to sort that out. I am not a lawyer. I am not your lawyer. This is not legal advice.

I agree that the legal question should be reviewed by lawyers; however, I am not sure that statement is 100% true for OpenAI's ChatGPT:

(a) Your Content. You may provide input to the Services (“Input”), and receive output generated and returned by the Services based on the Input (“Output”). Input and Output are collectively “Content.” As between the parties and to the extent permitted by applicable law, you own all Input. Subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms. OpenAI may use Content to provide and maintain the Services, comply with applicable law, and enforce our policies. You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms. - https://openai.com/policies/terms-of-use

Now, the people who curated the Alpaca dataset can say "don't use our data for any commercial purposes," and they have every right to do so. (These TOS say they own the content, both the input and the output, so they get to decide what to do with it.)

The same page says:

Restrictions. You may not ... use output from the Services to develop models that compete with OpenAI;

If you want to try a clean open-source model, you can try H2O.ai's 12-20B: https://github.com/h2oai/h2ogpt

We use only OIG and conversational OASST data, with none of the Alpaca data that all these open assistant models use.

@pseudotensor you seem to be pointing in the right direction. I think:

Restrictions. You may not ... use output from the Services to develop models that compete with OpenAI;

Answers the original question. I also think https://huggingface.co/mosaicml/mpt-7b-chat is not available for commercial use, for the same reason :)
