License and training data

#57
by Shitqq - opened

Would you consider changing the license to an open license and sharing the training data? Calling Llama 3 open source while it ships under a bespoke commercial license is a joke. While it is good to be more open than other proprietary models, this is far from fully open (no training data, no training code, a custom license, etc.).

Transparency and accurate representation of an AI model's openness are crucial. However, a concerning trend has emerged: certain models claim to be "open" despite falling short of openness standards because of restricted access to training data and training code, the use of custom licenses, and so on.

Meta seems to be open-washing. Llama 3 has been marketed as "open", but this claim is misleading. While it offers some level of transparency, there are significant concerns:

Restricted Training Data Access: The developers have not provided full access to training data, which is crucial for understanding the model's limitations and potential biases.

Custom License Restrictions: The bespoke commercial license is incompatible with open-source principles and hinders collaboration and improvement.

These misleading openness claims could create a false sense of transparency, potentially misleading users about the model's development and vetting processes.

Please fix this issue by doing the following:

- Standard Open Licenses: Adopt a widely recognized open-source license.

- Transparency in Training Data: Fully disclose the training data, along with documentation, which is essential for understanding the model's foundation.
