Today, we are thrilled to officially launch the "2A2I" Arabic Artificial Intelligence Initiative. This is a community-driven initiative founded on the philosophy of "Small team, Big work". Our goal is to elevate Arabic AI (LLMs, Diffusion Models, ASR, etc.) to the same level as English (and also Chinese).
Naturally, our focus today is primarily on datasets. We aim to provide high-quality datasets, especially for LLMs this month, to support our future efforts. In line with this, we're excited to introduce the Arabic version of H4-no_robots, available here: 2A2I/H4_no_robots (and yes, we know it's not "no_robots" anymore). Stay tuned for more exciting, high-quality datasets in the next couple of weeks (4 million+ rows).
In parallel, we're also developing a model that we hope will set new high standards for Arabic LLMs. This model is planned for release in the coming months.
If you're interested in Arabic AI and want to help push the wheel as well, fill out this form and let us know your motivation and your exciting ideas.
If you have any questions, feel free to reach out to us at the email address below.
Additionally, if you believe in this mission as we do and would like to help this community by contributing compute resources or any other form of help you can think of, please contact us at the same email address below or reach out to me through LinkedIn.