Naseej/noon-7b

mobarmg committed on Jun 18, 2023

Commit d1c468b • 1 Parent(s): f51cac5

Update README.md

Files changed (1)

README.md +3 -0


@@ -19,6 +19,7 @@ widget:
   example_title: Question 2
 ---
 ### **Noon - a 7-billion parameter Arabic Large Language Model**
 We present the 7-billion parameter variant of **Noon**, an Arabic large language model based on **BLOOM**, a foundation model released by the [bigscience](https://huggingface.co/bigscience) workshop.
@@ -29,6 +30,8 @@ We trained the model using the ColossalAI framework which fully supports the Hug
 The training data is a combination of Arabic datasets covering multiple tasks; more details are provided in the dataset section.
+<img src="https://i.ibb.co/H4cztJj/noonllm.png"  width="20%" height="20%">
 Welcome to the "Noon" model card!
  "Noon" has more than 7 billion parameters, making it the largest Arabic language model released to date. The model was trained on more than 110,000 Arabic data records covering more than 11 million words, spanning text generation, code generation, solving math problems, and closed/open questions. It was trained using advanced training techniques such as distributed training across multiple GPUs, LoRA (Low Rank Adaptation), and ZeRO (Zero Redundancy Optimization).
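
The model card names LoRA (Low Rank Adaptation) as one of the training techniques but does not describe its configuration. As a rough illustration of the general idea only, the sketch below keeps a weight matrix frozen and adds a trainable low-rank update scaled by `alpha / r`. The sizes, the `alpha` value, and all function names here are hypothetical and chosen for readability; this is not the Noon training code.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the weight seen at inference time."""
    return add(w, scale(matmul(b, a), alpha / r))

random.seed(0)
d_out, d_in, r, alpha = 6, 8, 2, 4  # illustrative sizes, not taken from the model card

# Frozen pretrained weight (d_out x d_in); it receives no gradient updates.
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable low-rank factors: A is r x d_in (small random init), B is d_out x r (zero init).
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha, r)

full_params = d_out * d_in        # parameters updated by full fine-tuning
lora_params = r * (d_in + d_out)  # parameters updated by LoRA
print(full_params, lora_params)
```

Because `B` starts at zero, the adapted weight equals the pretrained weight before any training step, so fine-tuning begins from the base model's behaviour. The saving comes from training `r * (d_in + d_out)` values per adapted matrix instead of `d_out * d_in`, a gap that widens as the matrix grows relative to the rank.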