isrouush committed on
Commit
aee9afe
1 Parent(s): 1e4eb38

Update README.md

Files changed (1)
  1. README.md +13 -0
README.md CHANGED
@@ -70,6 +70,8 @@ print(response)
70
 
71
  ### **Training's computational requirements**
72
 
73
  ### **Dataset**
74
 
75
  To ensure the diversity of data points and satisfy our purpose of instruction-tuning, we collected, labeled, filtered, and reviewed a set of datasets, each tailored to specific instruction types.
@@ -84,6 +86,17 @@ Noting that all the datasets are in Arabic, they comprise:
84
 
85
  The full dataset adds up to over **110K** records.
86
 
87
  ### **Disclaimer**
88
 
89
  The generated responses from this AI model are purely algorithmic and should be interpreted with caution. The model's outputs may occasionally exhibit bias, offensive language, or potentially harmful content. It is important to note that these responses do not reflect the personal preferences or viewpoints of the authors or the organization of Naseej.
 
70
 
71
  ### **Training's computational requirements**
72
 
73
+ Noon-7b was trained on 8 A100 GPUs using distributed multi-GPU training via the [ColossalAI](https://github.com/hpcaitech/ColossalAI) framework.
74
+
75
  ### **Dataset**
76
 
77
  To ensure the diversity of data points and satisfy our purpose of instruction-tuning, we collected, labeled, filtered, and reviewed a set of datasets, each tailored to specific instruction types.
 
86
 
87
  The full dataset adds up to over **110K** records.
88
 
89
+ ### **Evaluation**
90
+
91
+ Noon-7b was automatically evaluated on a set of over 4,000 Arabic data samples using **OpenAI's [GPT-3.5 Turbo](https://platform.openai.com/docs/models)** model.
92
+
93
+ Given clear, carefully crafted evaluation criteria (aligned with the model's training objective and with the syntactic and grammatical rules of the Arabic language), GPT-3.5 Turbo was prompted to rate each of Noon's responses to an input instruction on a scale of **1 to 5**.
94
+
95
+ We concluded the evaluation by averaging these scores, which yielded a final score of **4.07/5**.
96
+
97
+ **NOTE:** We acknowledge that this evaluation framework is not an exact solution and that automatic evaluation remains an ongoing area of research. Even so, we believe it can approximate human assessments to a reasonably satisfactory extent.
98
+
99
+
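The score aggregation described in the Evaluation section can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function name `average_score` and the example scores are hypothetical, and the step that prompts the judge model is omitted because the README does not describe the prompt or the API calls used.

```python
from statistics import mean


def average_score(scores, low=1, high=5):
    """Average per-response judge scores after checking each is on the scale.

    `scores` is assumed to hold one numeric rating per evaluated response.
    """
    if not scores:
        raise ValueError("no scores to average")
    for s in scores:
        if not low <= s <= high:
            raise ValueError(f"score {s} is outside the {low}-{high} scale")
    return round(mean(scores), 2)


# Hypothetical ratings for five responses, NOT the actual evaluation data.
example_scores = [5, 4, 4, 3, 5]
print(f"{average_score(example_scores)}/5")
```

Validating the range before averaging guards against malformed judge outputs, such as a rating of 0 or 6, silently skewing the final figure.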
100
  ### **Disclaimer**
101
 
102
  The generated responses from this AI model are purely algorithmic and should be interpreted with caution. The model's outputs may occasionally exhibit bias, offensive language, or potentially harmful content. It is important to note that these responses do not reflect the personal preferences or viewpoints of the authors or the organization of Naseej.