riotu-lab committed on
Commit
c10c255
1 Parent(s): 1000026

Update README.md

  1. README.md +55 -0
README.md CHANGED
@@ -1,3 +1,58 @@
1
  ---
2
  license: apache-2.0
3
+ language:
4
+ - ar
5
+ pipeline_tag: text-generation
6
+ tags:
7
+ - arabic
8
+ - text-generation
9
  ---
10
+ # ArabianGPT Model Overview
11
+
12
+ ## Introduction
13
+ ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling.
14
+ It's a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focusing on enhancing natural language modeling and generation in Arabic.
15
+ This model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language.
16
+
17
+ ## Key Features
18
+ - **Architecture**: GPT-2
19
+ - **Model Size**: 345 million parameters
20
+ - **Layers**: 24
21
+ - **Model Attention Layers (MAL)**: 16
22
+ - **Context Window Size**: 1024 tokens
23
+
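The figures in Key Features can be sanity-checked with a rough parameter count. This is a minimal sketch, not the authors' method: it assumes a standard GPT-2 block layout with tied input/output embeddings, a hidden size of 1024 (not stated in the model card), and a vocabulary of 64,000 entries (one reading of "64K"). The function name `gpt2_param_count` is hypothetical.

```python
def gpt2_param_count(n_layer, d_model, vocab_size, n_positions):
    """Count parameters of a GPT-2 style decoder with tied embeddings."""
    # Per block: 12*d^2 weights (attention 4*d^2, MLP 8*d^2)
    # plus 13*d bias and LayerNorm parameters.
    per_block = 12 * d_model**2 + 13 * d_model
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d_model + n_positions * d_model
    # Final LayerNorm (scale and shift).
    final_layer_norm = 2 * d_model
    return n_layer * per_block + embeddings + final_layer_norm


# Values from the model card: 24 layers, 1024-token context window.
# ASSUMPTIONS (not in the model card): hidden size 1024, vocabulary 64,000.
HIDDEN_SIZE = 1024
VOCAB_SIZE = 64_000

total = gpt2_param_count(n_layer=24, d_model=HIDDEN_SIZE,
                         vocab_size=VOCAB_SIZE, n_positions=1024)
print(f"{total:,} parameters")
```

Under these assumptions the count comes to about 369 million, so the stated 345 million may reflect a different hidden size, vocabulary size, or counting convention.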
24
+ ## Training
25
+ - **Dataset**: C4, Twitter, Wiki
26
+ - **Data Size**: 23 GB
27
+ - **Tokenizer**: Aranizer 64K
28
+ - **Tokens**: Over 3.3 billion
29
+ - **Hardware**: 4 NVIDIA A100 GPUs
30
+ - **Training Duration**: 45 days
31
+ - **Performance**: loss of 3.88
32
+
33
+
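Two quick derived figures can help put the training numbers in context. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats (the model card does not say which loss it is) and that "GB" means 10^9 bytes; both are assumptions, and the variable names are illustrative.

```python
import math

# Values from the model card.
loss = 3.88             # reported loss
data_bytes = 23e9       # 23 GB, ASSUMING 1 GB = 10^9 bytes
num_tokens = 3.3e9      # "over 3.3 billion" tokens (lower bound)

# ASSUMPTION: the loss is mean token-level cross-entropy in nats,
# in which case perplexity is exp(loss).
perplexity = math.exp(loss)

# Average raw bytes covered by one token of the tokenizer.
bytes_per_token = data_bytes / num_tokens

print(f"perplexity ~ {perplexity:.1f}")
print(f"bytes per token ~ {bytes_per_token:.2f}")
```

If the loss was measured differently (for example in bits, or on a held-out set), the perplexity figure does not apply as computed here.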
34
+ ## Role in ArabianLLM Initiatives
35
+ ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.
36
+
37
+ ## Usage
38
+ The model is suited to Arabic text generation tasks. Example usage with the Transformers `pipeline` API:
39
+ ```python
40
+ from transformers import pipeline
41
+
42
+ pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)
43
+ text = ''  # replace with an Arabic prompt
44
+ print(pipe(text)[0]["generated_text"])
45
+ ```
46
+
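With a 1024-token context window and `max_new_tokens=512`, the prompt can occupy at most 512 tokens. The helper below is a minimal sketch of trimming an over-long prompt to that budget; `fit_prompt` is a hypothetical name, and keeping the most recent tokens (rather than the first ones) is a design choice assumed here, not something the model card specifies.

```python
def fit_prompt(token_ids, context_window=1024, max_new_tokens=512):
    """Keep only the last tokens of a prompt so generation fits the context."""
    budget = context_window - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Negative slicing keeps the most recent `budget` tokens;
    # shorter prompts are returned unchanged.
    return token_ids[-budget:]


# Example with dummy token ids standing in for a tokenized prompt.
long_prompt = list(range(600))
trimmed = fit_prompt(long_prompt)
print(len(trimmed))
```

In practice the token ids would come from the model's tokenizer, and the trimmed ids would be decoded or passed to generation directly.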
47
+ ## Limitations and Ethical Considerations
48
+
49
+ - The model may misinterpret context or generate low-quality text in some scenarios.
50
+ - Use the model ethically: do not use it to produce or spread misinformation or harmful content.
51
+
52
+ ## Acknowledgments
53
+
54
+ Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.
55
+
56
+ ## Contact Information
57
+
58
+ For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa).