	---
	license: apache-2.0
	language:
	- ar
	pipeline_tag: text-generation
	tags:
	- arabic
	- text-generation
	widget:
	- text: "أعلنت وزارة الحج في المملكة العربية السعودية"
	  example_title: "مثال ١"
	- text: "يبدو اليوم جميلا، سأقوم بتحضير"
	  example_title: "مثال ٢"
	- text: "إن التقنيات الحديثة"
	  example_title: "مثال ٣"
	---
	# ArabianGPT Model Overview

	## Disclaimer for the Use of Large Language Models (LLMs) for Text Generation

	<p style="color: red;">We disclaim all responsibility for any harm, inaccuracies, or inappropriate content generated by ArabianGPT-0.3B, and users engage with and apply the model's outputs at their own risk.</p>

	> Important Note: Currently, we offer only a raw pre-trained model. Our team is actively working on releasing instruction-tuned LLMs that are fine-tuned and augmented with RLHF (reinforcement learning from human feedback). The first set of pre-trained models has been made available for community exploration. We also have models fine-tuned for specific tasks such as summarization and sentiment analysis, but they are still in development.


	## Introduction
	ArabianGPT-0.3B, developed under the ArabianLLM initiatives, is a specialized GPT-2 model optimized for Arabic language modeling.
	It's a product of the collaborative efforts at Prince Sultan University's Robotics and Internet of Things Lab, focusing on enhancing natural language modeling and generation in Arabic.
	This model represents a significant stride in LLM research, specifically addressing the linguistic complexities and nuances of the Arabic language.

	## Key Features
	- Architecture: GPT-2
	- Model Size: 345 million parameters
	- Layers: 24
	- Model Attention Layers (MAL): 16
	- Context Window Size: 1024 tokens
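
As a rough sanity check on the figures above, the parameter count of a GPT-2-style model can be estimated from its shape. The sketch below is not taken from the ArabianGPT codebase: `estimate_gpt2_params` is a hypothetical helper, and the hidden width (1024, the GPT-2 medium value) and the vocabulary size (64,000, suggested by the name of the Aranizer 64K tokenizer) are assumptions, since neither is stated in this card. It also assumes the standard GPT-2 layout with tied input and output embeddings.

```python
def estimate_gpt2_params(vocab_size: int, n_positions: int, d_model: int, n_layer: int) -> int:
    """Estimate the parameter count of a standard GPT-2 model with tied embeddings."""
    token_embeddings = vocab_size * d_model
    position_embeddings = n_positions * d_model
    # Per block: attention (QKV + output projection) and MLP weights total 12*d^2;
    # their biases total 9*d, and the two LayerNorms add 4*d.
    per_block = 12 * d_model**2 + 13 * d_model
    final_layer_norm = 2 * d_model
    return token_embeddings + position_embeddings + n_layer * per_block + final_layer_norm


# Assumed values: vocab_size and d_model are NOT stated in the model card.
estimate = estimate_gpt2_params(vocab_size=64_000, n_positions=1024, d_model=1024, n_layer=24)
print(f"Estimated parameters: {estimate / 1e6:.0f}M")
```

Under these assumptions the estimate lands in the hundreds of millions, the same order of magnitude as the stated model size. The exact number shifts with the true vocabulary size and hidden width, so treat it as a plausibility check, not a restatement of the official figure.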

	## Training
	- Dataset: Scraped text comprising scientific articles and general texts
	- Data Size: 23 GB
	- Tokenizer: Aranizer 64K
	- Tokens: Over 3.3 billion
	- Hardware: 4 NVIDIA A100 GPUs
	- Training Duration: 45 days
	- Performance: loss of 3.82
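
Two derived quantities can help put these numbers in context. The sketch below assumes the reported loss is a mean cross-entropy in nats per token (the usual convention for GPT-2 training, but not confirmed by this card), in which case perplexity is its exponential. It also assumes 1 GB = 10^9 bytes and that the token count refers to the same 23 GB corpus.

```python
import math

loss = 3.82          # reported loss (assumed: cross-entropy in nats per token)
data_bytes = 23e9    # 23 GB, assuming 1 GB = 10**9 bytes
num_tokens = 3.3e9   # "over 3.3 billion" tokens, so this is a lower bound

# Perplexity is the exponential of the per-token cross-entropy.
perplexity = math.exp(loss)
# Average raw bytes per token; an upper bound because num_tokens is a lower bound.
bytes_per_token = data_bytes / num_tokens

print(f"Perplexity: {perplexity:.1f}")
print(f"Bytes per token: {bytes_per_token:.1f}")
```

If either assumption does not hold (for example, if the loss is reported in bits or the token count covers multiple passes over the data), these values do not apply.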


	## Role in ArabianLLM Initiatives
	ArabianGPT-0.3B is crucial for advancing Arabic language processing, addressing challenges unique to Arabic morphology and dialects.

	## Usage
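	The model is suitable for Arabic text generation tasks. Example usage with the Transformers `pipeline` API:
	```python
	from transformers import pipeline

	# Load the text-generation pipeline; max_new_tokens caps the length of the continuation.
	pipe = pipeline("text-generation", model="riotu-lab/ArabianGPT-03B", max_new_tokens=512)

	# Arabic prompt taken from the widget examples ("Modern technologies ...").
	text = "إن التقنيات الحديثة"
	result = pipe(text)
	print(result[0]["generated_text"])
	```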
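
The example requests up to 512 new tokens while the context window is 1024 tokens, so the prompt and the continuation have to share that budget. A minimal sketch of trimming an over-long prompt follows; `fit_prompt` is a hypothetical helper (not part of Transformers), it operates on a plain list of token IDs, and keeping the most recent tokens is only one possible truncation policy.

```python
def fit_prompt(token_ids, context_window=1024, max_new_tokens=512):
    """Keep the most recent prompt tokens so prompt + generation fits the context window."""
    budget = context_window - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context window")
    # Left-truncate: drop the oldest tokens and keep the last `budget` ones.
    return token_ids[-budget:]


# Stand-in token IDs for illustration; real IDs would come from the model's tokenizer.
prompt_ids = list(range(600))
trimmed = fit_prompt(prompt_ids)
print(len(trimmed))
```

Prompts that already fit within the budget are returned unchanged.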

	## Limitations and Ethical Considerations

	- As a raw pre-trained model, it may misinterpret context or produce incoherent or inaccurate text in some scenarios.
	- Use the model ethically, and do not use it to spread misinformation or harmful content.

	## Acknowledgments

	Special thanks to Prince Sultan University, particularly the Robotics and Internet of Things Lab.

	## Contact Information

	For inquiries: [riotu@psu.edu.sa](mailto:riotu@psu.edu.sa).
