---
datasets:
- AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2
language:
- ar
base_model:
- UBC-NLP/ARBERTv2
pipeline_tag: fill-mask
---
<img src="./arab_icon2.png" alt="Model Logo" width="30%" height="30%" align="right"/>
**ARBERTv2_ArLAMA** is a transformer-based Arabic language model fine-tuned with a masked language modeling (MLM) objective. It incorporates knowledge from Knowledge Graphs (KGs) to enhance its understanding of semantic relations and improve its performance on a range of Arabic NLP tasks.
## Uses
### Direct Use
Filling masked tokens in Arabic text, particularly in contexts enriched with knowledge from KGs.
### Downstream Use
Can be further fine-tuned for Arabic NLP tasks that require semantic understanding, such as text classification or question answering.
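As a minimal sketch of such downstream use, the checkpoint can be loaded with a freshly initialized classification head. The two-label setup and the example sentence below are illustrative assumptions, not part of the released model:

```python
# Hedged sketch: loading ARBERTv2_ArLAMA for Arabic text classification.
# The number of labels (2) is a placeholder for your own task.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "AfnanTS/ARBERTv2_ArLAMA"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Replaces the MLM head with a randomly initialized classification head,
# which would then be trained on task-specific labeled data.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Forward pass on one example sentence ("The Arabic language is important").
inputs = tokenizer("اللغة العربية مهمة", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.shape)  # one row of scores per input, one column per label
```

The classification head is untrained at this point; fine-tuning on a labeled dataset (for example with the `transformers` `Trainer`) is still required before the logits are meaningful.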
## How to Get Started with the Model
```python
from transformers import pipeline
fill_mask = pipeline("fill-mask", model="AfnanTS/ARBERTv2_ArLAMA")
fill_mask("اللغة [MASK] مهمة جدا.")
```
## Training Details
### Training Data
Trained on the ArLAMA dataset, which is designed to represent Knowledge Graphs in natural language.
### Training Procedure
Continued pre-training of ARBERTv2 using a Masked Language Modeling (MLM) objective, integrating structured knowledge from Knowledge Graphs.
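The setup above can be sketched as follows. This is an illustrative reconstruction, not the exact training script: the masking probability and any hyperparameters are standard BERT-style defaults, not values reported for this model.

```python
# Hedged sketch of continued MLM pre-training from the ARBERTv2 checkpoint.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
)

# Start from the base model the card lists (UBC-NLP/ARBERTv2).
tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/ARBERTv2")
model = AutoModelForMaskedLM.from_pretrained("UBC-NLP/ARBERTv2")

# Standard BERT-style dynamic masking: 15% of tokens are masked per batch.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)
# The tokenized ArLAMA dataset
# (AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2) would then be passed,
# together with this collator, to a Trainer to continue pre-training.
```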