faisalq committed on
Commit
418b724
1 Parent(s): 84d397a

Update README.md

  1. README.md +4 -5
README.md CHANGED
@@ -1,10 +1,5 @@
 ---
 license: cc-by-nc-4.0
-language:
-- ar
----
-
----
 language:
 - ar
 tags:
@@ -17,6 +12,10 @@ widget:
 
 ---
 
+---
+
+---
+
 
 **SaudiBERT** is the first pre-trained large language model focused exclusively on Saudi dialect text. The model was pretrained on two large-scale corpora: the Saudi Tweets Mega Corpus (STMC), which contains +141 million tweets, and the Saudi Forum Corpus, which includes +70 million sentences collected from various Saudi online forums. The datasets comprise **26.3GB of text**. The code files along with the results are available on [repo](https://github.com/FaisalQarah/SaudiBERT).
 