update readme.md
README.md
CHANGED
@@ -18,7 +18,15 @@ tags:
 - **Vocabulary Size**: 32,000
 - **Total Number of Tokens**: 1,520,791
 - **Fertility Score**: 1.975
-- It supports Arabic Diacritization
+- It supports Arabic Diacritization
+-
+## Aranizer Collection Achieved State of the Art Arabic Tokenizer
+
+The Aranizer tokenizer has achieved state-of-the-art results on the [Arabic Tokenizers Leaderboard](https://huggingface.co/spaces/MohamedRashad/arabic-tokenizers-leaderboard) on Hugging Face. Below is a screenshot highlighting this achievement:
+
+<img src="./lb.png" alt="Screenshot showing the Aranizer Tokenizer achieving state of the art" width="800">
+
+
 
 ## How to Use the Aranizer Tokenizer
 