Added flag
README.md
CHANGED
@@ -6,7 +6,7 @@ tags: []
 # culturay_el_32000
 
 ## About
-
+🇬🇷 A Greek tokenizer, trained on the Greek (el) subset of the [CulturaY dataset](https://huggingface.co/datasets/ontocord/CulturaY).
 
 ## Description
 This is a **character-level** Modern Greek (el) tokenizer, trained on the corresponding subset of CulturaY. It has a vocabulary size of 32,000 ([multiple of 128](https://x.com/karpathy/status/1621578354024677377)), which makes it fast for integration in models.
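
The multiple-of-128 remark in the description can be checked with simple arithmetic. The following is a minimal sketch that assumes only the 32,000 figure quoted in the README; the helper `round_up_to_multiple` is a hypothetical name used for illustration and is not part of the tokenizer or of any library.

```python
def round_up_to_multiple(n: int, multiple: int = 128) -> int:
    """Return the smallest multiple of `multiple` that is >= n (hypothetical helper)."""
    return ((n + multiple - 1) // multiple) * multiple


# Vocabulary size as stated in the README description.
VOCAB_SIZE = 32_000

# True when the vocabulary size is already a multiple of 128.
is_aligned = VOCAB_SIZE % 128 == 0

# Size a vocabulary would be padded to if it were not already aligned.
padded_size = round_up_to_multiple(VOCAB_SIZE)

print(is_aligned, padded_size)
```

Since 32,000 = 128 × 250, the stated size needs no padding. A vocabulary of 32,001 entries would instead round up to 32,128 under the same helper.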