rasyosef committed on
Commit
52c136b
1 Parent(s): 5b389e4

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -7,7 +7,7 @@ language:
  library_name: transformers
  ---
  # Amharic BPE Tokenizer
- This repo contains a **Byte-Pair Encoding** tokenizer trained on the **Amharic** subset of the [oscar](https://huggingface.co/datasets/oscar) dataset. It's the same as the GPT-2 tokenizer but trained from scratch on an amharic dataset with a **vocabulary size** of `24000`.
+ This repo contains a **Byte-Pair Encoding** tokenizer trained on the **Amharic** subset of the [oscar](https://huggingface.co/datasets/oscar) dataset. It's the same as the GPT-2 tokenizer but trained from scratch on an amharic dataset with a vocabulary size of `24000`.
 
  # How to use
  You can load the tokenizer from huggingface hub as follows.