Amharic Tokenizer
Model Details
- Vocabulary Size: 100,000
- Tokenizer Type: Byte-Pair Encoding (BPE)
Model Description
- Developed by: Biniyam Ajaw
- Language(s) (NLP): Amharic and Amharic-Driven Languages
- License: MIT
Uses
The tokenizer can be loaded with the AutoTokenizer class from the transformers package and used to tokenize Amharic text.
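
A minimal sketch of that loading step, assuming the tokenizer files are published on the Hugging Face Hub or saved in a local directory. The repository ID shown in the comments is a hypothetical placeholder, since this card does not state one, and the helper names load_tokenizer and tokenize_text are illustrative rather than part of the model.

```python
def load_tokenizer(repo_id):
    """Load a tokenizer from a Hub repository ID or a local directory."""
    # Imported lazily so tokenize_text below can be used with any
    # tokenizer-like object, even where transformers is not installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def tokenize_text(tokenizer, text):
    """Return the subword tokens and their vocabulary IDs for `text`."""
    tokens = tokenizer.tokenize(text)  # subword strings, no special tokens added
    ids = tokenizer.convert_tokens_to_ids(tokens)  # matching vocabulary indices
    return tokens, ids


# Example usage (needs network access and the real repository ID):
#   tokenizer = load_tokenizer("<hub-user>/<tokenizer-repo>")  # hypothetical placeholder
#   tokens, ids = tokenize_text(tokenizer, "ሰላም ዓለም")
#   print(tokens, ids)
```

The exact tokens and IDs depend on the trained vocabulary, so it is worth checking the output on a few representative sentences before relying on it in a pipeline.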