Is there any documentation within this leaderboard?
#1, opened by zhiminy
I cannot find any documentation in this Space. What is this leaderboard used for?
Thanks for your attention. A brief document has been added to the demo.
The leaderboard aims to evaluate tokenizer performance on different languages.
- A lower oov_ratio means fewer out-of-vocabulary tokens.
- A higher char/token means fewer words are segmented into subwords.
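As a rough illustration of the two metrics, here is a minimal sketch. It assumes oov_ratio is the share of output tokens that are the unknown token, and char/token is the number of characters in the text divided by the number of tokens. The leaderboard's exact formulas may differ. The names `UNK`, `toy_tokenize`, `oov_ratio`, and `chars_per_token` are hypothetical, and the whitespace tokenizer is only a stand-in for a real one.

```python
from typing import List, Set

UNK = "<unk>"  # hypothetical unknown-token marker, not taken from the leaderboard


def toy_tokenize(text: str, vocab: Set[str]) -> List[str]:
    """Stand-in tokenizer: split on whitespace, map unseen words to UNK."""
    return [word if word in vocab else UNK for word in text.split()]


def oov_ratio(tokens: List[str]) -> float:
    """Assumed definition: fraction of tokens that are the unknown token."""
    if not tokens:
        return 0.0
    return sum(tok == UNK for tok in tokens) / len(tokens)


def chars_per_token(text: str, tokens: List[str]) -> float:
    """Assumed definition: characters in the text divided by token count."""
    if not tokens:
        return 0.0
    return len(text) / len(tokens)


if __name__ == "__main__":
    vocab = {"the", "cat", "sat"}
    text = "the cat sat down"
    tokens = toy_tokenize(text, vocab)
    print("oov_ratio:", oov_ratio(tokens))
    print("char/token:", chars_per_token(text, tokens))
```

Under these assumed definitions, a tokenizer that covers a language well should score a low oov_ratio and a high char/token on text in that language.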
xu-song changed discussion status to closed