Commit History

Add Llama tokenizer creation for Dutch, English, Code, Markdown and TeX.
c78da21

yhavinga committed on

Pin protobuf dep
7d83c88

yhavinga committed on

Hack some Dutch tokenizers into it
55df72d

yhavinga committed on

update
f331792

eson committed on

update
9558ae0

eson committed on

add compression leaderboard
1b7fc74

eson committed on

update compress rate
367a536

eson committed on

update compress rate
988921c

eson committed on

update
7d2062e

eson committed on

add grok mixtral
480ae5d

eson committed on

add compress rate
814ee6b

eson committed on

add zephyr
a6aee1d

eson committed on

requi
6b70021

eson committed on

requi
510279b

eson committed on

requi
7b522e7

eson committed on

config python version
11379e2

eson committed on

add xlm-roberta
057bc67

eson committed on

add amber and crystal_coder
5db13e0

eson committed on

add character glm
f0f84b2

eson committed on

update
f02dd94

eson committed on

fix PyO3PanicException
2461705

eson committed on

fix unicode error: 'unicodeescape' codec can't decode bytes in position 602-608: unknown Unicode character name
bce41d0

eson committed on

fix fastchat_t5_3b
c766a08

eson committed on

fix tiktoken special tokens
adcfb97

eson committed on

add aya
44c3329

eson committed on

fix olmo
2442c83

eson committed on

add olmo tokenizer
bbefe94

eson committed on

update
24b4aa5

eson committed on

update
1f833af

eson committed on

fix tiktoken
a6c67ec

eson committed on

fix gemma_7b
7011963

eson committed on

add gemma_7b
9c8ace5

eson committed on

add more tokenizer
5425d5d

eson committed on

fix tokenize
e6543ac

eson committed on

add more tokenizer
c75633b

eson committed on

update
6bdf6c6

eson committed on

update
9820e00

eson committed on

fix chatglm; new feature about add_special_tokens;
d27a756

eson committed on

update
a37f943

eson committed on

Merge branch 'main' of hf.co:spaces/eson/tokenizer-arena
0415b36

eson committed on

add more tokenizer
d2417c7

eson committed on

Update README.md
fab95c3

eson committed on

update
ae282a4

eson committed on

add more tokenizers
a1b0cd0

eson committed on

add more tokenizer
3030d21

eson committed on

add more tokenizer
293bad6

eson committed on

fix moss
aa0c637

eson committed on

update
da93e39

eson committed on

update
2d550af

eson committed on

add skywork
c7ed4a2

eson committed on