Commit History

Add Llama tokenizer creation for Dutch, English, Code, Markdown and TeX.
c78da21

yhavinga committed on

Pin protobuf dep
7d83c88

yhavinga committed on

Hack some Dutch tokenizers into it
55df72d

yhavinga committed on

update
f331792

xu-song committed on

update
9558ae0

xu-song committed on

add compression leaderboard
1b7fc74

xu-song committed on

update compress rate
367a536

xu-song committed on

update compress rate
988921c

xu-song committed on

update
7d2062e

xu-song committed on

add grok mixtral
480ae5d

xu-song committed on

add compress rate
814ee6b

xu-song committed on

add zephyr
a6aee1d

xu-song committed on

requi
6b70021

xu-song committed on

requi
510279b

xu-song committed on

requi
7b522e7

xu-song committed on

config python version
11379e2

xu-song committed on

add xlm-roberta
057bc67

xu-song committed on

add amber and crystal_coder
5db13e0

xu-song committed on

add character glm
f0f84b2

xu-song committed on

update
f02dd94

xu-song committed on

fix PyO3PanicException
2461705

xu-song committed on

fix unicode error: 'unicodeescape' codec can't decode bytes in position 602-608: unknown Unicode character name
bce41d0

xu-song committed on

fix fastchat_t5_3b
c766a08

xu-song committed on

fix tiktoken special tokens
adcfb97

xu-song committed on

add aya
44c3329

xu-song committed on

fix olmo
2442c83

xu-song committed on

add olmo tokenizer
bbefe94

xu-song committed on

update
24b4aa5

xu-song committed on

update
1f833af

xu-song committed on

fix tiktoken
a6c67ec

xu-song committed on

fix gemma_7b
7011963

xu-song committed on

add gemma_7b
9c8ace5

xu-song committed on

add more tokenizer
5425d5d

xu-song committed on

fix tokenize
e6543ac

xu-song committed on

add more tokenizer
c75633b

xu-song committed on

update
6bdf6c6

xu-song committed on

update
9820e00

xu-song committed on

fix chatglm; new feature about add_special_tokens;
d27a756

xu-song committed on

update
a37f943

xu-song committed on

Merge branch 'main' of hf.co:spaces/eson/tokenizer-arena
0415b36

xu-song committed on

add more tokenizer
d2417c7

xu-song committed on

Update README.md
fab95c3

xu song committed on

update
ae282a4

xu-song committed on

add more tokenizers
a1b0cd0

xu-song committed on

add more tokenizer
3030d21

xu-song committed on

add more tokenizer
293bad6

xu-song committed on

fix moss
aa0c637

xu-song committed on

update
da93e39

xu-song committed on

update
2d550af

xu-song committed on

add skywork
c7ed4a2

xu-song committed on