
Space: eson/tokenizer-arena

tokenizer-arena / vocab / glm_chinese

1 contributor

History: 2 commits

Latest commit by eson: add compress rate (814ee6b, 2 months ago)

Name                      Size       Last commit
chinese_sentencepiece/    -          update (10 months ago)
README.md                 487 Bytes  update (10 months ago)
__init__.py               1.34 kB    add compress rate (2 months ago)
convert_vocab_to_txt.py   689 Bytes  update (10 months ago)
file_utils.py             8.38 kB    update (10 months ago)
glm_chinese.vocab.txt     659 kB     update (10 months ago)
sp_tokenizer.py           4.67 kB    update (10 months ago)
test.py                   115 Bytes  add compress rate (2 months ago)
test_glm.py               2.5 kB     update (10 months ago)
tokenization.py           51.9 kB    update (10 months ago)
tokenization_gpt2.py      13.5 kB    update (10 months ago)
utils.py                  213 Bytes  update (10 months ago)
wordpiece.py              15.5 kB    update (10 months ago)