---
title: Tokenizer Arena
emoji: 
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 3.41.2
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference


TODO

  • Search bar

Statistics

vocabsize

  • Increasing the vocabulary size can improve the compression rate; the side effects are higher compute and memory costs (getting the most out of your tokenizer for pre-training and)

https://huggingface.co/spaces/yenniejun/tokenizers-languages