Gül Sena Altıntaş

metadata
title: Quick Tokenizer Accuracy
emoji: 🐢
colorFrom: red
colorTo: purple
sdk: gradio
sdk_version: 5.37.0
app_file: app.py
pinned: false
license: apache-2.0

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

To set up a local environment with uv:

uv venv --python 3.10
source .venv/bin/activate
uv pip install -r requirements.txt

Citation

If you use this Space, please cite:

@article{toksuite2025,
  title={TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior},
  author={Altıntaş, Gul Sena and Ehghaghi, Malikeh and Lester, Brian and Liu, Fengyuan and Zhao, Wanru and Ciccone, Marco and Raffel, Colin},
  year={2025},
  arxiv={arxiv.org/abs/2512.20757}
}