sail
/
scaling-with-vocab-trained-tokenizers
Sea AI Lab
tcftrees
Create README.md
47d7788
verified
7 months ago
Trained BPE tokenizers with various vocabulary sizes, used to study how vocabulary size affects the performance of language models.
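To make the role of vocabulary size concrete, here is a minimal, self-contained sketch of byte-pair encoding in plain Python. It is a toy illustration only and not the procedure used to train the tokenizers in this repository: the function names (`train_bpe`, `merge_pair`, `encode`), the sample corpus, and the vocabulary sizes are all hypothetical, and it assumes whitespace-split words with character-level initial symbols.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, vocab_size):
    """Learn BPE merges until the vocabulary reaches `vocab_size` or no pairs remain."""
    word_freqs = Counter(corpus.split())
    # Every word starts as a sequence of single characters.
    splits = {word: tuple(word) for word in word_freqs}
    vocab = {ch for word in word_freqs for ch in word}
    merges = []
    while len(vocab) < vocab_size:
        pair_counts = Counter()
        for word, freq in word_freqs.items():
            symbols = splits[word]
            for a, b in zip(symbols, symbols[1:]):
                pair_counts[(a, b)] += freq
        if not pair_counts:
            break
        # Most frequent pair; ties are broken deterministically by the pair itself.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        vocab.add(best[0] + best[1])
        for word in splits:
            splits[word] = merge_pair(splits[word], best)
    return merges


def encode(text, merges):
    """Tokenize `text` by applying the learned merges in training order."""
    tokens = []
    for word in text.split():
        symbols = tuple(word)
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


# Hypothetical sample data, chosen only for illustration.
corpus = "low low low lower lowest new newer newest wide wider widest"
text = "lower newest widest"

for size in (10, 15, 20, 30):
    merges = train_bpe(corpus, size)
    print(size, len(encode(text, merges)))
```

In this sketch a larger vocabulary never produces more tokens for the same text, because the merges learned for a smaller vocabulary are a prefix of those learned for a larger one. That trade-off between sequence length and vocabulary size is what a study like this one varies.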