Can this be trained?
Like any other Hugging Face tokenizer, this one looks like it can be trained. Before trying, I wanted to check whether this implementation has any caveats: if someone trained it on the exact same dataset used to create the GPT-4o tokenizer, would they still end up with a different tokenizer?
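To make the concern concrete, here is a toy sketch in plain Python. It is not this repo's code and not the trainer used by the `tokenizers` library; `train_bpe`, `split_on_space`, and `keep_leading_space` are hypothetical names made up for illustration, and the tie-breaking rule is an assumption. It only shows why I suspect the dataset alone may not pin down the result: the same corpus with the same settings gives the same merges, but changing just the pre-tokenization rule changes them.

```python
import re
from collections import Counter


def train_bpe(corpus, num_merges, pretokenize):
    """Toy BPE trainer (illustrative only): returns the ordered merge list."""
    # Each pre-token starts out as a tuple of single characters.
    words = Counter(tuple(tok) for text in corpus for tok in pretokenize(text))
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by pre-token frequency.
        pairs = Counter()
        for word, freq in words.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins. Ties are broken lexicographically here,
        # which is an assumption; real trainers may break ties differently.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Apply the chosen merge to every pre-token.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges


def split_on_space(text):
    # Hypothetical rule 1: drop whitespace entirely.
    return text.split()


def keep_leading_space(text):
    # Hypothetical rule 2: attach the preceding space to each word.
    return re.findall(r" ?\S+", text)


corpus = ["to be or not to be to go to do"]

merges_a = train_bpe(corpus, 2, split_on_space)
merges_a_again = train_bpe(corpus, 2, split_on_space)
merges_b = train_bpe(corpus, 2, keep_leading_space)

print(merges_a)  # [('t', 'o'), ('b', 'e')]
print(merges_b)  # [('t', 'o'), (' ', 'to')]
```

So my question is really whether the pre-tokenizer, normalizer, special tokens, and trainer settings shipped here match whatever was used originally closely enough that retraining on the same data would reproduce the same merges.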