## The vocab_size inconsistency problem
- `tokenizer.vocab_size`
  - Size of the base vocabulary (without the added tokens)
  - From https://huggingface.co/transformers/v2.11.0/main_classes/tokenizer.html
- `len(tokenizer)`
  - Size of the full vocabulary, including the added tokens
  - https://github.com/huggingface/transformers/issues/12632
- `max(tokenizer.get_vocab().values())`
  - The largest token id; it can exceed `len(tokenizer) - 1` because token ids may be non-contiguous
  - https://github.com/huggingface/transformers/issues/4875
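The three measures above can all disagree. A minimal pure-Python sketch (no `transformers` dependency; the vocab contents are hypothetical) of how added tokens with a non-contiguous id make them diverge:

```python
# Hypothetical base vocab (ids 0-4) plus one added token whose id
# skips 5, mimicking a tokenizer with non-contiguous token ids.
base_vocab = {"<pad>": 0, "hello": 1, "world": 2, "foo": 3, "bar": 4}
added_tokens = {"<my_special>": 6}  # id 5 was never assigned

full_vocab = {**base_vocab, **added_tokens}

vocab_size = len(base_vocab)        # analogous to tokenizer.vocab_size -> 5
full_size = len(full_vocab)         # analogous to len(tokenizer)       -> 6
max_id = max(full_vocab.values())   # analogous to max(tokenizer.get_vocab().values()) -> 6

# An embedding matrix sized by len(tokenizer) (6 rows) could not index
# token id 6; it must have max_id + 1 rows to be safe.
print(vocab_size, full_size, max_id + 1)
```

This is why sizing a model's embedding layer by `vocab_size` or `len(tokenizer)` alone can cause out-of-range index errors; `max(...) + 1` is the only bound that accounts for gaps in the id space.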