Where to find the token IDs of the tokenizer?

#22
by Mohamed123321 - opened

Hello,
I was wondering how I can access and change the tokenizer's token IDs?
Thanks!


I should add that I mean the mapping between tokens (subword pieces) and IDs.

Google org

Hey! The tokenizer is based on SentencePiece by default. You can't really change the existing token-to-ID mapping, but you can add new tokens with add_tokens and inspect the full vocabulary with tokenizer.get_vocab().
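A minimal sketch of the two methods mentioned above. Loading the actual Gemma tokenizer requires downloading the (gated) model files, so this example builds a tiny word-level tokenizer with the `tokenizers` library instead; the vocabulary and token names here are made up for illustration, but `get_vocab()`, `add_tokens()`, and `token_to_id()` are the same methods exposed by Hugging Face tokenizers.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel

# Toy vocabulary: an explicit token -> ID mapping (illustrative only).
vocab = {"<unk>": 0, "hello": 1, "world": 2}
tok = Tokenizer(WordLevel(vocab, unk_token="<unk>"))

# get_vocab() returns the token -> ID mapping as a dict.
print(tok.get_vocab())

# You can't remap existing IDs, but add_tokens() appends new tokens,
# which receive the next free IDs.
tok.add_tokens(["gemma"])
print(tok.token_to_id("gemma"))
```

Note that when using a `transformers` tokenizer together with a model, adding tokens means the model's embedding matrix must grow to match: call `model.resize_token_embeddings(len(tokenizer))` after `tokenizer.add_tokens(...)`.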
