pino-bigbird-roberta-base / create_pt_tokenizer.py
dat
add pt tokenizer
7398222
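from transformers import AutoTokenizer, AddedToken
# Load the tokenizer files already present in the current directory.
tokenizer = AutoTokenizer.from_pretrained("./")
# Re-register <mask> with lstrip=True so it absorbs whitespace on its left side.
tokenizer.mask_token = AddedToken("<mask>", lstrip=True)
# Write the updated tokenizer files back to the same directory.
tokenizer.save_pretrained("./")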
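
The `lstrip=True` flag above makes the mask token absorb whitespace immediately to its left. The sketch below is a toy illustration of that behavior using only the standard library. It is not the library's implementation, and the helper name `split_on_special` is hypothetical.

```python
import re


def split_on_special(text, token="<mask>", lstrip=False):
    """Toy model (an assumption, not library code): split text around a special token.

    With lstrip=True, whitespace directly left of the token is consumed
    together with the token instead of staying in the preceding piece.
    """
    pattern = (r"\s*" if lstrip else "") + re.escape(token)
    pieces = []
    pos = 0
    for match in re.finditer(pattern, text):
        if match.start() > pos:
            pieces.append(text[pos:match.start()])
        pieces.append(token)
        pos = match.end()
    if pos < len(text):
        pieces.append(text[pos:])
    return pieces


sentence = "Paris is the <mask> of France."
print(split_on_special(sentence, lstrip=False))
print(split_on_special(sentence, lstrip=True))
```

Without `lstrip`, the space before `<mask>` stays attached to the preceding text; with it, the space is consumed along with the token, so `"the <mask>"` and `"the<mask>"` split the same way.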