kobert / tokenizer_config.json
use BERT wordpiece Tokenizer
372ec67
{
  "do_lower_case": true,
  "do_basic_tokenize": false,
  "unk_token": "[UNK]",
  "sep_token": "[SEP]",
  "pad_token": "[PAD]",
  "cls_token": "[CLS]",
  "mask_token": "[MASK]"
}
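
The keys in this file match keyword arguments accepted by the `BertTokenizer` class in the `transformers` library, which fits the commit message. As a minimal sketch of how the file could be inspected, the snippet below parses it with Python's standard `json` module only. The names `CONFIG_TEXT` and `load_tokenizer_config` are hypothetical and are not part of this repository.

```python
import json

# Contents of tokenizer_config.json, copied verbatim from the file above.
CONFIG_TEXT = """
{
"do_lower_case": true,
"do_basic_tokenize": false,
"unk_token": "[UNK]",
"sep_token": "[SEP]",
"pad_token": "[PAD]",
"cls_token": "[CLS]",
"mask_token": "[MASK]"
}
"""


def load_tokenizer_config(text):
    """Parse the config and split it into boolean flags and special tokens.

    Hypothetical helper for illustration; not an API of any library.
    """
    config = json.loads(text)
    # Boolean entries are behaviour switches (lowercasing, basic tokenization).
    flags = {k: v for k, v in config.items() if isinstance(v, bool)}
    # Entries whose key ends in "_token" name the special tokens.
    special_tokens = {k: v for k, v in config.items() if k.endswith("_token")}
    return flags, special_tokens


flags, special_tokens = load_tokenizer_config(CONFIG_TEXT)
print(flags)
print(special_tokens)
```

Separating the flags from the special tokens makes it easy to confirm that all five BERT-style special tokens are defined before the config is handed to a tokenizer.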