from vocab.baichuan import tokenizer

# Encode the literal string "<pad>"; returns a list of token ids.
id1 = tokenizer.encode("<pad>")
# Decode a single token id back to its surface string.
token1 = tokenizer.decode(125696)