Better Tokenization (Uses the correct token for padding)

#6

Created a better tokenizer from the SentencePiece model. Please test it.

Thank you! I have confirmed this fixed the issue with padding.

jbochi changed pull request status to merged
jbochi changed pull request title from Better Tokenization to Better Tokenization (Uses the correct token for padding)
