Question about tokenizer

by LukeYang - opened Jun 3, 2022


May I ask how you generated the .spm files for the MarianTokenizer? I'm trying to train a google/sentencepiece model, but training only produces the .model and .vocab files. Should I convert them manually, or is there another method?
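For reference, a minimal sketch of one likely approach, not a confirmed account of how this repository's files were produced. As far as I know, MarianTokenizer loads source.spm and target.spm directly with SentencePiece, so the trained .model files can be copied under those names without any conversion. The helper name `stage_spm_files` and the paths are hypothetical. The vocab.json file that MarianTokenizer also expects is not covered here, and the SentencePiece .vocab file is not a drop-in replacement for it.

```python
import shutil
from pathlib import Path


def stage_spm_files(source_model, target_model, out_dir):
    """Copy trained SentencePiece .model files to the names MarianTokenizer looks for.

    Hypothetical helper: it assumes the .spm files are ordinary SentencePiece
    model files under a different extension, so a plain copy is enough.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    source_spm = out / "source.spm"
    target_spm = out / "target.spm"

    # No format conversion, just a byte-for-byte copy under the new name.
    shutil.copyfile(source_model, source_spm)
    shutil.copyfile(target_model, target_spm)
    return source_spm, target_spm


# Example call (paths are placeholders):
# stage_spm_files("src.model", "tgt.model", "my-marian-model")
```

If one SentencePiece model is shared by both languages, the same .model path can be passed for both arguments.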

LukeYang changed discussion status to closed Jun 6, 2022

LukeYang changed discussion status to open Jun 6, 2022

LukeYang changed discussion status to closed Jun 14, 2022
