Asking about the `tokenizer.model` file

by bibekyess - opened Oct 23, 2023

Hi! I was wondering how you obtained the `tokenizer.model` file. Other available Korean Llama base models, such as beomi/llama-2-ko-7b, used the fast tokenizer provided by the HF tokenizers library rather than the sentencepiece package, so their output doesn't contain a `tokenizer.model` file. Could you share how you obtained `tokenizer.model` from `tokenizer.json`?
Thank you for your help!
