how did you convert `transformers.PreTrainedTokenizer` to ggml format?
#2, opened by keunwoochoi
Can you share how you did it? I am trying to convert my custom language model to ggml, but I also use a `tokenizers.Tokenizer` that I trained on my own corpus.
I could get `merges.txt` and `vocab.json`, but I don't know how to convert them to a `tokenizer.model` file, which seems to be the only format the ggml converter is compatible with.
thanks!
You need to add support for your model architecture to ggml yourself; see the examples at https://github.com/ggerganov/ggml/tree/master/examples
There is no magic recipe. You can also look at https://github.com/OpenNMT/CTranslate2 as an alternative.
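For the tokenizer part specifically, here is a minimal sketch of what a hand-written converter might do with `vocab.json`. It assumes the file is a flat JSON object mapping token strings to contiguous integer ids starting at 0. The function names `serialize_vocab` and `convert_vocab_file` and the binary layout (an int32 count, then a length-prefixed byte string per token) are illustrative assumptions, not a format defined by ggml; whatever you write has to match the loader you implement for your architecture.

```python
import json
import struct


def serialize_vocab(vocab: dict[str, int]) -> bytes:
    """Pack a token -> id mapping into a flat binary blob (hypothetical layout)."""
    # Order tokens by id; with contiguous ids 0..N-1 (assumed here),
    # a token's position in the blob equals its id.
    tokens = sorted(vocab, key=vocab.get)
    out = bytearray(struct.pack("<i", len(tokens)))  # int32: vocabulary size
    for token in tokens:
        data = token.encode("utf-8")
        out += struct.pack("<i", len(data))  # int32: byte length of this token
        out += data                          # UTF-8 bytes of the token string
    return bytes(out)


def convert_vocab_file(vocab_path: str, out_path: str) -> None:
    """Read a vocab.json file and write the packed vocabulary to out_path."""
    with open(vocab_path, encoding="utf-8") as f:
        vocab = json.load(f)
    with open(out_path, "wb") as f:
        f.write(serialize_vocab(vocab))
```

Calling `convert_vocab_file("vocab.json", "vocab.bin")` would write the packed vocabulary to a hypothetical `vocab.bin`. This sketch leaves out `merges.txt`, which the loader would also need in order to apply BPE at inference time. If the tokenizer is byte-level BPE, the token strings in `vocab.json` use a remapped printable alphabet, so they would likely need to be mapped back to raw bytes before writing.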