Why does the config show this is a LLaMA model?

#1
by tongyx361 - opened

The tokenizer is also a LLaMA tokenizer...

https://huggingface.co/peiyi9979/math-shepherd-mistral-7b-prm/blob/main/config.json

But the model config shows that the model's architecture is LlamaForCausalLM?

Hello, Mistral and LLaMA have the same model structure, so I used the LLaMA conversion script to convert the Mistral checkpoint from the DeepSeek framework to the Hugging Face framework.

There's no problem with the model type: LlamaForCausalLM works fine.
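For illustration, a minimal loading sketch (the repo ID is taken from the config link above; this works because the Mistral weights map onto the same module layout the LLaMA classes expect):

```python
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "peiyi9979/math-shepherd-mistral-7b-prm"

# Loads the Mistral checkpoint through the LLaMA model class,
# matching the "LlamaForCausalLM" architecture in config.json.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

print(model.config.architectures)  # expected: ['LlamaForCausalLM']
```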

The BOS and EOS IDs do have a bit of a problem, though.

They should be 1 and 2, not 100000 and 100001.

I've changed them.

However, Hugging Face's generate function should use the tokenizer's BOS and EOS IDs by default rather than 100000 and 100001, so the incorrect values had no effect.
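To double-check on your side, here is a small sketch (assuming the repo above; passing bos_token_id/eos_token_id to generate explicitly is belt-and-braces, since the tokenizer's defaults should already be correct):

```python
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "peiyi9979/math-shepherd-mistral-7b-prm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

# The tokenizer carries the correct special-token IDs.
print(tokenizer.bos_token_id, tokenizer.eos_token_id)  # expected: 1 2

# Passing the IDs explicitly makes generation independent of any
# stale bos/eos values left over in config.json.
inputs = tokenizer("1 + 1 =", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```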

Thanks for your patient explanation!

tongyx361 changed discussion status to closed
