Why does the config show this is a LLaMA model?

#1
by tongyx361 - opened

The tokenizer is also a LLaMA tokenizer...

https://huggingface.co/peiyi9979/math-shepherd-mistral-7b-prm/blob/main/config.json

But the model config shows that the model's architecture is LlamaForCausalLM?

Hello, Mistral and LLaMA have the same model structure, so I used the LLaMA conversion script to convert the Mistral checkpoint from the DeepSeek framework to the Hugging Face framework.

There's no problem with the model type: LlamaForCausalLM works fine.
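For illustration, a minimal loading sketch (the repo ID is taken from the config link above; this works because the Mistral weights map onto the same module layout the LLaMA classes expect):

```python
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "peiyi9979/math-shepherd-mistral-7b-prm"

# Loads the Mistral checkpoint through the LLaMA model class,
# matching the "LlamaForCausalLM" architecture in config.json.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

print(model.config.architectures)  # expected: ['LlamaForCausalLM']
```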

The BOS and EOS IDs do have a bit of a problem, though.

They should be 1 and 2, not 100000 and 100001.

I've changed them.

However, Hugging Face's generate function should use the tokenizer's BOS and EOS IDs by default rather than 100000 and 100001, so the incorrect values had no effect.
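To double-check on your side, here is a small sketch (assuming the repo above; passing bos_token_id/eos_token_id to generate explicitly is belt-and-braces, since the tokenizer's defaults should already be correct):

```python
from transformers import AutoTokenizer, LlamaForCausalLM

model_id = "peiyi9979/math-shepherd-mistral-7b-prm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

# The tokenizer carries the correct special-token IDs.
print(tokenizer.bos_token_id, tokenizer.eos_token_id)  # expected: 1 2

# Passing the IDs explicitly makes generation independent of any
# stale bos/eos values left over in config.json.
inputs = tokenizer("1 + 1 =", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=16,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```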

Thanks for your patient explanation!

tongyx361 changed discussion status to closed
