from transformers import AutoTokenizer

test_tokenizer = AutoTokenizer.from_pretrained("LeoLM/leo-mistral-hessianai-7b-chat")
print(len(test_tokenizer))  # 32002
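
For comparison, here is a minimal sketch of how one could read out the model's embedding size (this loads the full checkpoint, so it needs the corresponding memory):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("LeoLM/leo-mistral-hessianai-7b-chat")
print(model.get_input_embeddings().weight.shape[0])  # 32128
print(model.config.vocab_size)                       # 32128

So the tokenizer (32002) and the embedding matrix (32128) disagree, which is presumably what trips up vLLM.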

This leads, e.g., to the following vLLM error:
TypeError: argument 'tokens': 'NoneType' object cannot be converted to 'PyString'
(see here)

LAION LeoLM org

Have you tested this? The model's weights have an embedding dim of 32128, so I feel like this would break, no?

> Have you tested this? The model's weights have an embedding dim of 32128, so I feel like this would break, no?

No, I didn't test this, and according to the docs you could be right (see here).

Does it work with vLLM for you? See also the example config.json from OpenOrca for comparison. It is probably related to resize_token_embeddings_to_32x (but why is it not 32032 then?).

It also seems to be an issue elsewhere, e.g. here: https://github.com/huggingface/transformers/issues/4875

I have no idea what the right solution is, or whether this is more of a bug in vLLM; it would probably work to resize the token embeddings again after training (model.resize_token_embeddings(embeddings_len)) so that the usable vocab size and the embedding size match, e.g. as sketched below.
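
A minimal sketch of that idea (untested; it assumes the 126 rows above len(tokenizer) are unused padding rows, so dropping them is safe, and the output directory name is just a placeholder):

from transformers import AutoModelForCausalLM, AutoTokenizer

name = "LeoLM/leo-mistral-hessianai-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Shrink the embedding matrix (and the lm_head) from 32128 back to len(tokenizer) == 32002
model.resize_token_embeddings(len(tokenizer))
model.save_pretrained("leo-mistral-7b-chat-32002")
tokenizer.save_pretrained("leo-mistral-7b-chat-32002")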

Feel free to close; I just wanted to make you aware of this issue :).

LAION LeoLM org

I think the real solution is to either 1. raise an issue with vLLM and hope they fix it, or 2. add dummy tokens to the tokenizer. I resized the embeddings to a multiple of 128 since this is apparently most efficient on H100+ GPUs. Your idea of resizing back down might also be a good and easy solution; I don't think the speed loss would be too great.
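
(For reference on the numbers: the next multiple of 128 above 32002 is 251 * 128 = 32128, while the next multiple of 32 would have been 32032.) Option 2 could look roughly like the following sketch; the "<dummy_i>" token names and the output directory are made up here:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LeoLM/leo-mistral-hessianai-7b-chat")

# Pad the vocab with dummy tokens until it matches the 32128 embedding rows (32128 - 32002 = 126)
n_missing = 32128 - len(tokenizer)
tokenizer.add_tokens([f"<dummy_{i}>" for i in range(n_missing)])
assert len(tokenizer) == 32128
tokenizer.save_pretrained("leo-mistral-7b-chat-padded-tokenizer")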

I am trying to convert the model to GGUF, and llama.cpp complains about a vocab size mismatch (model has 32128, but tokenizer.model has 32000); I had removed everything from added_tokens.json. I can, sure, "fix" the vocab_size in the config, but that eventually leads to an error when loading the model: 'token_embd.weight' has wrong shape; expected 4096, 32000, got 4096, 32128.
Any ideas?

