RuntimeError: The size of tensor a (32000) must match the size of tensor b (32001) at non-singleton dimension 0

#6
by jarodwu - opened

I have run into this problem several times. What's wrong here?
(screenshot attached: wrong_pic.png)

I have the same problem and I don't know how to solve it. Can someone help?

This is model v1.0. There is already a newer model, v1.1. And one of the things it fixes is the added_tokens, which might be the problem you're having here.

So unless there's a particular reason why you want 1.0 instead of 1.1, I would try: https://huggingface.co/lmsys/vicuna-13b-delta-v1.1

Or I have already merged deltas for v1.1 and uploaded them in HF format, so you could use those instead. Then you wouldn't need to merge the deltas yourselves: https://huggingface.co/TheBloke/vicuna-13B-1.1-HF
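For context, the RuntimeError in the title is the classic symptom of a vocabulary-size mismatch: the v1.0 delta's embedding matrix has 32001 rows (one added token) while the base LLaMA checkpoint has 32000, so element-wise addition during the merge fails on dimension 0. A toy sketch of that failure mode (pure Python, numbers taken from the error message; this is an illustration, not the actual merge code):

```python
def add_delta(base_rows: int, delta_rows: int) -> int:
    """Toy stand-in for element-wise weight addition during a delta merge."""
    if base_rows != delta_rows:
        raise RuntimeError(
            f"The size of tensor a ({base_rows}) must match the size of "
            f"tensor b ({delta_rows}) at non-singleton dimension 0"
        )
    return base_rows

BASE_LLAMA_VOCAB = 32000   # rows in LLaMA's token embedding matrix
V10_DELTA_VOCAB = 32001    # the v1.0 delta added one token, hence the mismatch

try:
    add_delta(BASE_LLAMA_VOCAB, V10_DELTA_VOCAB)
except RuntimeError as e:
    print(e)  # reproduces the error in the title
```

The v1.1 weights fixed the added_tokens handling, which is why switching versions makes this go away.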

I have this problem:
OSError: Can't load tokenizer for 'pretrain_models/vicuna-7b-delta-v1.1/'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local
directory with the same name. Otherwise, make sure 'pretrain_models/vicuna-7b-delta-v1.1/' is the correct path to a directory containing all relevant files for a LlamaTokenizer
tokenizer.
I don't know how to solve it.
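That OSError generally means the directory is missing the tokenizer files, not that the weights themselves are wrong. A quick hedged check (the file names below are what a typical LlamaTokenizer directory contains; adjust them to match your actual checkout):

```python
import os

# Files a LlamaTokenizer directory typically contains (an assumption,
# not an exhaustive list -- check your actual vicuna/llama download).
REQUIRED = ("tokenizer.model", "tokenizer_config.json", "special_tokens_map.json")

def missing_tokenizer_files(model_dir: str, required=REQUIRED):
    """Return which expected tokenizer files are absent from model_dir."""
    return [f for f in required if not os.path.isfile(os.path.join(model_dir, f))]

print(missing_tokenizer_files("pretrain_models/vicuna-7b-delta-v1.1/"))
```

Also note that the delta directory alone is not a usable model; the tokenizer and weights only exist after the delta has been merged onto a base LLaMA checkout.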

Again, you could just use https://huggingface.co/TheBloke/vicuna-13B-1.1-HF where it is already merged and you don't need to merge the deltas yourself.

I don't know why you're getting those errors. It worked fine when I ran it a few days ago. I expect something isn't set up right. Did you download Llama-13B-HF to merge the deltas on to?
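If you do merge the deltas yourself, the FastChat docs describe an apply_delta invocation along these lines (the paths here are placeholders, and the exact flags may differ between FastChat versions, so verify against your installed copy):

```shell
# Merge the v1.1 delta onto a local HF-format LLaMA-13B checkout.
# Requires the base weights on disk and enough RAM to hold both models.
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-13b-hf \
    --target-model-path /path/to/vicuna-13b-v1.1 \
    --delta-path lmsys/vicuna-13b-delta-v1.1
```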

Large Model Systems Organization

It seems you were using a newer version of fschat with these old weights.
Please check the version compatibility notes here: https://github.com/lm-sys/FastChat/blob/main/docs/weights_version.md
We suggest you use the newer v1.1 weights.
Closing this for now. Feel free to reopen.

lmzheng changed discussion status to closed

Thank you. I updated FastChat and it worked.
Then I got the MiniGPT-4 checkpoint, and when I run demo.py it tells me:
RuntimeError: Error(s) in loading state_dict for MiniGPT4:
size mismatch for llama_proj.weight: copying a param with shape
torch.Size([5120, 768]) from checkpoint, the shape in current model is
torch.Size([4096, 768]).
size mismatch for llama_proj.bias: copying a param with shape
torch.Size([5120]) from checkpoint, the shape in current model is
torch.Size([4096]).
I wonder where the dimension error occurred.
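A likely reading of those shapes (my interpretation, not an official answer from the MiniGPT-4 authors): 5120 and 4096 are the hidden sizes of LLaMA-13B and LLaMA-7B respectively, so the checkpoint's llama_proj layer was trained against Vicuna-13B while the model you instantiated wraps Vicuna-7B. Either use the 13B Vicuna weights or the 7B MiniGPT-4 checkpoint, so the two match.

```python
# Hidden sizes from the published LLaMA configs.
LLAMA_HIDDEN = {"7B": 4096, "13B": 5120}

CKPT_PROJ_OUT = 5120      # llama_proj output dim in the MiniGPT-4 checkpoint
CURRENT_PROJ_OUT = 4096   # llama_proj output dim in the instantiated model

print(CKPT_PROJ_OUT == LLAMA_HIDDEN["13B"])    # checkpoint targets 13B
print(CURRENT_PROJ_OUT == LLAMA_HIDDEN["7B"])  # but the loaded model is 7B
```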


Thank you, dude. You're right, we'd better use v1.1. Once again, thank you!


Yes, I got it working by following your advice. And I would say Vicuna is a masterpiece; thanks for sharing.
