Maybe some tokenizer files are missing?

#1
by lpy86786 - opened

I have downloaded the model and followed the instructions at https://github.com/lm-sys/FastChat, and everything works for lmsys/vicuna-13b-delta-v0. But it won't work for lmsys/vicuna-13b-delta-v1.1 unless I add the following files from lmsys/vicuna-13b-delta-v0:

special_tokens_map.json
tokenizer.model
tokenizer_config.json

After adding them I got screens of garbled output... I guess the three correct corresponding files for v1.1 are missing?

Yes, I have the same problem.

OSError: Can't load tokenizer for '/home/lianpengcheng/models/source_models/vicuna-13b-delta-v1.1/'
Large Model Systems Organization org

Hi, the tokenizer files are omitted on purpose because we didn't change the tokenizer; it is the same as LLaMA's tokenizer.
For your problem, please install the latest version of FastChat and apply the delta again. There should be no errors.
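Something like the following (the local paths are placeholders, and the flag names have changed across FastChat versions; later releases use --base-model-path, --target-model-path, and --delta-path):

# Update FastChat, then re-apply the v1.1 delta onto a converted LLaMA-13B checkpoint.
pip3 install --upgrade fschat
python3 -m fastchat.model.apply_delta \
    --base /path/to/llama-13b \
    --target /path/to/vicuna-13b-v1.1 \
    --delta lmsys/vicuna-13b-delta-v1.1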

Large Model Systems Organization org

A reminder: you may want to use a LLaMA model converted with the latest Hugging Face Transformers, as they recently updated the tokenizer. For example, this may not work, but this should work.
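A sketch of that re-conversion using the script shipped in the transformers repo (the input and output paths here are placeholders):

# Convert the original LLaMA weights to the Hugging Face format
# with an up-to-date transformers checkout.
python3 src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/original-llama-weights \
    --model_size 13B \
    --output_dir /path/to/llama-13b-hf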

I used LLaMA's tokenizer, but I still get an error:

...
param.data += delta.state_dict()[name]
...

RuntimeError: The size of tensor a (32001) must match the size of tensor b (32000) at non-singleton dimension 0

Use the latest apply_delta.py from the FastChat repo.
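For context: the base checkpoint here carries an extra pad token (vocab size 32001) while the v1.1 delta uses LLaMA's original 32000, and the updated script reconciles the embedding sizes before adding the weights. Roughly the idea, as a sketch rather than FastChat's actual code (paths are placeholders):

import torch
from transformers import AutoModelForCausalLM

# Placeholder paths for illustration only.
base = AutoModelForCausalLM.from_pretrained("/path/to/llama-13b", torch_dtype=torch.float16)
delta = AutoModelForCausalLM.from_pretrained("lmsys/vicuna-13b-delta-v1.1", torch_dtype=torch.float16)

# Align the vocab-sized tensors (embeddings, lm_head) with the delta before adding,
# which is what the size-mismatch error above was complaining about.
base.resize_token_embeddings(delta.config.vocab_size)

delta_sd = delta.state_dict()
for name, param in base.named_parameters():
    param.data += delta_sd[name]  # merged weights = base + delta

base.save_pretrained("/path/to/vicuna-13b-v1.1")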

Thanks a lot, it works.

Thanks a lot. The tokenizer files from lmsys/vicuna-13b-delta-v0 are fine and can be used directly.
I finally realized my mistake was overlooking the hint "NOTE: This "delta model" cannot be used directly.".
My problem was solved by using the model from https://huggingface.co/eachadea/vicuna-13b-1.1, which can be used directly.

Can you provide the merged version (with the LLaMA weights already applied) instead of just the delta?

They can't provide a merged version due to the LLaMA license terms. But I've merged it, and it's available here: https://huggingface.co/TheBloke/vicuna-13B-1.1-HF

Help..

return self._apply(lambda t: t.cuda(device))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 136.00 MiB (GPU 0; 6.00 GiB total capacity; 5.27 GiB already allocated; 0 bytes free; 5.27 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
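(The allocator hint in the message can be tried via the environment variable it names, but a 13B fp16 model needs roughly 26 GiB for the weights alone, so on a 6 GiB GPU 8-bit loading is the more realistic route. A sketch, with a placeholder model path:)

# Allocator tweak suggested by the error text itself:
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# More realistic on 6 GiB: FastChat's 8-bit loading (or --device cpu):
python3 -m fastchat.serve.cli --model-path /path/to/vicuna-13b-v1.1 --load-8bit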

lmzheng changed discussion status to closed
