
How to combine weights?

#4
by joaomoreno - opened

The blog post mentions:

Once you have both the weight delta and the LLaMA weights, you can use a script provided in the GitHub repo to combine them and obtain StableVicuna-13B.

What repo is this? Where is the script?

And why don't they provide the full weights directly?

The script to do the delta merge is linked in this model's README, as well as the instructions for doing it.

But I've already done it here: https://huggingface.co/TheBloke/stable-vicuna-13B-HF

So you can just use my merge if you want.
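For intuition, here is a minimal sketch of what a delta merge does (this is not the official script linked in the README, just an illustration): the delta checkpoint stores `finetuned - base` for every parameter, so adding each delta back onto the corresponding base LLaMA tensor reconstructs the finetuned model. Scalars stand in for tensors below.

```python
# Hypothetical sketch of a weight-delta merge, not the official script.
# Delta checkpoints store (finetuned - base); adding them back onto the
# base LLaMA parameters reconstructs the full finetuned weights.

def apply_delta(base_state, delta_state):
    """Return merged parameters: base + delta for every parameter name."""
    missing = set(delta_state) - set(base_state)
    if missing:
        raise KeyError(f"delta has parameters not in base: {sorted(missing)}")
    return {name: base_state[name] + delta for name, delta in delta_state.items()}

# Toy example with scalars standing in for tensors:
base = {"layers.0.weight": 1.0, "layers.0.bias": 0.5}
delta = {"layers.0.weight": 0.25, "layers.0.bias": -0.5}
merged = apply_delta(base, delta)
print(merged)  # {'layers.0.weight': 1.25, 'layers.0.bias': 0.0}
```

The real merge does the same thing tensor-by-tensor over the checkpoint shards, which is why it needs enough CPU memory to hold both sets of weights at once.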

Yes thanks, I am using your combined weights ;)

I get the following error:
OSError: Unable to load weights from pytorch checkpoint file,
gonna try TheBloke's solution...

@ALL

I am collecting llama tools, and just succeeded in combining this model with the original weights from https://huggingface.co/decapoda-research/llama-13b-hf/tree/main

Just one caveat:

CAUTION: you need to replace LLaMATokenizer with LlamaTokenizer in the tokenizer_config.json of the original weight repo
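The fix above can be scripted. Here is a hedged sketch that patches `tokenizer_config.json` in a local copy of the base weights so the class name matches what current transformers expects; the directory path is whatever you downloaded the repo to, and the demo at the bottom uses a temporary directory just to show the behavior.

```python
# Sketch: rename the tokenizer class in tokenizer_config.json from the
# old "LLaMATokenizer" spelling to "LlamaTokenizer". Point repo_dir at
# your local copy of the base weight repo (path here is an assumption).
import json
import tempfile
from pathlib import Path

def fix_tokenizer_class(repo_dir):
    cfg_path = Path(repo_dir) / "tokenizer_config.json"
    cfg = json.loads(cfg_path.read_text())
    if cfg.get("tokenizer_class") == "LLaMATokenizer":
        cfg["tokenizer_class"] = "LlamaTokenizer"
        cfg_path.write_text(json.dumps(cfg, indent=2))
    return cfg["tokenizer_class"]

# Demo against a throwaway directory (replace with your real repo path):
with tempfile.TemporaryDirectory() as d:
    Path(d, "tokenizer_config.json").write_text(
        json.dumps({"tokenizer_class": "LLaMATokenizer"}))
    print(fix_tokenizer_class(d))  # LlamaTokenizer
```

Editing the file by hand works just as well; the point is only that the stored class name must be the current `LlamaTokenizer` spelling or loading the tokenizer fails.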

Merging a 13B model requires about 70GB of CPU memory

Here is my llama tools repo: https://gitee.com/yhyu13/llama_-tools
