Could you provide a usage example?

#1
by simsim314 - opened

I'm unable to get this model running.

After installing exllamav2:

!pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.16/exllamav2-0.0.16+cu118-cp310-cp310-linux_x86_64.whl

and downloading the model repository:

huggingface-cli download turboderp/dbrx-instruct-exl2 --revision "2.75bpw" --local-dir dbrx_275

I tried to run examples/chat.py:

python examples/chat.py -m "dbrx_275" -mode raw --gpu_split auto

Getting this error:

-- Model: dbrx_275
 -- Options: ['gpu_split: auto']
 !! Warning, unknown architecture: DbrxForCausalLM
 !! Loading as LlamaForCausalLM
Traceback (most recent call last):
  File "/workspace/exllamav2/examples/chat.py", line 93, in <module>
    model, tokenizer = model_init.init(args, allow_auto_split = True, max_output_len = 16)
  File "/usr/local/lib/python3.10/dist-packages/exllamav2/model_init.py", line 82, in init
    config.prepare()
  File "/usr/local/lib/python3.10/dist-packages/exllamav2/config.py", line 100, in prepare
    self.num_hidden_layers = read_config["num_hidden_layers"]
KeyError: 'num_hidden_layers'
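The traceback suggests what's going on: the loader doesn't recognize `DbrxForCausalLM`, falls back to treating the model as `LlamaForCausalLM`, and then indexes the config with the Llama key `num_hidden_layers`, which a DBRX-style config doesn't provide. A minimal sketch of that failure mode (the dicts below are hypothetical stand-ins for the two config.json shapes, not the actual files):

```python
# Llama-style configs expose "num_hidden_layers"; DBRX-style configs
# key their layer count differently, so a direct Llama-key lookup fails.
llama_style = {"architectures": ["LlamaForCausalLM"], "num_hidden_layers": 32}
dbrx_style = {"architectures": ["DbrxForCausalLM"]}  # layer count lives elsewhere

def read_layer_count(config: dict) -> int:
    # Mirrors the direct lookup in config.py's prepare()
    return config["num_hidden_layers"]

print(read_layer_count(llama_style))  # 32
try:
    read_layer_count(dbrx_style)
except KeyError as e:
    print("KeyError:", e)
```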

Please advise on how to run this model.

You'll need a newer version of exllamav2; DBRX is only supported on master, not in any released version.
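Installing from the master branch might look something like this (a sketch using the standard git clone + pip install-from-source workflow; replacing the wheel install from above):

```shell
# Clone the exllamav2 repository and install from the master branch,
# which carries the DbrxForCausalLM support not yet in a tagged release.
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install .
```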

simsim314 changed discussion status to closed
