Can't run with FastChat on CUDA 12.1

#1
by jaywanghz - opened

I can't run either the GPTQ or the AWQ version of the 14B model.
Both show the error below:
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\model_adapter.py", line 281, in load_model
2024-02-19 19:46:11 | ERROR | stderr | model, tokenizer = adapter.load_compress_model(
2024-02-19 19:46:11 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\model_adapter.py", line 115, in load_compress_model
2024-02-19 19:46:11 | ERROR | stderr | return load_compress_model(
2024-02-19 19:46:11 | ERROR | stderr | ^^^^^^^^^^^^^^^^^^^^
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\compression.py", line 216, in load_compress_model
2024-02-19 19:46:11 | ERROR | stderr | apply_compressed_weight(model, compressed_state_dict, device)
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\compression.py", line 104, in apply_compressed_weight
2024-02-19 19:46:11 | ERROR | stderr | apply_compressed_weight(
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\compression.py", line 104, in apply_compressed_weight
2024-02-19 19:46:11 | ERROR | stderr | apply_compressed_weight(
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\compression.py", line 104, in apply_compressed_weight
2024-02-19 19:46:11 | ERROR | stderr | apply_compressed_weight(
2024-02-19 19:46:11 | ERROR | stderr | [Previous line repeated 1 more time]
2024-02-19 19:46:11 | ERROR | stderr | File "G:\ProgramData\Anaconda3\envs\chatchat210\Lib\site-packages\fastchat\model\compression.py", line 99, in apply_compressed_weight
2024-02-19 19:46:11 | ERROR | stderr | compressed_state_dict[full_name], target_attr.bias, target_device
2024-02-19 19:46:11 | ERROR | stderr | ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
2024-02-19 19:46:11 | ERROR | stderr | KeyError: 'model.layers.0.self_attn.k_proj.weight'

I understand that the int4 files store their weights as qweight rather than weight, but I don't know how to solve the problem. Please let me know if there is a solution. Thanks.
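To show the mismatch I mean, here is a minimal sketch. It is not FastChat's code: the helper name `find_quantized_only_layers` is hypothetical, and apart from the `k_proj` prefix taken from the KeyError above, the sample key names (`qweight`, `scales`, `qzeros`) are assumptions about how an int4 checkpoint might be laid out. It only checks which layers have a packed `.qweight` entry but no plain `.weight` entry, which is the lookup that fails in the traceback.

```python
def find_quantized_only_layers(checkpoint_keys, layer_prefixes):
    """Return layer prefixes that have a `.qweight` key but no `.weight` key."""
    keys = set(checkpoint_keys)
    return [
        prefix
        for prefix in layer_prefixes
        if f"{prefix}.weight" not in keys and f"{prefix}.qweight" in keys
    ]


# Illustrative key names only; a real GPTQ/AWQ checkpoint may differ.
checkpoint_keys = [
    "model.layers.0.self_attn.k_proj.qweight",
    "model.layers.0.self_attn.k_proj.scales",
    "model.layers.0.self_attn.k_proj.qzeros",
    "model.layers.0.self_attn.k_proj.bias",
]
layer_prefixes = ["model.layers.0.self_attn.k_proj"]

mismatched = find_quantized_only_layers(checkpoint_keys, layer_prefixes)
print(mismatched)
```

Any prefix this returns is a layer where a loader that looks up `<prefix>.weight` would raise the same kind of KeyError.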
