When will exllamav2 support be added?

#4
by rjmehta - opened

Getting this error when loading in exllamav2.

!! Warning, unknown architecture: ['MixtralForCausalLM']
!! Loading as LlamaForCausalLM

ValueError Traceback (most recent call last)
Cell In[1], line 30
28 config = ExLlamaV2Config()
29 config.model_dir = model_directory
---> 30 config.prepare()
31 model = ExLlamaV2(config)
32 #config.max_position_embeddings = 4096
33 #config.max_seq_len = 4096
34 #model.max_position_embeddings = 4096
35 #model.max_seq_len = 4096
154 break
155 else:
--> 156 raise ValueError(f" ## Could not find {prefix}.* in model")
158 # Model dimensions
160 self.head_dim = self.hidden_size // self.num_attention_heads

ValueError: ## Could not find model.layers.0.mlp.down_proj.* in model
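For context, here is a minimal, self-contained sketch of the loading code from the cell above. model_directory is a placeholder, and the cache/tokenizer lines at the end are the usual follow-up steps, added here as an assumption rather than copied from the original cell. config.prepare() is where it fails: the loader falls back to LlamaForCausalLM and looks for dense model.layers.N.mlp.down_proj tensors, but Mixtral's MoE blocks store their feed-forward weights under block_sparse_moe instead.

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer

# Placeholder: path to a local EXL2-quantized copy of the model
model_directory = "/path/to/mixtral-exl2"

config = ExLlamaV2Config()
config.model_dir = model_directory
config.prepare()   # <-- raises the ValueError above on exllamav2 builds without Mixtral support

model = ExLlamaV2(config)

# Usual follow-up once prepare() succeeds (assumed, not part of the original cell)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)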

Okay, turboderp is working on the Mixtral architecture. This was the wrong thread for the question. Thanks anyway.

rjmehta changed discussion status to closed

@rjmehta You need to build exllamav2 locally with CUDA 12.1 from this branch: https://github.com/turboderp/exllamav2/tree/experimental
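A rough sketch of what that looks like, assuming CUDA 12.1 and a working C++ build toolchain are already set up (the clone/install commands are shown as comments; the model path is a placeholder):

# git clone -b experimental https://github.com/turboderp/exllamav2
# cd exllamav2 && pip install .   (builds the extension locally against your CUDA 12.1 toolkit)

from exllamav2 import ExLlamaV2Config

config = ExLlamaV2Config()
config.model_dir = "/path/to/mixtral-exl2"   # placeholder path to the quantized Mixtral weights
config.prepare()                             # should no longer raise the down_proj ValueError
print("Mixtral config prepared OK")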
