When will exllamav2 support be added?

#4
by rjmehta - opened

Getting this error when loading in exllamav2.

!! Warning, unknown architecture: ['MixtralForCausalLM']
!! Loading as LlamaForCausalLM

ValueError Traceback (most recent call last)
Cell In[1], line 30
28 config = ExLlamaV2Config()
29 config.model_dir = model_directory
---> 30 config.prepare()
31 model = ExLlamaV2(config)
32 #config.max_position_embeddings = 4096
33 #config.max_seq_len = 4096
34 #model.max_position_embeddings = 4096
35 #model.max_seq_len = 4096
154 break
155 else:
--> 156 raise ValueError(f" ## Could not find {prefix}.* in model")
158 # Model dimensions
160 self.head_dim = self.hidden_size // self.num_attention_heads

ValueError: ## Could not find model.layers.0.mlp.down_proj.* in model
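For context, here is a minimal, self-contained sketch of the loading code from the cell above. model_directory is a placeholder, and the cache/tokenizer lines at the end are the usual follow-up steps, added here as an assumption rather than copied from the original cell. config.prepare() is where it fails: the loader falls back to LlamaForCausalLM and looks for dense model.layers.N.mlp.down_proj tensors, but Mixtral's MoE blocks store their feed-forward weights under block_sparse_moe instead.

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer

# Placeholder: path to a local EXL2-quantized copy of the model
model_directory = "/path/to/mixtral-exl2"

config = ExLlamaV2Config()
config.model_dir = model_directory
config.prepare()   # <-- raises the ValueError above on exllamav2 builds without Mixtral support

model = ExLlamaV2(config)

# Usual follow-up once prepare() succeeds (assumed, not part of the original cell)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)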

Okay, turboderp is working on the Mixtral architecture. This was the wrong thread for the question. Thanks anyway.

rjmehta changed discussion status to closed

@rjmehta You need to build exllamav2 locally with CUDA 12.1 from this branch: https://github.com/turboderp/exllamav2/tree/experimental
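A rough sketch of what that looks like, assuming CUDA 12.1 and a working C++ build toolchain are already set up (the clone/install commands are shown as comments; the model path is a placeholder):

# git clone -b experimental https://github.com/turboderp/exllamav2
# cd exllamav2 && pip install .   (builds the extension locally against your CUDA 12.1 toolkit)

from exllamav2 import ExLlamaV2Config

config = ExLlamaV2Config()
config.model_dir = "/path/to/mixtral-exl2"   # placeholder path to the quantized Mixtral weights
config.prepare()                             # should no longer raise the down_proj ValueError
print("Mixtral config prepared OK")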
