runtime error

ers.8.self_attn.o_proj.bias', 'model.layers.8.self_attn.q_proj.bias', 'model.layers.8.self_attn.v_proj.bias', 'model.layers.9.mlp.down_proj.bias', 'model.layers.9.mlp.gate_proj.bias', 'model.layers.9.mlp.up_proj.bias', 'model.layers.9.self_attn.k_proj.bias', 'model.layers.9.self_attn.o_proj.bias', 'model.layers.9.self_attn.q_proj.bias', 'model.layers.9.self_attn.v_proj.bias']
- This IS expected if you are initializing LlamaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing LlamaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Traceback (most recent call last):
  File "/home/user/app/app.py", line 17, in <module>
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="cpu")
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3907, in from_pretrained
    hf_*********.postprocess_model(model)
  File "/usr/local/lib/python3.10/site-packages/transformers/quantizers/base.py", line 195, in postprocess_model
    return self._process_model_after_weight_loading(model, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/quantizers/quantizer_gptq.py", line 80, in _process_model_after_weight_loading
    model = self.optimum_quantizer.post_init_model(model)
  File "/usr/local/lib/python3.10/site-packages/optimum/gptq/quantizer.py", line 595, in post_init_model
    raise ValueError(
ValueError: Found modules on cpu/disk. Using Exllama or Exllamav2 backend requires all the modules to be on GPU.You can deactivate exllama backend by setting `disable_exllama=True` in the quantization config object
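
The ValueError suggests turning off the ExLlama backend when modules are placed on CPU or disk. Below is a minimal sketch of one way to do that for the `from_pretrained` call in the traceback. The helper names `places_modules_off_gpu` and `load_gptq_model` are hypothetical, and `bits=4` is a placeholder assumption (the checkpoint's own quantization settings are expected to take precedence for an already-quantized model). It uses `use_exllama=False`, which in recent transformers releases is the newer spelling of the `disable_exllama=True` flag named in the message; whether CPU inference then works also depends on the installed GPTQ kernel package, which this log does not show.

```python
from typing import Mapping, Union

DeviceMap = Union[str, Mapping[str, Union[str, int]]]


def places_modules_off_gpu(device_map: DeviceMap) -> bool:
    """Heuristic: True if the requested device_map names "cpu" or "disk".

    Note: "auto" returns False here even though it may still offload
    modules at load time; this check only looks at explicit placements.
    """
    if isinstance(device_map, str):
        return device_map in ("cpu", "disk")
    return any(str(device) in ("cpu", "disk") for device in device_map.values())


def load_gptq_model(model_name: str, device_map: DeviceMap = "cpu"):
    """Load a GPTQ checkpoint, disabling ExLlama when modules are off-GPU."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, GPTQConfig

    kwargs = {"device_map": device_map}
    if places_modules_off_gpu(device_map):
        # bits=4 is a placeholder assumption; use_exllama=False is the part
        # that addresses the "Found modules on cpu/disk" error.
        kwargs["quantization_config"] = GPTQConfig(bits=4, use_exllama=False)
    return AutoModelForCausalLM.from_pretrained(model_name, **kwargs)
```

In `app.py`, line 17 could then become something like `model = load_gptq_model(MODEL_NAME, device_map="cpu")`. Keeping the placement check separate from the loader makes it possible to leave ExLlama enabled when the same app runs on GPU hardware.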
