size mismatch for the model

#2
by RoARene317 - opened

size mismatch for model.layers.29.self_attn.k_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.self_attn.o_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.o_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.self_attn.q_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.q_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.self_attn.v_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.self_attn.v_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.mlp.down_proj.qzeros: copying a param with shape torch.Size([344, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.29.mlp.down_proj.scales: copying a param with shape torch.Size([344, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.29.mlp.gate_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.29.mlp.gate_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.29.mlp.up_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.29.mlp.up_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.30.self_attn.k_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.k_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.self_attn.o_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.o_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.self_attn.q_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.q_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.self_attn.v_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.self_attn.v_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.mlp.down_proj.qzeros: copying a param with shape torch.Size([344, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.30.mlp.down_proj.scales: copying a param with shape torch.Size([344, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.30.mlp.gate_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.30.mlp.gate_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.30.mlp.up_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.30.mlp.up_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.31.self_attn.k_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.k_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.self_attn.o_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.o_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.self_attn.q_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.q_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.self_attn.v_proj.qzeros: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.self_attn.v_proj.scales: copying a param with shape torch.Size([128, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.mlp.down_proj.qzeros: copying a param with shape torch.Size([344, 512]) from checkpoint, the shape in current model is torch.Size([1, 512]).
size mismatch for model.layers.31.mlp.down_proj.scales: copying a param with shape torch.Size([344, 4096]) from checkpoint, the shape in current model is torch.Size([1, 4096]).
size mismatch for model.layers.31.mlp.gate_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.31.mlp.gate_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).
size mismatch for model.layers.31.mlp.up_proj.qzeros: copying a param with shape torch.Size([128, 1376]) from checkpoint, the shape in current model is torch.Size([1, 1376]).
size mismatch for model.layers.31.mlp.up_proj.scales: copying a param with shape torch.Size([128, 11008]) from checkpoint, the shape in current model is torch.Size([1, 11008]).

I got this error. Any help?

Make sure to load it as a group-size-32 model; you might have set your loading script to treat it as a 128g (group size 128) model.
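You can verify this from the error log itself. For GPTQ-style 4-bit quantization, `scales` and `qzeros` have one row per quantization group (`in_features // group_size`), and `qzeros` packs eight 4-bit zero points per int32 column. A small sketch (the helper name is mine, not from any library) reproduces the checkpoint shapes only when the group size is 32:

```python
def quant_shapes(in_features, out_features, group_size, bits=4):
    """Expected (qzeros, scales) shapes for a GPTQ-quantized linear layer.

    qzeros: one row per group, zero points packed 32//bits per int32 column.
    scales: one row per group, one fp16 scale per output feature.
    """
    groups = in_features // group_size
    packed = out_features * bits // 32
    return (groups, packed), (groups, out_features)

# Llama-7B projection sizes with group size 32:
print(quant_shapes(4096, 4096, 32))    # q/k/v/o_proj
print(quant_shapes(11008, 4096, 32))   # mlp.down_proj
print(quant_shapes(4096, 11008, 32))   # mlp.gate_proj / up_proj
```

These come out as ((128, 512), (128, 4096)), ((344, 512), (344, 4096)), and ((128, 1376), (128, 11008)), exactly the checkpoint shapes in the traceback, whereas the `[1, ...]` shapes in the current model correspond to a single group per layer (per-channel quantization, i.e. group size -1). So the loader's group size setting needs to be 32 to match this checkpoint.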

RoARene317 changed discussion status to closed
