Invalid weights: checkpoint doesn't match modeling code

by winglian - opened Apr 1, 2024

Apr 1, 2024

https://huggingface.co/SinclairSchneider/dbrx-base-quantization-fixed/blob/main/modeling_dbrx.py#L754-L756

The modeling code this model references has split the expert weights, but this model's checkpoint doesn't match it:

size mismatch for transformer.blocks.38.ffn.experts.mlp.9.v1.weight: copying a param with shape torch.Size([33030144, 1]) from checkpoint, the shape in current model is torch.Size([10752, 6144]).

johnrachwanpruna

Pruna AI org Apr 1, 2024

This model is converted from the original one, with some architecture components changed in order to enable bnb quantization (see https://huggingface.co/databricks/dbrx-instruct/discussions/10#660921b553b869c928b0c5d0).
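The two shapes in the reported error look consistent with this explanation. As a rough check, and assuming the checkpoint stores each 4-bit weight matrix packed two values per byte in a single column (my reading of how bnb 4-bit weights are usually serialized, not something confirmed in this thread), the packed shape can be derived from the unquantized one. The helper name `packed_4bit_shape` is hypothetical and only for illustration:

```python
from math import prod


def packed_4bit_shape(shape):
    """Shape of a 4-bit-packed buffer for a tensor of `shape`.

    Assumption: two 4-bit values are stored per byte, flattened into a
    single column. This is an illustrative helper, not a library API.
    """
    numel = prod(shape)
    if numel % 2:
        raise ValueError("odd number of elements cannot be packed two per byte")
    return (numel // 2, 1)


# Both shapes are taken from the size-mismatch error in the first post.
expected_shape = (10752, 6144)    # shape the modeling code allocates
checkpoint_shape = (33030144, 1)  # shape stored in the checkpoint

matches = packed_4bit_shape(expected_shape) == checkpoint_shape
print(matches)  # True: 10752 * 6144 / 2 == 33030144
```

If that assumption holds, the error would mean the weights were loaded into unquantized layers, so the loader expected full-size matrices while the checkpoint holds already-quantized ones.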

johnrachwanpruna changed discussion status to closed Apr 2, 2024
