Mismatch in lm_head.weight and model.embed_tokens.weight Layer Sizes

#2 opened by Danivilanova

I am experiencing an issue with the sizes of the lm_head.weight and model.embed_tokens.weight layers in this model. According to the configured vocab_size, both layers should have shape torch.Size([32001, 8192]). However, when I load the checkpoint, both layers have shape torch.Size([32000, 8192]).
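For reference, this is roughly how I am checking the shapes (a minimal sketch; the checkpoint path is a placeholder for this repo, and I'm assuming the standard AutoConfig/AutoModelForCausalLM loading path):

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Placeholder -- substitute the actual checkpoint/repo id being discussed.
checkpoint = "path/to/checkpoint"

# The config reports the expected vocabulary size.
config = AutoConfig.from_pretrained(checkpoint)
print("config.vocab_size:", config.vocab_size)  # expected: 32001

# Load the weights and inspect the actual embedding / output head shapes.
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float16)
print("model.embed_tokens.weight:", model.model.embed_tokens.weight.shape)  # observed: torch.Size([32000, 8192])
print("lm_head.weight:", model.lm_head.weight.shape)                        # observed: torch.Size([32000, 8192])
```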

Has anyone else encountered this issue, or does anyone have any insights or suggestions on how to resolve it? Any help would be greatly appreciated!
