Model weights available from the Hugging Face Hub have twice the embedding dimension of the current model when loading

#7
by Samartha27 - opened

Running:

```python
model = DonutModel.from_pretrained("naver-clova-ix/donut-base-finetuned-cord-v2")
```

fails with the following traceback:

```
  File "/home/sam/code/donut/donut/model.py", line 597, in from_pretrained
    model = super(DonutModel, cls).from_pretrained(pretrained_model_name_or_path, revision="official", *model_args, **kwargs)
  File "/home/sam/anaconda3/envs/donut_official/lib/python3.7/site-packages/transformers/modeling_utils.py", line 2896, in from_pretrained
    keep_in_fp32_modules=keep_in_fp32_modules,
  File "/home/sam/anaconda3/envs/donut_official/lib/python3.7/site-packages/transformers/modeling_utils.py", line 3278, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for DonutModel:
    size mismatch for encoder.model.layers.1.downsample.norm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.model.layers.1.downsample.norm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([512]).
    size mismatch for encoder.model.layers.1.downsample.reduction.weight: copying a param with shape torch.Size([512, 1024]) from checkpoint, the shape in current model is torch.Size([256, 512]).
    size mismatch for encoder.model.layers.2.downsample.norm.weight: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
    size mismatch for encoder.model.layers.2.downsample.norm.bias: copying a param with shape torch.Size([2048]) from checkpoint, the shape in current model is torch.Size([1024]).
    size mismatch for encoder.model.layers.2.downsample.reduction.weight: copying a param with shape torch.Size([1024, 2048]) from checkpoint, the shape in current model is torch.Size([512, 1024]).
You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
```
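A note on the suggested flag: `ignore_mismatched_sizes=True` would only skip loading the mismatched `downsample` tensors and leave them randomly initialized, so it hides the error rather than fixing it. Since every reported shape in the current model is exactly half of the checkpoint's, this looks like a `timm` version mismatch: newer `timm` releases restructured the Swin encoder's patch-merging/downsample layers relative to the version the standalone `donut` package was built against. A commonly reported workaround (an assumption worth verifying in your environment) is to pin the versions the official repo was developed with, e.g. `pip install timm==0.5.4 transformers==4.25.1`.

Alternatively, recent `transformers` releases ship a native Donut integration, so the checkpoint can be loaded without the standalone `donut` package at all. A minimal sketch following the usage example on the model card; the image path is a placeholder:

```python
import re

import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

# Load processor + model straight from the Hub (no donut/timm pin needed).
processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base-finetuned-cord-v2")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base-finetuned-cord-v2")

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# "receipt.png" is a placeholder for any receipt image.
image = Image.open("receipt.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# CORD-v2 expects the <s_cord-v2> task prompt as the decoder start sequence.
decoder_input_ids = processor.tokenizer(
    "<s_cord-v2>", add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(
    pixel_values.to(device),
    decoder_input_ids=decoder_input_ids.to(device),
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    use_cache=True,
    bad_words_ids=[[processor.tokenizer.unk_token_id]],
    return_dict_in_generate=True,
)

# Strip special tokens and the task prompt, then convert to JSON.
sequence = processor.batch_decode(outputs.sequences)[0]
sequence = sequence.replace(processor.tokenizer.eos_token, "").replace(
    processor.tokenizer.pad_token, ""
)
sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()  # drop the task start token
print(processor.token2json(sequence))
```

This path avoids the shape mismatch because the Hub weights were converted for the `transformers` implementation, so they no longer depend on which `timm` Swin layout is installed.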
