issue with from_pretrained

#2
by hthomas - opened

Whenever I try to load any of your mLUKE models with ###.from_pretrained("name_of_the_model") (I tried with ### as AutoModel, LukeModel, LukeForMaskedLM...), every single one of the model's weights gets ignored, so I end up with a randomly initialized model.
I tried on several machines and with several versions and always get this issue, so I suspect it is a configuration file issue.
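For reference, here is roughly what I run (the checkpoint name below is just one example of the mLUKE models I tried):

```python
from transformers import AutoModel, LukeModel

# Loading any of the mLUKE checkpoints this way prints a warning that
# many checkpoint weights were not used and were newly initialized.
model = LukeModel.from_pretrained("studio-ousia/mluke-base")

# Same behavior with the Auto class:
model = AutoModel.from_pretrained("studio-ousia/mluke-base")
```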

Thank you for any help you can offer!

Studio Ousia org • edited Jan 11, 2023

Hi, thank you for showing an interest in our model.

I think you are seeing that a number of weights with the prefixes e2w_, e2e_, and w2e_ are not loaded.
This is actually expected behavior when use_entity_aware_attention is set to false in the model config.
These weights are only used for entity-aware attention, so they should be ignored if the model does not use entity-aware attention.

The details of entity-aware attention are described in the monolingual LUKE paper (https://aclanthology.org/2020.emnlp-main.523/).
We found that entity-aware attention does not improve performance in cross-lingual transfer settings, so we turned it off by default.
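As a minimal sketch (the checkpoint name is just an example), you can inspect this flag on the config, and override it at load time if you want the entity-aware attention weights to be loaded as well:

```python
from transformers import LukeConfig, LukeModel

# The mLUKE checkpoints ship with the flag disabled, so the e2w_/w2e_/e2e_
# projection weights in the checkpoint are skipped when the model is built.
config = LukeConfig.from_pretrained("studio-ousia/mluke-base")
print(config.use_entity_aware_attention)  # expected: False, per the explanation above

# Overriding the flag makes the model allocate those modules, so the
# corresponding checkpoint weights are loaded instead of being ignored.
model = LukeModel.from_pretrained(
    "studio-ousia/mluke-base", use_entity_aware_attention=True
)
```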

Hi,
Thank you very much for your quick response.

Indeed, the weights that do not load are those related to entity-aware attention. I had mistaken entity-aware attention for the entity-specific embeddings, which are two very different things.

Thank you for your work!

hthomas changed discussion status to closed
