Fix weights by putting the right value in `lm_head.weight`

#3
by sgugger - opened

There was probably a bug in the initial conversion script that created these models: the checkpoints they ship store different values for `lm_head.weight` and `model.decoder.embed_tokens.weight`, even though those weights are supposed to be tied.

This was not a problem until now, because the model was tied after the load and the (wrong) value of `lm_head.weight` was replaced by the value of `model.decoder.embed_tokens.weight`. It no longer works if we tie the weights before the load, however, since the value picked might be the one from `lm_head.weight`, depending on how the weights are tied.
As far as I can see, these models stop generating properly on Transformers main.

This should fix the bug without any side effects.
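
For anyone who wants to repair such a checkpoint by hand, here is a minimal sketch of the idea, assuming a plain `pytorch_model.bin` state dict with the tensor keys named above (the file name and keys may differ for other checkpoints):

```python
# Minimal sketch: overwrite the stale lm_head.weight with the tied
# embedding matrix so both tensors agree before any load-time tying.
import torch

path = "pytorch_model.bin"  # hypothetical local checkpoint path
state_dict = torch.load(path, map_location="cpu")

embed = state_dict["model.decoder.embed_tokens.weight"]
if not torch.equal(state_dict["lm_head.weight"], embed):
    # Put the right value in lm_head.weight, as this PR does.
    state_dict["lm_head.weight"] = embed.clone()
    torch.save(state_dict, path)
```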

Language Technology Research Group at the University of Helsinki org

Thanks! I can't merge, but this fixes the issues for these models.

Thank you very much. I'm a bit confused, though.
I want to convert a Marian MT model (from Tatoeba-Challenge) to PyTorch so that I can use it locally with HF.
In order to apply this fix, should I make changes to `MarianMTModel`, or to the conversion script as well?

Language Technology Research Group at the University of Helsinki org

If you use the latest release of transformers, the conversion should work out of the box! Does it not?
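
For reference, a quick sanity check on a locally converted model might look like this (the path is a placeholder for wherever the conversion script wrote its output):

```python
# Load a locally converted Marian model and check that it still generates.
from transformers import MarianMTModel, MarianTokenizer

model_dir = "./converted-marian-model"  # hypothetical local path
tokenizer = MarianTokenizer.from_pretrained(model_dir)
model = MarianMTModel.from_pretrained(model_dir)

batch = tokenizer(["Hello, world!"], return_tensors="pt")
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```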

tiedeman changed pull request status to merged
