Fix weights by putting the right value in `lm_head.weight`

#3
by sgugger - opened

There was probably a bug in the initial conversion script that created those models: the checkpoints contain different values for `lm_head.weight` and `model.decoder.embed_tokens.weight`, even though those two weights are supposed to be tied.
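For reference, here is a minimal sketch of how to confirm the mismatch on an affected checkpoint, assuming it is a plain PyTorch `pytorch_model.bin` state dict (the local path is hypothetical):

```python
import torch

# Hypothetical local path to an affected checkpoint's state dict.
state_dict = torch.load("path/to/pytorch_model.bin", map_location="cpu")

lm_head = state_dict["lm_head.weight"]
embed_tokens = state_dict["model.decoder.embed_tokens.weight"]

# On the broken checkpoints these two tensors differ, even though the
# corresponding parameters are tied in the model.
print(torch.equal(lm_head, embed_tokens))
```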

This was not a problem until now because the weights were tied after loading, so the (wrong) value of `lm_head.weight` was replaced by the value of `model.decoder.embed_tokens.weight`. It no longer works when we tie the weights before loading, however, since the value that survives may be the one from `lm_head.weight`, depending on how the weights are tied. As far as I can tell, the model stops generating properly on Transformers main.
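To illustrate why the tying order matters, here is a toy sketch (not the actual Transformers code) of two tied parameters loading a state dict whose keys disagree; the surviving value is whichever key is copied into the shared tensor last, and in this toy module the `lm_head` copy happens last:

```python
import torch
from torch import nn

class TinyDecoder(nn.Module):
    def __init__(self, vocab_size=4, hidden=2):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden)
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)
        # Tie the weights *before* loading, as Transformers main now does.
        self.lm_head.weight = self.embed_tokens.weight

model = TinyDecoder()
state_dict = {
    "embed_tokens.weight": torch.ones(4, 2),   # the correct value
    "lm_head.weight": torch.zeros(4, 2),       # stale value from the old conversion
}
model.load_state_dict(state_dict)

# Both keys are copied into the same shared tensor, so the last copy wins;
# here the stale lm_head value overwrites the correct embedding value.
print(model.embed_tokens.weight)
```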

This should fix the bug without any side effects.
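For completeness, one way to produce the corrected weights, again assuming a plain PyTorch state dict (paths are hypothetical): copy the embedding value into `lm_head.weight` and save the result, which is the repair described in the title.

```python
import torch

state_dict = torch.load("path/to/pytorch_model.bin", map_location="cpu")

# Overwrite the stale lm_head value with the (correct) tied embedding value.
state_dict["lm_head.weight"] = state_dict["model.decoder.embed_tokens.weight"].clone()

torch.save(state_dict, "path/to/fixed/pytorch_model.bin")
```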

sgugger changed pull request status to merged
