LM Head

#1
by VityaVitalich - opened

Dear maintainer,

Thank you for your work on converting Flores to the actual HF format. I was trying to do the same thing; however, I came across a problem with the model's LM head. I could not find it inside the model state dict published by the authors. Where did you get it?

Hello!

Could you please explain more about the 'LM head'? As I recall, there were no issues related to it when converting the model. Flores and M2M100 share the same architecture, so you can use the existing conversion script for it.

For me, the hardest part was the vocabulary and the language tokens, which required manual adjustments to the files. For your information, I used the following script and source for the conversion:

Oh, I really missed that part in the HF repo. By the LM head I mean the last trainable layer in most LMs, the one not tied to the embeddings. I knew there are models that use the transposed embedding matrix as the LM head, but I was not sure whether that is the case here. Thank you so much for providing the script!
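For anyone who finds this thread later, here is a minimal sketch of what the tying means: when the output projection reuses the embedding matrix, no separate `lm_head` tensor appears in the checkpoint's state dict. The sizes below are toy values, not taken from the real checkpoint:

```python
import torch

# Toy dimensions (placeholders, not the real model's sizes)
vocab_size, hidden = 100, 16

embedding = torch.nn.Embedding(vocab_size, hidden)
lm_head = torch.nn.Linear(hidden, vocab_size, bias=False)

# Tie the weights: both modules now share the SAME tensor,
# so the checkpoint only needs to store the embedding matrix.
lm_head.weight = embedding.weight

hidden_states = torch.randn(1, hidden)
logits = lm_head(hidden_states)

# The tied head is exactly a projection onto the transposed
# embedding matrix.
manual = hidden_states @ embedding.weight.T
assert torch.allclose(logits, manual)
assert lm_head.weight is embedding.weight
```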

Maybe you also know whether the Flores 615M and 175M models use the same tokenizer? If not, could you please share the script that converted the tokenizers from the checkpoint to HF format? I actually need to test both models, and your scripts and advice are very helpful!
