Inconsistency between model name and the model architecture from config.json
Hello,
I was recently browsing NLLB models on Hugging Face and I came across this model: "facebook/nllb-200-3.3B". I noticed a discrepancy that I'd like to bring to your attention.
According to the model's name, it's supposed to be an NLLB model. However, in the config.json file, the model_type and architectures fields reference M2M-100, not NLLB.
Here are the pertinent details:
Model name: facebook/nllb-200-3.3B
model_type in config.json: m2m_100
architectures in config.json: M2M100ForConditionalGeneration
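For reference, these two fields can be checked directly. Here is a minimal, stdlib-only sketch that parses just the excerpt quoted above (the real config.json contains many more keys):

```python
import json

# Excerpt of the two fields in question, with the values reported above.
config_excerpt = json.loads("""
{
  "model_type": "m2m_100",
  "architectures": ["M2M100ForConditionalGeneration"]
}
""")

print(config_excerpt["model_type"])        # m2m_100
print(config_excerpt["architectures"][0])  # M2M100ForConditionalGeneration
```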
I noticed that Hugging Face does provide a label for NLLB models. This discrepancy could confuse users trying to understand the underlying architecture of this model.
I wanted to bring this to your attention in case it was an oversight. If it's not an oversight and there's a specific reason for this labeling, I'd appreciate it if you could clarify.
Thank you for your time and the work you've put into developing this model. I look forward to your response.
Hello @jiang784, the two models, M2M-100 and NLLB, share the same underlying architecture, which is why this model has the m2m_100 model type in its configuration.
However, the two models do not share the same tokenizer, which is why a dedicated NLLB tokenizer was added to the transformers library: https://github.com/huggingface/transformers/pull/18126
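For readers landing here: one user-visible consequence of the separate tokenizers is the language-code format. M2M-100 uses short ISO-style codes such as "en", while NLLB uses FLORES-200 codes such as "eng_Latn" that combine language and script. A dependency-free sketch of the distinction (the code sets below are small illustrative samples, not the full lists, and is_nllb_code is a hypothetical helper, not a transformers API):

```python
# Illustrative samples only; M2M-100 supports ~100 codes and NLLB ~200.
M2M100_LANG_CODES = {"en", "fr", "de", "zh"}
NLLB_LANG_CODES = {"eng_Latn", "fra_Latn", "deu_Latn", "zho_Hans"}

def is_nllb_code(code: str) -> bool:
    """Heuristic: FLORES-200 codes have the form <3-letter lang>_<Script>."""
    parts = code.split("_")
    return len(parts) == 2 and len(parts[0]) == 3

print(is_nllb_code("eng_Latn"))  # True
print(is_nllb_code("en"))        # False
```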