Inconsistency between the model name and the model architecture in config.json

#2
by jiang784 - opened

Hello,

I was recently browsing RoBERTa models on Hugging Face and came across this model: "uer/chinese_roberta_L-12_H-768". I noticed a discrepancy that I'd like to bring to your attention.

According to its name, this is supposed to be a RoBERTa model. However, in the config.json file, both the model_type and architectures fields identify it as BERT, not RoBERTa. I also looked at the architecture as represented in the code, and it does indeed look like it's a BERT model.

Here are the pertinent details:

  • Model name: uer/chinese_roberta_L-12_H-768
  • model_type in config.json: bert
  • architectures in config.json: BertForMaskedLM
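
For anyone who wants to reproduce the comparison, here is a minimal sketch using only the Python standard library. The JSON string is an assumed excerpt that mirrors just the two fields listed above (written in the casing the transformers library uses); the real config.json contains more fields. The helper names `name_tokens` and `name_matches_model_type` are hypothetical, made up for this illustration.

```python
import json
import re

# Assumed excerpt of config.json: only the two fields discussed above.
# The real file has more entries; these values mirror the ones reported in this post.
CONFIG_JSON = """
{
  "architectures": ["BertForMaskedLM"],
  "model_type": "bert"
}
"""


def name_tokens(model_id: str) -> list[str]:
    """Split a repo id into lowercase tokens on '/', '_' and '-'."""
    return [tok for tok in re.split(r"[/_\-]", model_id.lower()) if tok]


def name_matches_model_type(model_id: str, config: dict) -> bool:
    """Return True if model_type appears as a whole token in the repo id.

    Whole-token matching matters here: a plain substring test would
    wrongly report a match, because "roberta" contains "bert".
    """
    return config["model_type"].lower() in name_tokens(model_id)


config = json.loads(CONFIG_JSON)
model_id = "uer/chinese_roberta_L-12_H-768"

if not name_matches_model_type(model_id, config):
    print(
        f"{model_id}: name does not mention model_type "
        f"{config['model_type']!r} (architectures: {config['architectures']})"
    )
```

With the transformers library installed, the live values could instead be read via `AutoConfig.from_pretrained(model_id)`, which exposes `model_type` and `architectures` as attributes; that requires network access, so it is left out of the sketch.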

This discrepancy could confuse users who are trying to understand the underlying architecture of this model.

I wanted to bring this to your attention in case it was an oversight. If it's not an oversight and there's a specific reason for this labeling, I'd appreciate it if you could clarify.

Thank you for your time and the work you've put into developing this model. I look forward to your response.
