## This is roberta-base not distilled #4

Hi

I found that the size of this model is the same as the xlm-roberta-base model.

What do you mean by "distilled" in the name of this model?

This model was distilled from the deepset/xlm-roberta-large-squad2 model.

The number of params is the same as xlm-roberta!

Both have 278,084,405 parameters.

Whereas, on the other hand, monolingual RoBERTa and its distilled version differ:

RoBERTa: 124,686,389

DistRoBERTa: 82,159,157
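For reference, a quick arithmetic check using only the counts quoted above shows how much smaller the monolingual distilled model actually is:

```python
# Parameter counts quoted above in this thread
xlm_roberta_base = 278_084_405    # same count as the "distilled" model in question
roberta_base = 124_686_389
distilroberta_base = 82_159_157

# DistilRoBERTa uses a smaller architecture than its teacher,
# so its parameter count drops to about two thirds:
ratio = distilroberta_base / roberta_base
print(f"{ratio:.2f}")  # 0.66
```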

Am I missing something here?

Hi, @bilalghanem I believe there is some confusion.

This model was distilled from xlm-roberta-**large**-squad2, which is the same size as the xlm-roberta-**large** model:

xlm-roberta-large-squad2: PyTorch weights are 2.24 GB

xlm-roberta-large: PyTorch weights are 2.24 GB

So this model (xlm-roberta-**base**-squad2-distilled) was made by distilling the **large** model into a **base** model, so we expect it to be the same size as the xlm-roberta-**base** model:

xlm-roberta-base-squad2-distilled: PyTorch weights are 1.11 GB

xlm-roberta-base-squad2: PyTorch weights are 1.11 GB

xlm-roberta-base: PyTorch weights are 1.12 GB
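As a rough sanity check (my own back-of-the-envelope arithmetic, assuming fp32 weights at 4 bytes per parameter), the parameter count quoted earlier in the thread lines up with the ~1.11 GB weight file:

```python
# 278,084,405 params, the count quoted earlier in the thread
params = 278_084_405

# fp32 stores 4 bytes per parameter; using decimal GB (1e9 bytes)
size_gb = params * 4 / 1e9
print(f"{size_gb:.2f} GB")  # 1.11 GB
```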

Let me know if this answers your question. You can learn more about model distillation on our blog: Model Distillation with Haystack

I see, thanks for the clarification!

Then I'd suggest you name it `deepset/xlm-roberta-large-squad2-distilled`, not base.

Because the way models are named on Hugging Face is different: e.g. `distilbert-base-uncased` is a distilled model of `bert-base-uncased`, not `bert-large-uncased`.

Thanks!