Discrepancy in Parameter Count: A Closer Look at the Model's Size and the Number of Layers
#3, opened by Karim-Gamal
How can you claim that your model has only 21M parameters? When I verified it, I found that it actually has over 117M parameters, and the number of layers is 199, not 12.
from transformers import AutoModel
model_name = "microsoft/Multilingual-MiniLM-L12-H384" # Total Trainable Parameters: 117653760 Not 21M
model = AutoModel.from_pretrained(model_name)
n_layers = len([f for f in model.parameters()])
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print("Number of Layers:", n_layers)
print("Total Trainable Parameters:", n_params)
# Number of Layers: 199 not 12
# Total Trainable Parameters: 117653760 not 21M
Karim-Gamal changed discussion title from "Discrepancy in Parameter Count: A Closer Look at the Model's Size" to "Discrepancy in Parameter Count: A Closer Look at the Model's Size and the Number of Layers"
@Karim-Gamal The 21M figure counts only the Transformer encoder's parameters; it does not include the word embeddings.
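For anyone reading this later, here is a minimal sketch of how the two figures relate. It assumes the checkpoint loads as a BERT-style model whose top-level submodules are named embeddings, encoder, and pooler (as in transformers' BertModel); the breakdown below is illustrative, not an official accounting.

from transformers import AutoModel

model = AutoModel.from_pretrained("microsoft/Multilingual-MiniLM-L12-H384")

# model.parameters() yields one entry per parameter tensor (weights, biases,
# LayerNorm terms), so counting its entries gives 199 tensors, not layers.
print("Encoder layers:", model.config.num_hidden_layers)  # 12

# Embedding tables (word/position/token-type embeddings plus their LayerNorm);
# the ~250k-token multilingual vocabulary makes this the bulk of the checkpoint.
embedding_params = sum(p.numel() for p in model.embeddings.parameters())

# The 12-layer Transformer encoder itself -- the ~21M figure quoted for the model.
encoder_params = sum(p.numel() for p in model.encoder.parameters())

total_params = sum(p.numel() for p in model.parameters())

print("Embedding parameters:", embedding_params)  # ~96M
print("Encoder parameters:", encoder_params)      # ~21M
print("Total parameters:", total_params)          # ~117M, as reported above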
unilm changed discussion status to closed
I got your point, thanks!