bourdoiscatie committed
Commit
e4d74d3
1 Parent(s): b3e1791

Update README.md

Update distillation link

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -126,8 +126,7 @@ datasets:
 
 ## Model Description
 
-This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found
-[here](https://github.com/huggingface/transformers/tree/master/examples/distillation). This model is cased: it does make a difference between english and English.
+This model is a distilled version of the [BERT base multilingual model](https://huggingface.co/bert-base-multilingual-cased/). The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is cased: it does make a difference between english and English.
 
 The model is trained on the concatenation of Wikipedia in 104 different languages listed [here](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages).
 The model has 6 layers, 768 dimension and 12 heads, totalizing 134M parameters (compared to 177M parameters for mBERT-base).
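For context on what the card describes: the stated shape (6 layers, hidden size 768, 12 heads, ~134M parameters) matches a distilled multilingual BERT checkpoint. A minimal sketch of loading and sanity-checking such a model with `transformers` follows; the model ID `distilbert-base-multilingual-cased` is an assumption, not confirmed by this diff, so substitute the ID of the repository this card belongs to.

```python
# Sketch: load a distilled multilingual BERT and check the claims in the card.
# The model ID below is an assumption; replace it with this repository's ID.
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "distilbert-base-multilingual-cased"  # assumed, not stated in the diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# The card says the model is cased, so "english" and "English" are treated differently.
print(tokenizer.tokenize("english"))
print(tokenizer.tokenize("English"))  # expect a different tokenization than above

# The card claims 134M parameters in total (vs. 177M for mBERT-base).
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```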