Dani committed
Commit ef9fd15
1 Parent(s): 255b995

fixed to use the MaskedLM version of model

Files changed (2)
  1. README.md +4 -8
  2. pytorch_model.bin +2 -2
README.md CHANGED
@@ -4,14 +4,15 @@ license: apache-2.0
 datasets:
 - wikipedia
 widget:
-- text: "El español es un idioma muy [MASK] en el mundo."
+- text: "El español es un idioma muy [MASK] en el mundo."
 ---
 
 # DistilBERT base multilingual model Spanish subset (cased)
 
-This model is the Spanish extract of `distilbert-base-multilingual-cased`, a distilled version of the [BERT base multilingual model](bert-base-multilingual-cased). It uses the extraction method proposed by Geotrend, which is described in https://github.com/Geotrend-research/smaller-transformers.
+This model is the Spanish extract of `distilbert-base-multilingual-cased` (https://huggingface.co/distilbert-base-multilingual-cased), a distilled version of the [BERT base multilingual model](bert-base-multilingual-cased). This model is cased: it does make a difference between english and English.
 
-In particular, we've ran the following script:
+It uses the extraction method proposed by Geotrend, which is described in https://github.com/Geotrend-research/smaller-transformers.
+Specifically, we've run the following script:
 
 ```sh
 python reduce_model.py \
@@ -24,8 +25,3 @@ python reduce_model.py \
 The resulting model has the same architecture as DistilmBERT: 6 layers, 768 dimension and 12 heads, with a total of **65M parameters** (compared to 134M parameters for DistilmBERT).
 
 The goal of this model is to reduce even further the size of the `distilbert-base-multilingual` multilingual model by selecting only the most frequent tokens for Spanish, reducing the size of the embedding layer. For more details, see the paper from the Geotrend team: Load What You Need: Smaller Versions of Multilingual BERT.
-
-
-
-
-
 
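The commit message above refers to switching the checkpoint to the masked-LM head, and the new widget entry in the README provides a ready-made fill-mask prompt. The sketch below shows how the two fit together; it assumes the `transformers` library, and the repository ID is a placeholder, since the model's actual Hub name is not visible in this diff.

```python
# Minimal fill-mask sketch for the MaskedLM checkpoint this commit introduces.
# NOTE: "<user>/distilbert-base-es-cased" is a placeholder repository ID, not
# the real name of this model on the Hugging Face Hub.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "<user>/distilbert-base-es-cased"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)  # masked-LM head, per this commit

# Run the widget example from the updated README.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for pred in fill_mask("El español es un idioma muy [MASK] en el mundo."):
    print(pred["token_str"], round(pred["score"], 3))
```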
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bd5f52e52f96ffb08ab544a267ebf536ae1a5a8eccba8e3d079d3a9ed9254265
-size 252661335
+oid sha256:0a7e9034002f6027c9c3e2644bf743b008fc7081072839124abd6673e6740c5c
+size 255139145
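The README's parameter count can be loosely cross-checked against the new `pytorch_model.bin` size recorded above. A rough sanity-check sketch, assuming the checkpoint stores float32 weights and that the file has been pulled through Git LFS (the diff only shows the LFS pointer):

```python
# Rough cross-check of the ~65M-parameter claim against the checkpoint size
# shown in this diff (255139145 bytes). Assumes float32 weights (4 bytes each);
# the on-disk file also carries tensor metadata, so the match is approximate.
import torch

state_dict = torch.load("pytorch_model.bin", map_location="cpu")  # local LFS-resolved file

n_params = sum(t.numel() for t in state_dict.values())
print(f"tensors in checkpoint:  {len(state_dict)}")
print(f"parameters:             {n_params / 1e6:.1f}M")  # README reports ~65M total
print(f"approx. float32 bytes:  {n_params * 4}")         # compare with size 255139145
```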