Dani committed
Commit ef9fd15
1 Parent(s): 255b995

fixed to use the MaskedLM version of model

Files changed (2)
  1. README.md +4 -8
  2. pytorch_model.bin +2 -2
README.md CHANGED
@@ -4,14 +4,15 @@ license: apache-2.0
 datasets:
 - wikipedia
 widget:
-- text: "El español es un idioma muy [MASK] en el mundo."
+- text: "El español es un idioma muy [MASK] en el mundo."
 ---
 
 # DistilBERT base multilingual model Spanish subset (cased)
 
-This model is the Spanish extract of `distilbert-base-multilingual-cased`, a distilled version of the [BERT base multilingual model](bert-base-multilingual-cased). It uses the extraction method proposed by Geotrend, which is described in https://github.com/Geotrend-research/smaller-transformers.
+This model is the Spanish extract of `distilbert-base-multilingual-cased` (https://huggingface.co/distilbert-base-multilingual-cased), a distilled version of the [BERT base multilingual model](bert-base-multilingual-cased). This model is cased: it does make a difference between english and English.
 
-In particular, we've ran the following script:
+It uses the extraction method proposed by Geotrend, which is described in https://github.com/Geotrend-research/smaller-transformers.
+Specifically, we've run the following script:
 
 ```sh
 python reduce_model.py \
@@ -24,8 +25,3 @@ python reduce_model.py \
 The resulting model has the same architecture as DistilmBERT: 6 layers, 768 dimension and 12 heads, with a total of **65M parameters** (compared to 134M parameters for DistilmBERT).
 
 The goal of this model is to reduce even further the size of the `distilbert-base-multilingual` multilingual model by selecting only the most frequent tokens for Spanish, reducing the size of the embedding layer. For more details, see the paper from the Geotrend team: Load What You Need: Smaller Versions of Multilingual BERT.
-
-
-
-
-
 
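The commit message above refers to switching the checkpoint to the masked-LM head, and the new widget entry in the README provides a ready-made fill-mask prompt. The sketch below shows how the two fit together; it assumes the `transformers` library, and the repository ID is a placeholder, since the model's actual Hub name is not visible in this diff.

```python
# Minimal fill-mask sketch for the MaskedLM checkpoint this commit introduces.
# NOTE: "<user>/distilbert-base-es-cased" is a placeholder repository ID, not
# the real name of this model on the Hugging Face Hub.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "<user>/distilbert-base-es-cased"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)  # masked-LM head, per this commit

# Run the widget example from the updated README.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for pred in fill_mask("El español es un idioma muy [MASK] en el mundo."):
    print(pred["token_str"], round(pred["score"], 3))
```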
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bd5f52e52f96ffb08ab544a267ebf536ae1a5a8eccba8e3d079d3a9ed9254265
-size 252661335
+oid sha256:0a7e9034002f6027c9c3e2644bf743b008fc7081072839124abd6673e6740c5c
+size 255139145
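The README's parameter count can be loosely cross-checked against the new `pytorch_model.bin` size recorded above. A rough sanity-check sketch, assuming the checkpoint stores float32 weights and that the file has been pulled through Git LFS (the diff only shows the LFS pointer):

```python
# Rough cross-check of the ~65M-parameter claim against the checkpoint size
# shown in this diff (255139145 bytes). Assumes float32 weights (4 bytes each);
# the on-disk file also carries tensor metadata, so the match is approximate.
import torch

state_dict = torch.load("pytorch_model.bin", map_location="cpu")  # local LFS-resolved file

n_params = sum(t.numel() for t in state_dict.values())
print(f"tensors in checkpoint:  {len(state_dict)}")
print(f"parameters:             {n_params / 1e6:.1f}M")  # README reports ~65M total
print(f"approx. float32 bytes:  {n_params * 4}")         # compare with size 255139145
```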