BartekK committed on
Commit
7276461
1 Parent(s): 8e2910c

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -7,7 +7,8 @@ tags:
  ## distilHerBERT
  distilHerBERT-base is a BERT-based Language Model trained on Polish subset of [cc100](https://huggingface.co/datasets/cc100) dataset using Masked Language Modelling (MLM) and [distillation procedure](https://arxiv.org/abs/1910.01108) from model [HerBERT](https://huggingface.co/allegro/herbert-base-cased) with dynamic masking of whole words.
  We provide one of the models (S4) described in the report from final project on the subject of (Deep) Natural Language Processing, which was carried out at MIMUW in 2021/2022: [Distillation_of_HerBERT](https://github.com/BartekKrzepkowski/DistilHerBERT-base_vol2/blob/master/report/Final_Report___Distillation_of_HerBERT.pdf).
- The model was trained using fp16 and the data parallelism method (ZeRO Stage 2), using the a deep learning optimization library - DeepSpeed.
+
+ The model was trained using fp16 and the data parallelism method (ZeRO Stage 2), using the deep learning optimization library - DeepSpeed.

  Model training and experiments were conducted with transformers in version 4.20.1.
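
The changed line mentions fp16 training with ZeRO Stage 2 under DeepSpeed. As a rough illustration only, a DeepSpeed configuration enabling those two settings might be built as below. The function name `build_deepspeed_config` and the batch size are assumptions made for this sketch; the README does not state the actual configuration used for training.

```python
import json


def build_deepspeed_config(micro_batch_size: int = 8) -> dict:
    """Return a minimal DeepSpeed config dict: fp16 plus ZeRO Stage 2.

    micro_batch_size is a placeholder assumption, not a value from the README.
    """
    return {
        # Per-GPU batch size for one forward/backward pass (assumed value).
        "train_micro_batch_size_per_gpu": micro_batch_size,
        # Mixed-precision training in half precision.
        "fp16": {"enabled": True},
        # ZeRO Stage 2 partitions optimizer states and gradients across
        # the data-parallel workers.
        "zero_optimization": {"stage": 2},
    }


if __name__ == "__main__":
    # Print the config as JSON so it can be saved to a file.
    print(json.dumps(build_deepspeed_config(), indent=2))
```

With transformers, a config like this is typically supplied through the `deepspeed` argument of `TrainingArguments`, either as a path to the saved JSON file or as a dict. A real run would likely need further settings (optimizer, scheduler, gradient accumulation) that are not covered here.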