Update README.md
README.md CHANGED
@@ -7,7 +7,8 @@ tags:
## distilHerBERT

distilHerBERT-base is a BERT-based language model trained on the Polish subset of the [cc100](https://huggingface.co/datasets/cc100) dataset using Masked Language Modelling (MLM) and the [distillation procedure](https://arxiv.org/abs/1910.01108) from the [HerBERT](https://huggingface.co/allegro/herbert-base-cased) model, with dynamic masking of whole words.

We provide one of the models (S4) described in the report from the final project on the subject of (Deep) Natural Language Processing, carried out at MIMUW in 2021/2022: [Distillation_of_HerBERT](https://github.com/BartekKrzepkowski/DistilHerBERT-base_vol2/blob/master/report/Final_Report___Distillation_of_HerBERT.pdf).

+The model was trained with fp16 and data parallelism (ZeRO Stage 2), using the DeepSpeed deep learning optimization library.
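The added line above describes the training setup only at a high level. As a rough illustration, the sketch below shows one way fp16 and ZeRO Stage 2 can be enabled through the transformers DeepSpeed integration; the config values and `output_dir` are placeholder assumptions, not the settings actually used for S4.

```python
from transformers import TrainingArguments

# Illustrative DeepSpeed config: fp16 plus ZeRO Stage 2 (optimizer state and
# gradient partitioning). "auto" lets the transformers integration fill in the
# batch-size fields from TrainingArguments. These values are assumptions, not
# the configuration used to train this model.
ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}

# Requires the deepspeed package; launch training with the deepspeed launcher.
training_args = TrainingArguments(
    output_dir="distilherbert-mlm",  # placeholder output directory
    fp16=True,
    deepspeed=ds_config,
)
```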
Model training and experiments were conducted with transformers version 4.20.1.
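Since the card pins transformers 4.20.1, a minimal usage sketch with that API may be helpful. The checkpoint identifier below is a placeholder assumption; substitute this model's actual Hub repo id or local path.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Placeholder identifier: replace with the Hub repo id or local path of this checkpoint.
model_id = "path/to/distilHerBERT-base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask example in Polish; the tokenizer's own mask token is used so the
# snippet does not hard-code a specific special token.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask(f"Stolicą Polski jest {tokenizer.mask_token}."))
```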