Tags: Token Classification · Transformers · PyTorch · Bulgarian · bert · torch
rmihaylov committed
Commit
e85ab91
1 Parent(s): eb118eb

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -16,10 +16,12 @@ tags:
 Pretrained model on Bulgarian language using a masked language modeling (MLM) objective. It was introduced in
 [this paper](https://arxiv.org/abs/1810.04805) and first released in
 [this repository](https://github.com/google-research/bert). This model is cased: it does make a difference
-between bulgarian and Bulgarian.
+between bulgarian and Bulgarian. The training data is Bulgarian text from [OSCAR](https://oscar-corpus.com/post/oscar-2019/), [Chitanka](https://chitanka.info/) and [Wikipedia](https://bg.wikipedia.org/).
 
 It was finetuned on public part-of-speech Bulgarian data.
 
+Then, it was compressed via [progressive module replacing](https://arxiv.org/abs/2002.02925).
+
 ### How to use
 
 Here is how to use this model in PyTorch:
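The hunk ends at the line promising a PyTorch usage snippet, which is not shown in this diff. As a hedged sketch of what token-classification (part-of-speech) inference typically looks like downstream of such a model, here is the argmax decoding step with dummy logits standing in for real model outputs. The label set, example tokens, and the commented-out loading lines are all hypothetical; the real labels come from the model's `config.id2label`, and the actual model identifier is not given here.

```python
import torch

# Hypothetical loading step (model ID is a placeholder, not from the diff):
# from transformers import AutoTokenizer, AutoModelForTokenClassification
# tokenizer = AutoTokenizer.from_pretrained("<model-id>")
# model = AutoModelForTokenClassification.from_pretrained("<model-id>")

# Hypothetical POS label set; a real model defines this in config.id2label.
id2label = {0: "NOUN", 1: "VERB", 2: "ADJ", 3: "PUNCT"}

def decode_predictions(logits: torch.Tensor, tokens: list) -> list:
    """Map per-token logits to POS labels by taking the argmax class."""
    pred_ids = logits.argmax(dim=-1)
    return [(tok, id2label[int(i)]) for tok, i in zip(tokens, pred_ids)]

# Dummy logits (3 tokens x 4 labels) standing in for model(**inputs).logits.
logits = torch.tensor([[9.0, 0.1, 0.2, 0.3],
                       [0.1, 8.0, 0.2, 0.3],
                       [0.1, 0.2, 0.3, 7.0]])
tokens = ["котка", "спи", "."]
print(decode_predictions(logits, tokens))
```

With a real checkpoint, `logits` would come from a forward pass over tokenized input, and subword pieces would usually be merged back to whole words before reporting labels.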