About RobBERTje
RobBERTje is a collection of distilled models based on RobBERT. The models differ in size and training settings, so you can pick the one that best fits your use case.
We are also continuously working on releasing better-performing models, so watch the repository for updates.
News
- February 21, 2022: Our paper about RobBERTje has been published in volume 11 of the CLIN journal!
- July 2, 2021: Publicly released 4 RobBERTje models.
- May 12, 2021: RobBERTje was accepted at CLIN31 for an oral presentation!
The models
Model | Description | Parameters | Training size | Huggingface id |
---|---|---|---|---|
Non-shuffled | Trained on the non-shuffled variant of the OSCAR corpus, without any operations to preserve this order during training and distillation. | 74 M | 1 GB | DTAI-KULeuven/robbertje-1-gb-non-shuffled |
Shuffled | Trained on the publicly available and shuffled OSCAR corpus. | 74 M | 1 GB | this model |
Merged (p=0.5) | Same as the non-shuffled variant, but sequential sentences of the same document are merged with a probability of 50%. | 74 M | 1 GB | DTAI-KULeuven/robbertje-1-gb-merged |
BORT | A smaller version with 8 attention heads instead of 12 and 4 layers instead of the 6 used by the other RobBERTje models (RobBERT itself has 12). | 46 M | 1 GB | DTAI-KULeuven/robbertje-1-gb-bort |
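As a minimal sketch of how one of the models in the table above might be tried out, the snippet below builds a fill-mask pipeline with the Hugging Face `transformers` library. The model ids are copied from the table; the helper names `build_fill_mask` and `top_predictions` and the Dutch example sentence are illustrative assumptions, not part of the RobBERTje release.

```python
# Hugging Face ids copied from the table above.
ROBBERTJE_MODEL_IDS = {
    "non-shuffled": "DTAI-KULeuven/robbertje-1-gb-non-shuffled",
    "merged": "DTAI-KULeuven/robbertje-1-gb-merged",
    "bort": "DTAI-KULeuven/robbertje-1-gb-bort",
}


def build_fill_mask(variant: str):
    """Return a fill-mask pipeline for one variant (hypothetical helper).

    Calling this downloads the model weights from the Hugging Face Hub.
    """
    # Imported lazily so the module can be imported without transformers.
    from transformers import pipeline

    return pipeline("fill-mask", model=ROBBERTJE_MODEL_IDS[variant])


def top_predictions(fill_mask, template: str, k: int = 3):
    """Fill the single `{mask}` slot in `template`; return the k best tokens."""
    # Ask the tokenizer for its mask token instead of hard-coding it.
    text = template.format(mask=fill_mask.tokenizer.mask_token)
    return [candidate["token_str"].strip() for candidate in fill_mask(text, top_k=k)]


# Example usage (requires network access and the transformers package):
#   fill_mask = build_fill_mask("non-shuffled")
#   print(top_predictions(fill_mask, "Er staat een {mask} in mijn tuin."))
```

Reading the mask token from the tokenizer keeps the sketch independent of the exact token string a given checkpoint uses.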
Results
Intrinsic results
We calculated the pseudo-perplexity (PPPL) from cite, which is a built-in metric in our distillation library. This metric indicates how well the model captures the input distribution; lower values are better.
Model | PPPL |
---|---|
RobBERT (teacher) | 7.76 |
Non-shuffled | 12.95 |
Shuffled | 18.74 |
Merged (p=0.5) | 17.10 |
BORT | 26.44 |
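The PPPL values above can be read with the following sketch in mind. It assumes the commonly used definition of pseudo-perplexity for masked language models: mask each token in turn, sum the log-probabilities the model assigns to the original tokens, and exponentiate the negative per-token average. The distillation library's own implementation may differ in details such as tokenization; `masked_log_prob` and `uniform_scorer` are hypothetical stand-ins for a real model.

```python
import math
from typing import Callable, Sequence

# Hypothetical scorer: given a tokenized sentence and a position, return
# log P(token at that position | all other tokens), in nats.
MaskedLogProb = Callable[[Sequence[str], int], float]


def pseudo_log_likelihood(tokens: Sequence[str], masked_log_prob: MaskedLogProb) -> float:
    """Sum the masked log-probabilities of every token in one sentence."""
    return sum(masked_log_prob(tokens, i) for i in range(len(tokens)))


def pseudo_perplexity(corpus: Sequence[Sequence[str]], masked_log_prob: MaskedLogProb) -> float:
    """exp(-(1/N) * sum of sentence PLLs), with N the total token count."""
    total_tokens = sum(len(sentence) for sentence in corpus)
    if total_tokens == 0:
        raise ValueError("corpus contains no tokens")
    total_pll = sum(pseudo_log_likelihood(s, masked_log_prob) for s in corpus)
    return math.exp(-total_pll / total_tokens)


def uniform_scorer(vocab_size: int) -> MaskedLogProb:
    """Toy scorer that assigns every token probability 1 / vocab_size."""
    def score(tokens: Sequence[str], position: int) -> float:
        return -math.log(vocab_size)
    return score
```

A model that guesses uniformly over a vocabulary of size V has a PPPL of V under this definition, which gives a sense of scale for the values in the table.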
Extrinsic results
We also evaluated our models on several downstream tasks, just like the teacher model RobBERT. Since that evaluation, a Dutch NLI task named SICK-NL has also been released, so we evaluated our models on it as well.
Model | DBRD | DIE-DAT | NER | POS | SICK-NL |
---|---|---|---|---|---|
RobBERT (teacher) | 94.4 | 99.2 | 89.1 | 96.4 | 84.2 |
Non-shuffled | 90.2 | 98.4 | 82.9 | 95.5 | 83.4 |
Shuffled | 92.5 | 98.2 | 82.7 | 95.6 | 83.4 |
Merged (p=0.5) | 92.9 | 96.5 | 81.8 | 95.2 | 82.8 |
BORT | 89.6 | 92.2 | 79.7 | 94.3 | 81.0 |