update readme
README.md CHANGED
@@ -30,9 +30,9 @@ pipeline_tag: fill-mask
 
 This model is a distilled version of [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2). It follows the same training procedure as [DistilBERT](https://arxiv.org/abs/1910.01108), using the Knowledge Distillation implementation from the paper's [official repository](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation).
 
-The resulting architecture consists of 6 layers, 768 dimensional embeddings and 12 attention heads. This adds up to a total of 82M parameters, which is considerably less than the 125M of standard RoBERTa-base models. This makes the model lighter and faster than the original, at the cost of
+The resulting architecture consists of 6 layers, 768-dimensional embeddings, and 12 attention heads, for a total of 82M parameters, considerably fewer than the 125M of a standard RoBERTa-base model. This makes the model lighter and faster than the original, at the cost of slightly lower performance.
 
-We encourage users of this model to check out the [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) model card to learn more details about the teacher model
+We encourage users of this model to check out the [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) model card to learn more about the teacher model.
 
 ## Intended uses and limitations
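The size claim in the hunk above is easy to sanity-check. Here is a minimal sketch, assuming network access to the Hugging Face Hub and the `projecte-aina/distilroberta-base-ca-v2` checkpoint id that appears in the usage example below; it reads the student's config and counts its parameters:

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Checkpoint id taken from the usage example below; adjust if the repo differs.
model_id = "projecte-aina/distilroberta-base-ca-v2"

# The config alone exposes the distilled architecture.
config = AutoConfig.from_pretrained(model_id)
print(config.num_hidden_layers)    # expected: 6
print(config.hidden_size)          # expected: 768
print(config.num_attention_heads)  # expected: 12

# Loading the weights lets us verify the ~82M total.
model = AutoModelForMaskedLM.from_pretrained(model_id)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```

The gap between 125M and 82M corresponds to the six dropped encoder layers (roughly 7M parameters each); the embedding matrix, which both models share in size, is why halving the depth removes only about a third of the parameters.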
@@ -44,7 +44,7 @@ Usage example where the model is passed to a fill-mask pipeline to predict the masked token
 ```python
 from pprint import pprint
 from transformers import pipeline
-pipe = pipeline("fill-mask", model="projecte-aina/distilroberta-base-ca")
+pipe = pipeline("fill-mask", model="projecte-aina/distilroberta-base-ca-v2")
 text = "El <mask> és el meu dia preferit de la setmana."
 pprint(pipe(text))
 ```
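A small usage note on the snippet above, not part of the original card: the fill-mask pipeline returns a list of candidate completions, each a dict with `score`, `token`, `token_str`, and `sequence` keys, and the call accepts a `top_k` argument to cap how many candidates come back. A sketch:

```python
from transformers import pipeline

pipe = pipeline("fill-mask", model="projecte-aina/distilroberta-base-ca-v2")

# "The <mask> is my favourite day of the week."
text = "El <mask> és el meu dia preferit de la setmana."

# Keep only the three highest-scoring candidates.
for pred in pipe(text, top_k=3):
    print(f'{pred["token_str"]!r}: {pred["score"]:.3f}')
```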
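Finally, since the card points at the DistilBERT training procedure without restating it, a compressed sketch of the objective may help readers: the student minimizes a weighted sum of the usual masked-language-modelling cross-entropy and a temperature-softened KL divergence to the teacher's logits. The weights and temperature below are illustrative placeholders, not the values used for this model, and the reference implementation additionally uses a cosine loss on hidden states, omitted here:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha_ce=0.5, alpha_mlm=0.5):
    """DistilBERT-style objective (simplified): soft-target KL + hard MLM loss.

    Hyperparameters are illustrative, not those used for this model; the
    reference implementation also adds a cosine loss on hidden states.
    """
    # Soft targets: match the teacher's temperature-softened distribution.
    # F.kl_div expects log-probabilities as input and probabilities as target.
    ce = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # rescale gradients, as in the Hinton et al. recipe

    # Hard targets: standard MLM cross-entropy on the masked positions
    # (non-masked positions carry the ignore_index label -100).
    mlm = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
        ignore_index=-100,
    )
    return alpha_ce * ce + alpha_mlm * mlm
```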