mapama247 committed
Commit beeb241 (parent: b1b3bdc)

update readme

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -30,9 +30,9 @@ pipeline_tag: fill-mask
 
 This model is a distilled version of [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2). It follows the same training procedure as [DistilBERT](https://arxiv.org/abs/1910.01108), using the implementation of Knowledge Distillation from the paper's [official repository](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation).
 
-The resulting architecture consists of 6 layers, 768 dimensional embeddings and 12 attention heads. This adds up to a total of 82M parameters, which is considerably less than the 125M of standard RoBERTa-base models. This makes the model lighter and faster than the original, at the cost of a slightly lower performance.
+The resulting architecture consists of 6 layers, 768 dimensional embeddings and 12 attention heads. This adds up to a total of 82M parameters, which is considerably less than the 125M of standard RoBERTa-base models. This makes the model lighter and faster than the original, at the cost of slightly lower performance.
 
-We encourage users of this model to check out the [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) model card to learn more details about the teacher model, as well as the training and evaluation data.
+We encourage users of this model to check out the [projecte-aina/roberta-base-ca-v2](https://huggingface.co/projecte-aina/roberta-base-ca-v2) model card to learn more details about the teacher model.
 
 ## Intended uses and limitations
 
@@ -44,7 +44,7 @@ Usage example where the model is passed to a fill-mask pipeline to predict the m
 ```python
 from pprint import pprint
 from transformers import pipeline
-pipe = pipeline("fill-mask", model="projecte-aina/distilroberta-base-ca")
+pipe = pipeline("fill-mask", model="projecte-aina/distilroberta-base-ca-v2")
 text = "El <mask> és el meu dia preferit de la setmana."
 pprint(pipe(text))
 ```
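
A quick way to sanity-check the architecture and parameter count stated in the updated card is to load the distilled checkpoint and inspect its config. This is a minimal sketch, assuming the model is available on the Hub under the `projecte-aina/distilroberta-base-ca-v2` id used in the updated pipeline call; the exact total can vary slightly depending on which task head is loaded.

```python
from transformers import AutoModelForMaskedLM

# Model id taken from the updated pipeline call in the diff above (assumed to be the published id).
model = AutoModelForMaskedLM.from_pretrained("projecte-aina/distilroberta-base-ca-v2")

# Architecture described in the card: 6 layers, 768-dimensional embeddings, 12 attention heads.
cfg = model.config
print(cfg.num_hidden_layers, cfg.hidden_size, cfg.num_attention_heads)

# Rough parameter count; should land around 82M for the 6-layer student,
# versus roughly 125M for the RoBERTa-base teacher.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```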