mrm8488 committed
Commit 26b62c7 (parent: 8f39606)

Update README.md

Files changed (1)
  1. README.md +6 -7
README.md CHANGED
@@ -45,18 +45,17 @@ It achieves the following results on the evaluation set:
 - Loss: 0.1116
 - Accuracy: 0.9823
 
-## Model description
+## Base Model description
 
-More information needed
+This model is a distilled version of the [RoBERTa-base model](https://huggingface.co/roberta-base). It follows the same training procedure as [DistilBERT](https://huggingface.co/distilbert-base-uncased).
+The code for the distillation process can be found [here](https://github.com/huggingface/transformers/tree/master/examples/distillation).
+This model is case-sensitive: it makes a difference between english and English.
 
-## Intended uses & limitations
-
-More information needed
+The model has 6 layers, a hidden size of 768, and 12 attention heads, totaling 82M parameters (compared to 125M for RoBERTa-base).
+On average, DistilRoBERTa is twice as fast as RoBERTa-base.
 
 ## Training and evaluation data
 
-More information needed
-
 ## Training procedure
 
 ### Training hyperparameters
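
Two claims in the newly added description are easy to sanity-check: the parameter counts (82M vs. 125M) and the case sensitivity. A minimal Python sketch, not part of this commit, assuming the `transformers` library and the public `distilroberta-base` and `roberta-base` checkpoints linked in the description:

```python
# Sanity checks for the figures quoted in the new "Base Model description".
# Assumes: pip install transformers torch
from transformers import AutoModel, AutoTokenizer

# Parameter counts: the distilled model (~82M) vs. the teacher (~125M).
for name in ["distilroberta-base", "roberta-base"]:
    model = AutoModel.from_pretrained(name)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.0f}M parameters")

# Case sensitivity: "english" and "English" tokenize to different ids.
tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")
print(tokenizer.encode("english"))
print(tokenizer.encode("English"))
```

The distilled encoder keeps the same hidden size and head count but halves the depth (6 layers vs. 12), which is where both the parameter savings and the roughly 2x speedup come from.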