zemerov committed on
Commit 7f770c3
1 Parent(s): f91371a

Update README.md

  1. README.md +12 -11
README.md CHANGED
@@ -65,17 +65,18 @@ Model was trained using 8xA100 for ~22 days.
  Standard RoBERTa-base parameters:

- | Argument | Value |
- |-------------------------|-------|
- |Activation function | gelu |
- |Attention dropout | 0.1 |
- |Dropout | 0.1 |
- |Encoder attention heads | 12 |
- |Encoder embed dim | 768 |
- |Encoder ffn embed dim | 3,072 |
- |Encoder layers | 12 |
- |Max positions | 512 |
- |Vocab size | 50266 |
+ | Argument | Value |
+ |-------------------------|----------------|
+ |Activation function | gelu |
+ |Attention dropout | 0.1 |
+ |Dropout | 0.1 |
+ |Encoder attention heads | 12 |
+ |Encoder embed dim | 768 |
+ |Encoder ffn embed dim | 3,072 |
+ |Encoder layers | 12 |
+ |Max positions | 512 |
+ |Vocab size | 50266 |
+ |Tokenizer type | Byte-level BPE |

  ## Evaluation
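
The parameter table in this change can be sanity-checked with a small, standard-library-only sketch. The `EncoderParams` class and its field names are hypothetical, chosen here to mirror the table rows; they are not taken from the model's training code or from any library API. The sketch only derives two quantities that follow directly from the listed values: the per-head dimension and the size of the token embedding matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderParams:
    """Hypothetical container mirroring the README table (illustration only)."""

    activation_fn: str = "gelu"
    attention_dropout: float = 0.1
    dropout: float = 0.1
    encoder_attention_heads: int = 12
    encoder_embed_dim: int = 768
    encoder_ffn_embed_dim: int = 3072
    encoder_layers: int = 12
    max_positions: int = 512
    vocab_size: int = 50266
    tokenizer_type: str = "Byte-level BPE"

    @property
    def head_dim(self) -> int:
        # Each attention head operates on an equal slice of the embedding.
        if self.encoder_embed_dim % self.encoder_attention_heads != 0:
            raise ValueError("embed dim must be divisible by the number of heads")
        return self.encoder_embed_dim // self.encoder_attention_heads

    @property
    def token_embedding_params(self) -> int:
        # Size of the token embedding matrix alone: vocab_size x embed_dim.
        return self.vocab_size * self.encoder_embed_dim


params = EncoderParams()
print(params.head_dim)
print(params.token_embedding_params)
```

Under these values the embedding dimension divides evenly across the 12 heads, and the feed-forward width is four times the embedding width, which matches the usual base-size encoder layout.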