# random-roberta-tiny

We introduce random-roberta-tiny, an unpretrained version of a mini RoBERTa model (2 layers, 128 attention heads). Its weights are randomly initialized, which is particularly useful when you want to train a language model from scratch or benchmark the effect of pretraining.

Note that the tokenizer of random-roberta-tiny is the same as that of roberta-base: obtaining a random tokenizer is not a trivial task, and it is less meaningful than randomizing the weights.

A debatable advantage of pulling random-roberta-tiny from the Hugging Face Hub is that you avoid having to manage a random seed to reproduce the same initialization every time.
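
As a rough sketch of that trade-off, compare re-creating the random weights locally (seed-dependent) with downloading the already-initialized checkpoint; the hub id below is a hypothetical placeholder, since this card does not state the full repository path:

```python
import torch
from transformers import AutoModel, RobertaConfig, RobertaModel

# Option 1: rebuild the random weights locally; getting the same weights twice
# requires fixing the seed.
torch.manual_seed(0)
local_model = RobertaModel(RobertaConfig(num_attention_heads=128, num_hidden_layers=2))

# Option 2: pull the already-initialized weights from the Hub; no seed management needed.
# "<namespace>/random-roberta-tiny" is a hypothetical hub id, not stated in this card.
hub_model = AutoModel.from_pretrained("<namespace>/random-roberta-tiny")
```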

The code used to obtain this random model:

```python
from transformers import AutoTokenizer, RobertaConfig, RobertaModel


def get_custom_blank_roberta(h=768, l=12):
    # Initializing a RoBERTa configuration with the requested number of
    # attention heads (h) and hidden layers (l)
    configuration = RobertaConfig(num_attention_heads=h, num_hidden_layers=l)

    # Initializing a model with random weights from the configuration
    model = RobertaModel(configuration)

    return model


rank = "tiny"
h = 128
l = 2
model_type = "roberta"
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model_name = "random-" + model_type + "-" + rank
model = get_custom_blank_roberta(h, l)
```
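
After building the model, one might save it together with the roberta-base tokenizer so the result can be uploaded as a Hub repository. This is a minimal sketch, not part of the original snippet; it simply reuses `model_name` from above as the output directory:

```python
# Save the randomly initialized model and the roberta-base tokenizer to a local
# directory named after the model (here: "random-roberta-tiny").
model.save_pretrained(model_name)
tokenizer.save_pretrained(model_name)
```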