# random-albert-base-v2

We introduce random-albert-base-v2, an unpretrained version of the ALBERT model. Its weights are randomly initialized, which is particularly useful when training a language model from scratch or benchmarking the effect of pretraining.

Note that the tokenizer of random-albert-base-v2 is identical to that of albert-base-v2: producing a random tokenizer is not a trivial task, and it would be less meaningful than randomizing the weights.

A debatable advantage of pulling random-albert-base-v2 from Hugging Face is that you get the same "randomness" every time, without having to manage a random seed yourself.
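The seed-based alternative mentioned above can be sketched as follows (a minimal illustration using plain `torch`, not part of this repository; `seeded_random_tensor` is a hypothetical helper):

```python
import torch

def seeded_random_tensor(shape, seed=0):
    # Fixing the generator seed yields identical "random" values on every
    # run -- the do-it-yourself alternative to downloading a hosted
    # randomly-initialized model.
    gen = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=gen)

a = seeded_random_tensor((3, 3))
b = seeded_random_tensor((3, 3))
print(torch.equal(a, b))  # prints True: same seed, identical tensors
```

Hosting the random weights sidesteps this bookkeeping entirely, at the cost of a download.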
The code used to obtain such a random model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def get_blank_model_from_hf(model_name="bert-base-cased"):
    # Load a pretrained checkpoint, then re-initialize the base model's
    # weights randomly while keeping the pretrained tokenizer.
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=5)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model.base_model.init_weights()
    model_name = "random-" + model_name
    base_model = model.base_model
    return base_model, tokenizer, model_name
```
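If you do not need to start from a pretrained checkpoint at all, a randomly initialized ALBERT can also be built directly from a config, with no download. A minimal sketch (the tiny dimensions below are illustrative, not albert-base-v2's actual configuration):

```python
from transformers import AlbertConfig, AlbertModel

# Build a tiny ALBERT with random weights directly from a config --
# no pretrained checkpoint is fetched from the Hub.
config = AlbertConfig(
    vocab_size=100,
    embedding_size=16,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = AlbertModel(config)  # weights are randomly initialized
```

This gives random weights, but unlike random-albert-base-v2 the initialization differs on every run unless you also fix a seed.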