readme updated
README.md
CHANGED
@@ -18,6 +18,23 @@ Model architecture is similar to bert-medium (8 layers, 8 heads, and 512 hidden
 The details can be found at this paper:
 https://arxiv.org/...
 
+The following code can be used for model loading and tokenization, example max length (514) can be changed:
+```
+model = AutoModel.from_pretrained([model_path])
+#for sequence classification:
+#model = AutoModelForSequenceClassification.from_pretrained([model_path], num_labels=[num_classes])
+
+tokenizer = PreTrainedTokenizerFast(tokenizer_file=[file_path])
+tokenizer.mask_token = "[MASK]"
+tokenizer.cls_token = "[CLS]"
+tokenizer.sep_token = "[SEP]"
+tokenizer.pad_token = "[PAD]"
+tokenizer.unk_token = "[UNK]"
+tokenizer.bos_token = "[CLS]"
+tokenizer.eos_token = "[SEP]"
+tokenizer.model_max_length = 514
+```
+
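The snippet added in this commit uses bracketed placeholders and omits its imports. A minimal runnable sketch of the same steps, assuming the `transformers` library is installed, might look like the following. The function name `load_model_and_tokenizer`, the `SPECIAL_TOKENS` dict, and the paths in the usage comment are illustrative and not part of the README. Passing the special tokens to the constructor is a swapped-in alternative to assigning them one attribute at a time.

```python
# Special tokens copied from the README snippet, collected in one dict so
# they can be passed to the tokenizer constructor as keyword arguments.
SPECIAL_TOKENS = {
    "mask_token": "[MASK]",
    "cls_token": "[CLS]",
    "sep_token": "[SEP]",
    "pad_token": "[PAD]",
    "unk_token": "[UNK]",
    "bos_token": "[CLS]",  # the README reuses [CLS] as the BOS token
    "eos_token": "[SEP]",  # and [SEP] as the EOS token
}


def load_model_and_tokenizer(model_path, tokenizer_file, max_length=514, num_labels=None):
    """Load the model and its fast tokenizer (hypothetical helper).

    If num_labels is given, a sequence-classification head is loaded
    instead of the bare encoder.
    """
    # Imported here so this module can be imported without transformers.
    from transformers import (
        AutoModel,
        AutoModelForSequenceClassification,
        PreTrainedTokenizerFast,
    )

    if num_labels is None:
        model = AutoModel.from_pretrained(model_path)
    else:
        model = AutoModelForSequenceClassification.from_pretrained(
            model_path, num_labels=num_labels
        )

    tokenizer = PreTrainedTokenizerFast(
        tokenizer_file=str(tokenizer_file),
        model_max_length=max_length,  # 514 in the README; adjust as needed
        **SPECIAL_TOKENS,
    )
    return model, tokenizer


# Usage (placeholder paths, replace with your own):
# model, tokenizer = load_model_and_tokenizer("path/to/model", "path/to/tokenizer.json")
# batch = tokenizer(["example text"], truncation=True, padding=True, return_tensors="pt")
```

Setting `truncation=True` at call time is what makes the tokenizer enforce `model_max_length`; without it, longer inputs only trigger a warning.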
 ### BibTeX entry and citation info
 ```bibtex
 @article{}