wyu97 committed
Commit • 968c177
1 Parent(s): 2c6d949
upload model

Browse files:
- README.md +35 -0
- config.json +24 -0
- special_tokens_map.json +1 -0
- tokenizer_config.json +1 -0
- training_args.bin +0 -0
- vocab.txt +0 -0
README.md
CHANGED
@@ -1,3 +1,38 @@
 ---
 license: cc-by-4.0
 ---
+
+# DictBERT model (uncased)
+
+This is the model checkpoint of our [ACL 2022](https://www.2022.aclweb.org/) paper "*Dict-BERT: Enhancing Language Model Pre-training with Dictionary*" [\[PDF\]](https://aclanthology.org/2022.findings-acl.150/).
+
+In this paper, we propose DictBERT, a novel pre-trained language model that leverages rare-word definitions from English dictionaries (e.g., Wiktionary). DictBERT is based on the BERT architecture and is trained under the same setting as BERT. Please refer to our paper for more details.
+
+## Evaluation results
+
+When fine-tuned on downstream tasks, DictBERT achieves the following results.
+
+CoLA is evaluated with Matthews correlation and STS-B with Pearson correlation; all other tasks are evaluated with accuracy.
+
+| Model     | MNLI  | QNLI  | QQP   | SST-2 | CoLA  | MRPC  | RTE   | STS-B | Average |
+|:---------:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-------:|
+| BERT (HF) | 84.12 | 90.69 | 90.75 | 92.52 | 58.89 | 86.17 | 68.67 | 89.39 | 82.65   |
+| DictBERT  | 84.36 | 91.02 | 90.78 | 92.43 | 61.81 | 87.25 | 72.90 | 89.40 | 83.74   |
+
+HF: the Hugging Face checkpoint for BERT-base uncased.
+
+### BibTeX entry and citation info
+
+```bibtex
+@inproceedings{yu2022dict,
+  title={Dict-BERT: Enhancing Language Model Pre-training with Dictionary},
+  author={Yu, Wenhao and Zhu, Chenguang and Fang, Yuwei and Yu, Donghan and Wang, Shuohang and Xu, Yichong and Zeng, Michael and Jiang, Meng},
+  booktitle={Findings of the Association for Computational Linguistics: ACL 2022},
+  pages={1907--1918},
+  year={2022}
+}
+```
+
+<a href="https://huggingface.co/exbert/?model=bert-base-uncased">
+  <img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
+</a>
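The README stops short of a usage snippet. Since config.json below declares the `BertForMaskedLM` architecture, the checkpoint should load with the standard BERT classes. A minimal sketch, assuming a hypothetical repo id based on the committer's namespace (substitute the actual path of this checkpoint):

```python
# Minimal usage sketch for this checkpoint. The repo id is an assumption
# based on the committer's namespace; substitute the actual model path.
from transformers import BertForMaskedLM, BertTokenizerFast

model_id = "wyu97/DictBERT"  # hypothetical repo id
tokenizer = BertTokenizerFast.from_pretrained(model_id)
model = BertForMaskedLM.from_pretrained(model_id)

# Fill in a masked token, the pre-training task this checkpoint was trained on.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
logits = model(**inputs).logits

# Locate the [MASK] position and decode the top prediction.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0]
predicted_id = logits[0, mask_pos].argmax(-1).item()
print(tokenizer.decode([predicted_id]))
```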
config.json
ADDED
@@ -0,0 +1,24 @@
+{
+  "_name_or_path": "dictbert-base-uncased",
+  "architectures": [
+    "BertForMaskedLM"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "gradient_checkpointing": false,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 0,
+  "position_embedding_type": "absolute",
+  "transformers_version": "4.7.0.dev0",
+  "type_vocab_size": 2,
+  "use_cache": true,
+  "vocab_size": 30522
+}
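These hyperparameters are identical to BERT-base uncased (12 layers, 12 heads, hidden size 768, 30,522-token WordPiece vocabulary), consistent with the README's claim that DictBERT is trained under the same setting as BERT. A quick sanity-check sketch, assuming the file has been downloaded locally:

```python
from transformers import BertConfig

# Load the uploaded config and confirm it matches the BERT-base architecture.
config = BertConfig.from_json_file("config.json")
assert config.num_hidden_layers == 12
assert config.num_attention_heads == 12
assert config.hidden_size == 768
print(config.architectures)  # ['BertForMaskedLM']
```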
special_tokens_map.json
ADDED
@@ -0,0 +1 @@
+{"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer_config.json
ADDED
@@ -0,0 +1 @@
+{"do_lower_case": true, "do_basic_tokenize": true, "never_split": null, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "use_fast": true, "model_max_length": 512, "special_tokens_map_file": null, "name_or_path": "bert-base-uncased"}
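Both tokenizer files mirror the stock `bert-base-uncased` tokenizer (note `name_or_path` above), so the vocabulary appears unchanged and the dictionary-definition handling happens during pre-training rather than in the tokenizer. A minimal sketch of building the tokenizer from the uploaded files, assuming paths relative to the repo root:

```python
from transformers import BertTokenizerFast

# Instantiate directly from the uploaded vocabulary; settings mirror
# tokenizer_config.json (uncased, basic tokenization enabled).
tokenizer = BertTokenizerFast(vocab_file="vocab.txt", do_lower_case=True)
print(tokenizer.tokenize("Wiktionary definitions help with rare words."))
```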
training_args.bin
ADDED
Binary file (2.61 kB).
vocab.txt
ADDED
The diff for this file is too large to render.