---
license: cc-by-4.0
---

# DictBERT model (uncased)

This is the model checkpoint of our [ACL 2022](https://www.2022.aclweb.org/) paper "*Dict-BERT: Enhancing Language Model Pre-training with Dictionary*" [\[PDF\]](https://aclanthology.org/2022.findings-acl.150/).
In this paper, we propose DictBERT, a novel pre-trained language model that leverages rare word definitions from English dictionaries (e.g., Wiktionary). DictBERT is based on the BERT architecture and is trained under the same settings as BERT. Please refer to our paper for more details.

See the code for fine-tuning the model at https://github.com/wyu97/DictBERT.
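
As a minimal usage sketch, the checkpoint can be loaded with the `transformers` library. The repository id `wyu97/DictBERT` below is an assumption based on the GitHub username; replace it with the actual model id of this card.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Hypothetical repository id; substitute the actual id of this model card.
model_id = "wyu97/DictBERT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# DictBERT follows the BERT-base uncased architecture, so the standard
# fill-mask pipeline works without any extra configuration.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask("The capital of France is [MASK]."))
```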

## Evaluation results

We report the performance of fine-tuning BERT and DictBERT on the GLUE benchmark tasks. CoLA is evaluated with Matthews correlation, STS-B with Pearson correlation, and the other tasks with accuracy (a short sketch of these metrics follows the table). The models achieve the following results:


| Model | MNLI | QNLI | QQP | SST-2 | CoLA | MRPC | RTE | STS-B | Average |
|:----:|:-----------:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|:-------:|
| BERT(HF) | 84.12 | 90.69 | 90.75 | 92.52 | 58.89 | 86.17 | 68.67 | 89.39 | 82.65 |
| DictBERT | 84.36 | 91.02 | 90.78 | 92.43 | 61.81 | 87.25 | 72.90 | 89.40 | 83.74 |

HF: the Hugging Face checkpoint for BERT-base uncased
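
As an illustration of the metric conventions above, the sketch below shows how the per-task scores and the unweighted average might be computed; the task names and helper function are illustrative, not part of the official evaluation code.

```python
from sklearn.metrics import accuracy_score, matthews_corrcoef
from scipy.stats import pearsonr

def glue_metric(task, predictions, references):
    """Metric used in the tables: Matthews correlation for CoLA,
    Pearson correlation for STS-B, accuracy for the other tasks."""
    if task == "cola":
        return matthews_corrcoef(references, predictions)
    if task == "stsb":
        return pearsonr(references, predictions)[0]
    return accuracy_score(references, predictions)

# The "Average" column is the unweighted mean of the eight per-task scores,
# e.g. for the DictBERT row:
dictbert_scores = [84.36, 91.02, 90.78, 92.43, 61.81, 87.25, 72.90, 89.40]
print(round(sum(dictbert_scores) / len(dictbert_scores), 2))  # 83.74
```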

If no dictionary is provided during fine-tuning (i.e., DictBERT is fine-tuned in exactly the same way as BERT), it still achieves better performance than BERT, as shown below and in the fine-tuning sketch after the table.

| Setting | MNLI | QNLI | QQP | SST-2 | CoLA | MRPC | RTE | STS-B | Average |
|:----:|:-----------:|:----:|:----:|:-----:|:----:|:-----:|:----:|:----:|:-------:|
| w/o dict | 84.24 | 90.99 | 90.80 | 92.51 | 60.50 | 87.04 | 73.75 | 89.37 | 83.69 |
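
Below is a minimal sketch of this dictionary-free setting: standard GLUE fine-tuning with the `transformers` Trainer, here on RTE. The model id and hyperparameters are assumptions for illustration; the official fine-tuning scripts (including the dictionary-augmented variant) are in the GitHub repository linked above.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "wyu97/DictBERT"  # hypothetical id; replace with the actual checkpoint
task = "rte"                 # a GLUE sentence-pair task with sentence1/sentence2 columns

raw = load_dataset("glue", task)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def tokenize(batch):
    # No dictionary definitions are appended here, i.e. plain BERT-style fine-tuning.
    return tokenizer(batch["sentence1"], batch["sentence2"],
                     truncation=True, max_length=128)

encoded = raw.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Illustrative hyperparameters, not the exact settings used in the paper.
args = TrainingArguments(output_dir="dictbert-rte",
                         per_device_train_batch_size=32,
                         learning_rate=2e-5,
                         num_train_epochs=3)

trainer = Trainer(model=model, args=args,
                  train_dataset=encoded["train"],
                  eval_dataset=encoded["validation"],
                  tokenizer=tokenizer)
trainer.train()
print(trainer.evaluate())
```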


### BibTeX entry and citation info

```bibtex
@inproceedings{yu2022dict,
  title={Dict-BERT: Enhancing Language Model Pre-training with Dictionary},
  author={Yu, Wenhao and Zhu, Chenguang and Fang, Yuwei and Yu, Donghan and Wang, Shuohang and Xu, Yichong and Zeng, Michael and Jiang, Meng},
  booktitle={Findings of the Association for Computational Linguistics: ACL 2022},
  pages={1907--1918},
  year={2022}
}
```

<a href="https://huggingface.co/exbert/?model=bert-base-uncased">
	<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
</a>