Create README.md
README.md
ADDED
@@ -0,0 +1,202 @@
---
tasks:
  - Relation Extraction
widgets:
  - examples:
      - name: 1
        title: Message-Topic(e1,e2)
        inputs:
          - name: token
            data: ["the", "most", "common", "audits", "were", "about", "waste", "and", "recycling", "."]
          - name: h
            data:
              - name: audits
                pos: [3, 4]
          - name: t
            data:
              - name: waste
                pos: [6, 7]

      - name: 2
        title: Product-Producer(e2,e1)
        inputs:
          - name: token
            data: ["the", "ombudsman", "'s", "report", "concluded", "that", "``", "a", "large", "part", "of", "the", "package", "was", "not", "provided", "''", "."]
          - name: h
            data:
              - name: ombudsman
                pos: [1, 2]
          - name: t
            data:
              - name: report
                pos: [3, 4]

      - name: 3
        title: Instrument-Agency(e2,e1)
        inputs:
          - name: token
            data: ["many", "professional", "cartomancers", "use", "a", "regular", "deck", "of", "playing", "cards", "for", "divination", "."]
          - name: h
            data:
              - name: cartomancers
                pos: [2, 3]
          - name: t
            data:
              - name: cards
                pos: [9, 10]

      - name: 4
        title: Entity-Destination(e1,e2)
        inputs:
          - name: token
            data: ["nasa", "kepler", "mission", "sends", "names", "into", "space", "."]
          - name: h
            data:
              - name: names
                pos: [4, 5]
          - name: t
            data:
              - name: space
                pos: [6, 7]

      - name: 5
        title: Cause-Effect(e2,e1)
        inputs:
          - name: token
            data: ["sorace", "was", "unaware", "that", "her", "anger", "was", "caused", "by", "the", "abuse", "."]
          - name: h
            data:
              - name: anger
                pos: [5, 6]
          - name: t
            data:
              - name: abuse
                pos: [10, 11]

      - name: 6
        title: Component-Whole(e1,e2)
        inputs:
          - name: token
            data: ["the", "castle", "was", "inside", "a", "museum", "."]
          - name: h
            data:
              - name: castle
                pos: [1, 2]
          - name: t
            data:
              - name: museum
                pos: [5, 6]

domain:
  - nlp
frameworks:
  - pytorch
backbone:
  - BERT large
metrics:
  - accuracy
license: apache-2.0
language:
  - en

---

# KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction
KnowPrompt is a prompt-tuning approach for relation extraction. It injects the latent knowledge contained in relation labels into prompt construction through learnable virtual template words and answer words, and synergistically optimizes their representations under structured constraints.

## Model description

We take a first step toward injecting the latent knowledge contained in relation labels into prompt construction; relation extraction is then performed with a prompt-tuning model. The implementation is as follows: virtual template words placed around the entities are initialized with aggregated entity embeddings and serve as learnable virtual template words that inject entity knowledge. Meanwhile, the relation labels are used to compute average embeddings that serve as virtual answer words and inject relation knowledge. In this structure, entities and relations constrain each other, and the virtual template and answer words should be contextually relevant, so we introduce synergistic optimization to correct the virtual template and answer words.
![image.png](D:\Downloads\model.png)

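
The initialization of a virtual answer word from label embeddings, as described above, can be illustrated with a minimal sketch. It assumes that a relation label is split into its constituent words and that their embeddings are averaged with equal weight; the toy vectors, the `label_words` split, and the helper name `average_embedding` are hypothetical and do not come from the KnowPrompt source code.

```python
from statistics import fmean


def average_embedding(vectors):
    """Return the element-wise mean of equally sized embedding vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    return [fmean(column) for column in zip(*vectors)]


# Toy 3-dimensional word embeddings (made-up numbers, not taken from any model).
toy_embeddings = {
    "cause": [1.0, 0.0, 2.0],
    "effect": [3.0, 2.0, 0.0],
}

# Hypothetical split of the label "Cause-Effect" into its words.
label_words = ["cause", "effect"]

# Starting point for the learnable virtual answer word of this relation.
virtual_answer_init = average_embedding([toy_embeddings[w] for w in label_words])
print(virtual_answer_init)  # [2.0, 1.0, 1.0]
```

In a real model the vectors would come from the language model's embedding table, and the resulting vector would only be a starting point that training then refines.
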
## Intended uses & limitations

This model is intended for relation extraction. The extracted relations can feed downstream NLP tasks such as information retrieval, dialogue generation, and question answering. Please refer to the code example for details on how to use it.

The set of relation labels in the training data is limited, so the model generalizes to real-world relations only to a certain extent.

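
The widget examples in the front matter pass a token list plus a head entity `h` and a tail entity `t`, each with a `pos` pair. The sketch below assumes that `pos` is a half-open `[start, end)` token span, which is consistent with those examples but is not stated in the source; the flattened dictionary layout and the helper `entity_text` are hypothetical.

```python
def entity_text(tokens, pos):
    """Return the text covered by a half-open [start, end) token span."""
    start, end = pos
    if not 0 <= start < end <= len(tokens):
        raise ValueError(f"span {pos} is out of range for {len(tokens)} tokens")
    return " ".join(tokens[start:end])


# First widget example, flattened into a plain dictionary for illustration.
example = {
    "token": ["the", "most", "common", "audits", "were", "about",
              "waste", "and", "recycling", "."],
    "h": {"name": "audits", "pos": [3, 4]},
    "t": {"name": "waste", "pos": [6, 7]},
}

# Check that each span really covers the entity it is supposed to mark.
for role in ("h", "t"):
    entity = example[role]
    assert entity_text(example["token"], entity["pos"]) == entity["name"]
```

A check like this catches spans that point at the wrong token before the input reaches the model.
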
## Training data

We use the SemEval dataset (SemEval-2010 Task 8):

| **Dataset** | **# Train** | **# Val** | **# Test** | **# Relations** |
| ----------- | ----------- | --------- | ---------- | --------------- |
| SemEval     | 6,507       | 1,493     | 2,717      | 19              |

## Training procedure

### Training

Training is divided into two stages. The first stage performs synergistic optimization of the virtual template words and answer words with the objective
$$
\mathcal{J}=\mathcal{J}_{[\text{MASK}]}+\lambda \mathcal{J}_{\text{structured}},
$$

where $\lambda$ is a hyperparameter that weighs the two loss functions. The second stage starts from the optimized virtual template words and answer words and optimizes all parameters with a smaller learning rate, using only the loss $\mathcal{J}_{[\text{MASK}]}$ to fine-tune the language model. The hyperparameters differ between datasets; see the script files in the source code. Taking SemEval as an example, the hyperparameters are set as follows:
```
max_epochs=10
max_sequence_length=256
batch_size=16
learning_rate=3e-5
t_lambda=0.001
```
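
As a rough illustration of how the two stages use the loss terms, the sketch below combines two already computed scalar losses. It assumes that `t_lambda` plays the role of $\lambda$; the function names and the loss values are made up for illustration and are not taken from the source code.

```python
def stage_one_loss(mask_loss, structured_loss, t_lambda=0.001):
    """First stage: J = J_[MASK] + lambda * J_structured."""
    return mask_loss + t_lambda * structured_loss


def stage_two_loss(mask_loss):
    """Second stage: only J_[MASK] is used to fine-tune the language model."""
    return mask_loss


# Made-up loss values for a single batch.
j_mask, j_structured = 2.0, 500.0
print(stage_one_loss(j_mask, j_structured))
print(stage_two_loss(j_mask))
```

With a weight as small as 0.001, the structured term only contributes noticeably when its raw value is large compared with the `[MASK]` loss.
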
### Data Evaluation and Results
The following table compares KnowPrompt with other models in the standard setting. The gain in parentheses is relative to PTR, the strongest baseline.

| **Methods** | **Precision** |
| ----------- | ------------- |
| Fine-tuning | 87.6          |
| KnowBERT    | 89.1          |
| MTB         | 89.5          |
| PTR         | 89.9          |
| KnowPrompt  | 90.2 (+0.3)   |

In the low-resource setting, we ran 8-, 16-, and 32-shot experiments. For each value of K, K instances per class are sampled from the original training and validation sets to form the few-shot training and validation sets. The results are as follows (gains in parentheses are relative to fine-tuning):

| Split | **Methods** | **Precision** |
| ----- | ----------- | ------------- |
| k=8   | Fine-tuning | 41.3          |
|       | GDPNet      | 42.0          |
|       | PTR         | 70.5          |
|       | KnowPrompt  | 74.3 (+33.0)  |
| k=16  | Fine-tuning | 65.2          |
|       | GDPNet      | 67.5          |
|       | PTR         | 81.3          |
|       | KnowPrompt  | 82.9 (+17.7)  |
| k=32  | Fine-tuning | 80.1          |
|       | GDPNet      | 81.2          |
|       | PTR         | 84.2          |
|       | KnowPrompt  | 84.8 (+4.7)   |

As K decreases from 32 to 8, the improvement of KnowPrompt over the other three methods increases.
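
The K-shot sampling described above can be sketched as follows. This is a minimal illustration, not the sampling script from the source code: the `(sentence, label)` pair layout, the fixed seed, and the function name `sample_k_shot` are assumptions.

```python
import random
from collections import defaultdict


def sample_k_shot(examples, k, seed=0):
    """Sample up to k examples per relation label.

    `examples` is a list of (sentence, label) pairs; this layout and the
    fixed seed are assumptions made for the sketch.
    """
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    rng = random.Random(seed)
    subset = []
    for label in sorted(by_label):
        pool = by_label[label]
        # Classes with fewer than k examples contribute everything they have.
        subset.extend(rng.sample(pool, min(k, len(pool))))
    return subset


# Toy data: 20 examples of one relation and 5 of another.
toy = [(f"sentence {i}", "Cause-Effect(e1,e2)") for i in range(20)]
toy += [(f"sentence {i}", "Component-Whole(e1,e2)") for i in range(20, 25)]

few_shot_train = sample_k_shot(toy, k=8)
print(len(few_shot_train))  # 13
```

The same function would be applied separately to the original training and validation sets to obtain the few-shot training and validation sets.
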
## BibTeX entry and citation info
```
@inproceedings{DBLP:conf/www/ChenZXDYTHSC22,
  author    = {Xiang Chen and
               Ningyu Zhang and
               Xin Xie and
               Shumin Deng and
               Yunzhi Yao and
               Chuanqi Tan and
               Fei Huang and
               Luo Si and
               Huajun Chen},
  editor    = {Fr{\'{e}}d{\'{e}}rique Laforest and
               Rapha{\"{e}}l Troncy and
               Elena Simperl and
               Deepak Agarwal and
               Aristides Gionis and
               Ivan Herman and
               Lionel M{\'{e}}dini},
  title     = {KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization
               for Relation Extraction},
  booktitle = {{WWW} '22: The {ACM} Web Conference 2022, Virtual Event, Lyon, France,
               April 25 - 29, 2022},
  pages     = {2778--2788},
  publisher = {{ACM}},
  year      = {2022},
  url       = {https://doi.org/10.1145/3485447.3511998},
  doi       = {10.1145/3485447.3511998},
  timestamp = {Tue, 26 Apr 2022 16:02:09 +0200},
  biburl    = {https://dblp.org/rec/conf/www/ChenZXDYTHSC22.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
```

Clone the model repository:
```bash
git clone https://www.modelscope.cn/jeno11/knowprompt_demo.git
```