[
    {
        "dataset": "cola",
        "type": "task",
        "origin_acc": 86.4,
        "origin_prompt": "Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable'.",
        "atk_acc": 68.1,
        "atk_prompt": "Examine the given condemnation and decide if it is pronouns sound. Answer with either 'Acceptable' or 'Unacceptable'. ",
        "transfer_ori_acc": 77.7,
        "transfer_atk_acc": 66.0
    },
    {
        "dataset": "cola",
        "type": "role",
        "origin_acc": 86.9,
        "origin_prompt": "In the capacity of a grammar assessment system, indicate if the structure of the provided sentence is grammatically correct, responding with 'acceptable' or 'unacceptable'.",
        "atk_acc": 85.0,
        "atk_prompt": "In the ability of a grammar assessment system, denotes if the structure of the provided sentence is grammatical corrected, responding with 'acceptable' or 'unacceptable'. ",
        "transfer_ori_acc": 74.6,
        "transfer_atk_acc": 74.9
    },
    {
        "dataset": "mrpc",
        "type": "task",
        "origin_acc": 87.01,
        "origin_prompt": "Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 83.82,
        "atk_prompt": "Can the given sentences be deemed semantically idntical? Please reply with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 82.35294117647058,
        "transfer_atk_acc": 80.3921568627451
    },
    {
        "dataset": "mrpc",
        "type": "role",
        "origin_acc": 87.5,
        "origin_prompt": "In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.",
        "atk_acc": 83.58,
        "atk_prompt": "In your capacity as a language ɑnalyst, assess the following sentenes and classify their metaphor as 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 77.69607843137256,
        "transfer_atk_acc": 78.18627450980392
    },
    {
        "dataset": "qnli",
        "type": "task",
        "origin_acc": 94.0,
        "origin_prompt": "Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.",
        "atk_acc": 92.9,
        "atk_prompt": "Review the given coոtext and question, and dcide if the context contains enough particulars to support the answeⲅ by selecting 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 85.2,
        "transfer_atk_acc": 89.2
    },
    {
        "dataset": "qnli",
        "type": "role",
        "origin_acc": 93.7,
        "origin_prompt": "As a semantic interpreter, assess whether the provided context supports the answer to the given question and answer with 'entailment' or 'not_entailment'.",
        "atk_acc": 92.3,
        "atk_prompt": "Iike a semantic interpretation, evaluated lf the prоvided context supports the answer to the given question and answers with 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 88.0,
        "transfer_atk_acc": 61.8
    },
    {
        "dataset": "qqp",
        "type": "task",
        "origin_acc": 88.8,
        "origin_prompt": "Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 87.6,
        "atk_prompt": "Defining if the given paired of statements can be considered the same by responds with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 86.8,
        "transfer_atk_acc": 87.0
    },
    {
        "dataset": "qqp",
        "type": "role",
        "origin_acc": 90.0,
        "origin_prompt": "As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'.",
        "atk_acc": 87.6,
        "atk_prompt": "As a tool for deciding queѕtion equivalence, review the questions and classifications their portrait as neither 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 86.3,
        "transfer_atk_acc": 86.1
    },
    {
        "dataset": "rte",
        "type": "task",
        "origin_acc": 93.5,
        "origin_prompt": "Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'.",
        "atk_acc": 92.42,
        "atk_prompt": "Identifying whether the given pair of sentencing demonstrates entailment or not_entailment. Respond with 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 84.83754512635379,
        "transfer_atk_acc": 84.47653429602889
    },
    {
        "dataset": "rte",
        "type": "role",
        "origin_acc": 93.86,
        "origin_prompt": "Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.",
        "atk_acc": 90.97,
        "atk_prompt": "Working as an entailment classifier, identify both the given piar of sentences displaying entailment or not_entailment. Respond with 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 83.75451263537906,
        "transfer_atk_acc": 80.86642599277978
    },
    {
        "dataset": "sst2",
        "type": "task",
        "origin_acc": 96.79,
        "origin_prompt": "Determine the overall sentiment of this sentence, categorizing it as 'positive' or 'negative'.",
        "atk_acc": 95.87,
        "atk_prompt": "Determine the general sentiment of this condemnation, categorizing it as 'positive' or 'negative'. ",
        "transfer_ori_acc": 94.38073394495413,
        "transfer_atk_acc": 95.52752293577981
    },
    {
        "dataset": "sst2",
        "type": "role",
        "origin_acc": 96.56,
        "origin_prompt": "As a sentiment classifier, determine whether the following text is 'positive' or 'negative'.",
        "atk_acc": 95.18,
        "atk_prompt": "As a emotion classifier, determine whether the follоwing wording is 'positive' or 'negative'. ",
        "transfer_ori_acc": 94.4954128440367,
        "transfer_atk_acc": 94.61009174311926
    },
    {
        "dataset": "wnli",
        "type": "task",
        "origin_acc": 77.46,
        "origin_prompt": "Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.",
        "atk_acc": 76.06,
        "atk_prompt": "Assss the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 64.7887323943662,
        "transfer_atk_acc": 63.38028169014085
    },
    {
        "dataset": "wnli",
        "type": "role",
        "origin_acc": 76.06,
        "origin_prompt": "In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'.",
        "atk_acc": 71.83,
        "atk_prompt": "In your role as an entailment analysis tool, assess the relationship between the given punishments and categorize it as 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 57.74647887323944,
        "transfer_atk_acc": 60.56338028169014
    },
    {
        "dataset": "mmlu",
        "type": "task",
        "origin_acc": 53.9,
        "origin_prompt": "Below are multiple-choice question concerning {}. Indicate your response with 'A', 'B', 'C', or 'D'.",
        "atk_acc": 52.3,
        "atk_prompt": "Belоw are multipl-echoice quеstion concening {}. Indicate your response with 'A', 'B', 'C', or 'D'. ",
        "transfer_ori_acc": 45.39007092198582,
        "transfer_atk_acc": 44.32624113475177
    },
    {
        "dataset": "mmlu",
        "type": "role",
        "origin_acc": 54.08,
        "origin_prompt": "Considering your familiarity with {}, attend to the following multiple-choice question by picking 'A', 'B', 'C', or 'D'.",
        "atk_acc": 52.48,
        "atk_prompt": "Considering your familiarity with {}, attend to the following mutliple-choice matter by picking 'A', 'B', 'C', or 'D'.",
        "transfer_ori_acc": 46.98581560283688,
        "transfer_atk_acc": 46.09929078014184
    },
    {
        "dataset": "squad_v2",
        "type": "task",
        "origin_acc": 91.05,
        "origin_prompt": "Identify the most relevant answer from the context. If it's not possible to find an answer, state 'unanswerable'.",
        "atk_acc": 89.38,
        "atk_prompt": "Identify the most reevant answer from the context. If it's not achievable to find an answer, state 'unanswerable'.",
        "transfer_ori_acc": 87.01356406751144,
        "transfer_atk_acc": 86.54992770387507
    },
    {
        "dataset": "squad_v2",
        "type": "role",
        "origin_acc": 90.55,
        "origin_prompt": "Being well-versed in the context, please derive the most fitting answer. If there isn't an answer in the context, use 'unanswerable'.",
        "atk_acc": 80.92,
        "atk_prompt": "Underway well-versed in the context, please subtract the most fitting answer. If there isn't an answer in the framework, use 'unanswerable'.",
        "transfer_ori_acc": 87.04992770387507,
        "transfer_atk_acc": 87.31659437054175
    },
    {
        "dataset": "iwslt",
        "type": "task",
        "origin_acc": 0.27,
        "origin_prompt": "Accurately translate the sentence from {} to {}, ensuring the meaning remains intact.",
        "atk_acc": 0.23,
        "atk_prompt": "Acϲurately translate the sentence from {} to {}, ensuring the meaning remains unaffected. ",
        "transfer_ori_acc": 0.2195336520895316,
        "transfer_atk_acc": 0.21684835046261514
    },
    {
        "dataset": "iwslt",
        "type": "role",
        "origin_acc": 0.27,
        "origin_prompt": "Taking on the role of a certified translator, perform a translation of the given sentence from {} to {}.",
        "atk_acc": 0.23,
        "atk_prompt": "Taking on the role of a certifying translators, pefrorm a translatiоn of the givеn sentence from {} to {}. ",
        "transfer_ori_acc": 0.21945913242821213,
        "transfer_atk_acc": 0.223790160931717
    },
    {
        "dataset": "un_multi",
        "type": "task",
        "origin_acc": 0.36,
        "origin_prompt": "Perform a machine translation for the provided sentence, changing it from {} to {}.",
        "atk_acc": 0.35,
        "atk_prompt": "Preform a machinery translation for the providеd sentence, changing it from {} to {}. ",
        "transfer_ori_acc": 0.30870734220870066,
        "transfer_atk_acc": 0.30758829636925916
    },
    {
        "dataset": "un_multi",
        "type": "role",
        "origin_acc": 0.36,
        "origin_prompt": "As a skilled linguist, please convert the following sentence from {} to {}.",
        "atk_acc": 0.35,
        "atk_prompt": "As a skillful linguist, pelase convеrt the following sentence from {} to {}. ",
        "transfer_ori_acc": 0.3046140486156621,
        "transfer_atk_acc": 0.3072428127506268
    },
    {
        "dataset": "math",
        "type": "task",
        "origin_acc": 16.88,
        "origin_prompt": "Figure out the solution for the following mathematical task with {}.",
        "atk_acc": 15.0,
        "atk_prompt": "Figure out the solution for the followinɡ mathematical task with {}. ",
        "transfer_ori_acc": 17.5,
        "transfer_atk_acc": 17.5
    },
    {
        "dataset": "math",
        "type": "role",
        "origin_acc": 16.88,
        "origin_prompt": "Pretend you are a math teacher and solve this mathematical question concerning {}.",
        "atk_acc": 13.75,
        "atk_prompt": "Pretend you are a math teacher and settle this matehmatical question on {}. ",
        "transfer_ori_acc": 16.875,
        "transfer_atk_acc": 15.0
    }
]