[
    {
        "dataset": "cola",
        "type": "task",
        "origin_acc": 80.5,
        "origin_prompt": "Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable'.",
        "atk_acc": 78.5,
        "atk_prompt": "Examine the given sentence and decide if it is gramatically sound. nAswer with either 'Acceptable' or 'Unacceptable'. ",
        "transfer_ori_acc": 0.8999999999999999,
        "transfer_atk_acc": 0.4
    },
    {
        "dataset": "cola",
        "type": "role",
        "origin_acc": 81.5,
        "origin_prompt": "Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable'.",
        "atk_acc": 78.5,
        "atk_prompt": "Funtioning as a graMmar evaluation tool, analyzY the given sentence and decide if it is grammatically corLect, responding with 'acceptable' or 'unacceptable'. ",
        "transfer_ori_acc": 23.9,
        "transfer_atk_acc": 53.7
    },
    {
        "dataset": "mrpc",
        "type": "task",
        "origin_acc": 74.5,
        "origin_prompt": "Examine the pair of sentences and decide if their meanings are identical by answering with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 65.5,
        "atk_prompt": "Examine the Kair of sentences and edcide if their meanijngs are ibentical by answering with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 55.63725490196079,
        "transfer_atk_acc": 68.87254901960785
    },
    {
        "dataset": "mrpc",
        "type": "role",
        "origin_acc": 72.0,
        "origin_prompt": "As a linguistic comparator, review the following pair of sentences and determine their semantic equivalence by choosing 'equivalent' or 'not_equivalent'.",
        "atk_acc": 66.5,
        "atk_prompt": "AV a linguistic comaprator, review the following pair of sentences and determine their semayntic equivalence by chhoosing 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 61.51960784313726,
        "transfer_atk_acc": 64.2156862745098
    },
    {
        "dataset": "qnli",
        "type": "task",
        "origin_acc": 85.0,
        "origin_prompt": "Examine the context and question, and determine if the context logically implies the answer by selecting 'entailment' or 'not_entailment'.",
        "atk_acc": 77.5,
        "atk_prompt": "Examine the contedt and question, and determine if the cXontext logically implies the answer by selecting 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 26.900000000000002,
        "transfer_atk_acc": 34.300000000000004
    },
    {
        "dataset": "qnli",
        "type": "role",
        "origin_acc": 86.0,
        "origin_prompt": "As a textual analyst, examine if the given context logically implies the answer to the question and indicate your decision with 'entailment' or 'not_entailment'.",
        "atk_acc": 75.0,
        "atk_prompt": "A a textual analyst, xamine if the gPven conSext logiclaly implies the asnwer to the qbuestion and indicate your decision with 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 31.4,
        "transfer_atk_acc": 43.6
    },
    {
        "dataset": "qqp",
        "type": "task",
        "origin_acc": 82.0,
        "origin_prompt": "Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 80.0,
        "atk_prompt": "Can these two statements be considJred equal in meaning? Answer with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 36.3,
        "transfer_atk_acc": 29.7
    },
    {
        "dataset": "qqp",
        "type": "role",
        "origin_acc": 83.0,
        "origin_prompt": "Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.",
        "atk_acc": 81.5,
        "atk_prompt": "Actinm as a question equivalence instrumetn, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones. ",
        "transfer_ori_acc": 34.599999999999994,
        "transfer_atk_acc": 30.5
    },
    {
        "dataset": "rte",
        "type": "task",
        "origin_acc": 85.0,
        "origin_prompt": "Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.",
        "atk_acc": 80.5,
        "atk_prompt": "Does the relationshBp between the given sentences repersent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 46.20938628158845,
        "transfer_atk_acc": 45.84837545126354
    },
    {
        "dataset": "rte",
        "type": "role",
        "origin_acc": 84.5,
        "origin_prompt": "In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'.",
        "atk_acc": 81.0,
        "atk_prompt": "In your role as an entailment analysis toIol, aZssess the relationship between the given sentences and cassify it as 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 46.931407942238266,
        "transfer_atk_acc": 40.794223826714806
    },
    {
        "dataset": "sst2",
        "type": "task",
        "origin_acc": 97.0,
        "origin_prompt": "Evaluate the sentiment of the given text and classify it as 'positive' or 'negative'.",
        "atk_acc": 94.0,
        "atk_prompt": "ENvaluate the senttment of the giOven text and classify it as 'positive' or 'negative'. ",
        "transfer_ori_acc": 10.435779816513762,
        "transfer_atk_acc": 2.408256880733945
    },
    {
        "dataset": "sst2",
        "type": "role",
        "origin_acc": 97.0,
        "origin_prompt": "As a sentiment classifier, determine whether the following text is 'positive' or 'negative'.",
        "atk_acc": 94.0,
        "atk_prompt": "qAs a setniment classifier, determnne whether the following text is 'positive' or 'negative'. ",
        "transfer_ori_acc": 15.36697247706422,
        "transfer_atk_acc": 7.912844036697248
    },
    {
        "dataset": "wnli",
        "type": "task",
        "origin_acc": 67.61,
        "origin_prompt": "Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'.",
        "atk_acc": 64.79,
        "atk_prompt": "Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . ",
        "transfer_ori_acc": 9.859154929577464,
        "transfer_atk_acc": 0.0
    },
    {
        "dataset": "wnli",
        "type": "role",
        "origin_acc": 64.79,
        "origin_prompt": "In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'.",
        "atk_acc": 59.15,
        "atk_prompt": "In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . ",
        "transfer_ori_acc": 23.943661971830984,
        "transfer_atk_acc": 11.267605633802818
    },
    {
        "dataset": "mmlu",
        "type": "task",
        "origin_acc": 66.5,
        "origin_prompt": "In relation to the multiple-choice question on {}, please provide the accurate answer by choosing 'A', 'B', 'C', or 'D'.",
        "atk_acc": 63.5,
        "atk_prompt": "ISn ralation to the mulitple-choice question on {}, pleare provide the accurate answer by choosing 'A', 'B', 'C', or 'D'. ",
        "transfer_ori_acc": 9.397163120567376,
        "transfer_atk_acc": 5.141843971631205
    },
    {
        "dataset": "mmlu",
        "type": "role",
        "origin_acc": 65.5,
        "origin_prompt": "As someone well-versed in {}, please address the multiple-choice question below by selecting 'A', 'B', 'C', or 'D'.",
        "atk_acc": 62.5,
        "atk_prompt": "As someone well-versed in {}, please address the mulZtiple-choice question below by selecting 'A', 'B', 'C', or 'D'.",
        "transfer_ori_acc": 8.51063829787234,
        "transfer_atk_acc": 11.702127659574469
    }
]