[
    {
        "dataset": "cola",
        "type": "task",
        "origin_acc": 69.0,
        "origin_prompt": "Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':",
        "atk_acc": 14.0,
        "atk_prompt": "Assess the next sentence and decide if it is grammatically correct. Accommodate with 'Acceptable' or 'Unacceptable':",
        "transfer_ori_acc": 78.0,
        "transfer_atk_acc": 75.5
    },
    {
        "dataset": "cola",
        "type": "role",
        "origin_acc": 54.9,
        "origin_prompt": "As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':",
        "atk_acc": 0.0,
        "atk_prompt": "Oj a toolbox for defines pronouns correcting, revisions the penalties and sorting its dictionary as either 'acceptable' or 'unacceptable':",
        "transfer_ori_acc": 74.9,
        "transfer_atk_acc": 67.5
    },
    {
        "dataset": "mrpc",
        "type": "task",
        "origin_acc": 67.16,
        "origin_prompt": "Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 0.49,
        "atk_prompt": "Are the meanings of the following couple of sentences the same? Cope with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 82.1078431372549,
        "transfer_atk_acc": 81.61764705882352
    },
    {
        "dataset": "mrpc",
        "type": "role",
        "origin_acc": 39.46,
        "origin_prompt": "In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 0.0,
        "atk_prompt": "About the role of a textual comparison specialist, examine the given judgements and decide if they exchanging the same meaning, responses with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 81.37254901960785,
        "transfer_atk_acc": 83.33333333333334
    },
    {
        "dataset": "qnli",
        "type": "task",
        "origin_acc": 30.5,
        "origin_prompt": "Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.",
        "atk_acc": 0.0,
        "atk_prompt": "Reconsidered the given context and question, and ascertain if the context comprises enough information to favour the answer by choices 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 85.2,
        "transfer_atk_acc": 89.1
    },
    {
        "dataset": "qnli",
        "type": "role",
        "origin_acc": 35.3,
        "origin_prompt": "As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.",
        "atk_acc": 0.6,
        "atk_prompt": "As a language expert, weighed if the delivered framework translates the answer to the question and respondent with 'entailment' or 'not_entailment'. ",
        "transfer_ori_acc": 88.4,
        "transfer_atk_acc": 83.6
    },
    {
        "dataset": "qqp",
        "type": "task",
        "origin_acc": 23.9,
        "origin_prompt": "Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.",
        "atk_acc": 0.0,
        "atk_prompt": "Kan these two languages happen revisited equitable in signify? Meet with 'equivalent' or 'not_equivalent'. ",
        "transfer_ori_acc": 87.8,
        "transfer_atk_acc": 65.7
    },
    {
        "dataset": "qqp",
        "type": "role",
        "origin_acc": 16.9,
        "origin_prompt": "In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.",
        "atk_acc": 0.1,
        "atk_prompt": "Onto the role of a question similarity analyzer, consider the significance of the envisaged matters and identify if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for many matters. ",
        "transfer_ori_acc": 87.1,
        "transfer_atk_acc": 85.9
    },
    {
        "dataset": "rte",
        "type": "task",
        "origin_acc": 47.65,
        "origin_prompt": "Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.",
        "atk_acc": 0.72,
        "atk_prompt": "Invite classed the coitus between the foreseen priors as 'entailment' or 'not_entailment'.",
        "transfer_ori_acc": 85.1985559566787,
        "transfer_atk_acc": 84.83754512635379
    },
    {
        "dataset": "rte",
        "type": "role",
        "origin_acc": 48.01,
        "origin_prompt": "As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':",
        "atk_acc": 0.0,
        "atk_prompt": "Because an entailment characterization regimes, considering the login between the following sentence and meet with 'entailment' or 'not_entailment':",
        "transfer_ori_acc": 84.11552346570397,
        "transfer_atk_acc": 83.03249097472924
    },
    {
        "dataset": "sst2",
        "type": "task",
        "origin_acc": 53.9,
        "origin_prompt": "Please identify the emotional tone of this passage: 'positive' or 'negative'?",
        "atk_acc": 5.85,
        "atk_prompt": "Invited specify the affective tone of this pass: 'positive' or 'negative'? ",
        "transfer_ori_acc": 93.92201834862385,
        "transfer_atk_acc": 94.26605504587155
    },
    {
        "dataset": "sst2",
        "type": "role",
        "origin_acc": 42.55,
        "origin_prompt": "Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:",
        "atk_acc": 3.21,
        "atk_prompt": "Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Ranked: ",
        "transfer_ori_acc": 95.29816513761467,
        "transfer_atk_acc": 94.72477064220183
    },
    {
        "dataset": "wnli",
        "type": "task",
        "origin_acc": 43.66,
        "origin_prompt": "Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.",
        "atk_acc": 1.41,
        "atk_prompt": "Define if the given pair of priors exhibits entailment or not_entailment. Answers with 'entailment' or 'not_entailment'.",
        "transfer_ori_acc": 61.97183098591549,
        "transfer_atk_acc": 64.7887323943662
    },
    {
        "dataset": "wnli",
        "type": "role",
        "origin_acc": 43.66,
        "origin_prompt": "In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':",
        "atk_acc": 0.0,
        "atk_prompt": "Onto your role as an entailment analysing tool, appraisals the relationship between the given condemnation and classed it as 'entailment' or 'not_entailment':",
        "transfer_ori_acc": 56.33802816901409,
        "transfer_atk_acc": 59.154929577464785
    },
    {
        "dataset": "mmlu",
        "type": "task",
        "origin_acc": 27.07,
        "origin_prompt": "For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.",
        "atk_acc": 12.72,
        "atk_prompt": "Among all multiple-choice subject about {}, find the accurate answer by chooses 'A', 'B', 'C', or 'D'. ",
        "transfer_ori_acc": 45.744680851063826,
        "transfer_atk_acc": 46.27659574468085
    },
    {
        "dataset": "mmlu",
        "type": "role",
        "origin_acc": 24.85,
        "origin_prompt": "As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.",
        "atk_acc": 13.89,
        "atk_prompt": "As an expert in {}, answering to the following multiple-choice things by picking 'A', 'B', 'C', or 'D'.",
        "transfer_ori_acc": 44.680851063829785,
        "transfer_atk_acc": 44.680851063829785
    }
]