# vicuna_zeroshot

# cola

## 10 prompts

Acc: 69.00%, prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Acc: 62.30%, prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Acc: 60.90%, prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Acc: 59.70%, prompt: Please evaluate the grammatical structure of the provided sentence and answer with 'Acceptable' or 'Unacceptable':
Acc: 50.40%, prompt: Assess the grammatical structure of the given sentence and classify it as 'Acceptable' or 'Unacceptable':
Acc: 50.10%, prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
Acc: 37.20%, prompt: Check the grammar of the following sentence and indicate if it is 'Acceptable' or 'Unacceptable':
Acc: 36.50%, prompt: Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable':
Acc: 30.60%, prompt: Examine the sentence and decide if its grammar is 'Acceptable' or 'Unacceptable':
Acc: 28.90%, prompt: Review the sentence below and identify whether its grammar is 'Acceptable' or 'Unacceptable':

Acc: 57.10%, prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Acc: 54.90%, prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Acc: 50.70%, prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Acc: 48.10%, prompt: In the capacity of a grammar assessment system, indicate if the structure of the provided sentence is grammatically correct, responding with 'acceptable' or 'unacceptable':
Acc: 46.30%, prompt: Acting as a grammar detection instrument, determine if the provided sentence is grammatically sound, answering with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
Acc: 43.00%, prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
Acc: 36.10%, prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
Acc: 35.10%, prompt: Working as a grammar classifier, identify whether the given sentence has correct grammar, and respond with 'acceptable' for correct sentences or 'unacceptable' for incorrect ones:
Acc: 27.10%, prompt: While performing grammar analysis, classify the grammar of the following sentence as 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
Acc: 9.60%, prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
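
The per-prompt accuracies above vary widely, so a quick summary of the spread can be useful. The following is a minimal sketch, assuming each result line keeps the `Acc: <number>%, prompt: <text>` shape seen in this log; the helper name `summarize` and the regular expression are illustrative and not part of any evaluation tooling.

```python
import re
from statistics import mean

# Assumed line shape, taken from this log: "Acc: 69.00%, prompt: <text>"
PROMPT_LINE = re.compile(r"^Acc: (?P<acc>[\d.]+)%, prompt: (?P<prompt>.+)$")

def summarize(lines):
    """Return (best, worst, mean) accuracy over all 'Acc:' lines found."""
    accs = []
    for line in lines:
        m = PROMPT_LINE.match(line.strip())
        if m:  # silently skip headings, blank lines, and other text
            accs.append(float(m.group("acc")))
    if not accs:
        raise ValueError("no 'Acc:' lines found")
    return max(accs), min(accs), mean(accs)

sample = [
    "Acc: 69.00%, prompt: Assess the following sentence ...",
    "Acc: 28.90%, prompt: Review the sentence below ...",
]
best, worst, avg = summarize(sample)
print(f"best={best:.2f} worst={worst:.2f} mean={avg:.2f}")
```

Passing all lines of one prompt group to `summarize` gives the best, worst, and mean accuracy for that group.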

## bertattack

Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"check the below verb and consider if it is grammatically complete. act with 'Acceptable' or 'Unacceptable':"
Original acc: 69.00%, attacked acc: 1.70%, dropped acc: 67.30%

Original prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Attacked prompt: b"first the given sentence and identify its grammatical correct, as 'Acceptable' or 'Unacceptable':"
Original acc: 62.30%, attacked acc: 2.20%, dropped acc: 60.10%

Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"is the program sentence grammatically correct? select with 'Acceptable' or 'Unacceptable':"
Original acc: 60.90%, attacked acc: 4.60%, dropped acc: 56.30%

Original prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Attacked prompt: b"as as a grammar evaluation resource, analyze the next phrase and decide if it is grammatical+ correct, respond with 'acceptable' or 'unacceptable':"
Original acc: 57.10%, attacked acc: 0.60%, dropped acc: 56.50%

Original prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Attacked prompt: b"being a test for determining grammatical correct-, research the category and categorize its singular as if 'acceptable' or 'unacceptable':"
Original acc: 54.90%, attacked acc: 4.10%, dropped acc: 50.80%

Original prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Attacked prompt: b"s the role of a syntax analyzemirror, examine the complement of the provided sentence and test if it is defective, answering with 'acceptable' for grammatically correct expressions or 'unacceptable' for unacceptable none:"
Original acc: 50.70%, attacked acc: 1.10%, dropped acc: 49.60%
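
In the entries above, the reported `dropped acc` matches `Original acc` minus `attacked acc`. A minimal sketch for checking that relationship on a single log line follows; it assumes the line format shown in this log, and the names `ACC_LINE`, `check_drop`, and the tolerance `tol` are hypothetical.

```python
import re

# Assumed shape of the accuracy lines in this log.
ACC_LINE = re.compile(
    r"Original acc: (?P<orig>[\d.]+)%, "
    r"attacked acc: (?P<att>[\d.]+)%, "
    r"dropped acc: (?P<drop>[\d.]+)%"
)

def check_drop(line, tol=0.05):
    """Return (original, attacked, reported_drop, consistent) for one line.

    `consistent` is True when original - attacked equals the reported drop
    within `tol` percentage points (the log rounds to two decimals).
    """
    m = ACC_LINE.search(line)
    if m is None:
        raise ValueError("not an accuracy line")
    orig, att, drop = (float(m.group(k)) for k in ("orig", "att", "drop"))
    return orig, att, drop, abs((orig - att) - drop) <= tol

line = "Original acc: 69.00%, attacked acc: 1.70%, dropped acc: 67.30%"
print(check_drop(line))
```

Running the check over every accuracy line is a cheap way to catch transcription errors in a log like this one.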

## checklist

Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable' MQnugHcaoy:"
Original acc: 69.00%, attacked acc: 2.90%, dropped acc: 66.10%

Original prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable' 5yWbBXztUY:"
Original acc: 62.30%, attacked acc: 34.90%, dropped acc: 27.40%

Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable' lf8wBa2yEm:"
Original acc: 60.90%, attacked acc: 20.90%, dropped acc: 40.00%

Original prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Attacked prompt: b"Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable' MQnugHcaoy:"
Original acc: 57.10%, attacked acc: 11.30%, dropped acc: 45.80%

Original prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Attacked prompt: b"As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable' w52rwgo0Av:"
Original acc: 54.90%, attacked acc: 11.10%, dropped acc: 43.80%

Original prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Attacked prompt: b"In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones Q47Jmd4lMV:"
Original acc: 50.70%, attacked acc: 7.50%, dropped acc: 43.20%

## deepwordbug

Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"sAsess the ofllowing lsentence and determine if it is grammatiaclly correct. Rspond with 'Acceptable' or 'Unacceptable':"
Original acc: 69.20%, attacked acc: 3.60%, dropped acc: 65.60%

Original prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':"
Original acc: 62.20%, attacked acc: 62.20%, dropped acc: 0.00%

Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"If the pryvided sentenKe Zrammatically crorect? Respod with 'Acceptable' or 'Unacceptable':"
Original acc: 60.90%, attacked acc: 25.50%, dropped acc: 35.40%

Original prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Attacked prompt: b"FunFctioning as a grammar zvaluation Fool, analyze the given sentence and deGcide if it is arammatically corret, rWesponding with 'acceptable' or 'unacceptable':"
Original acc: 57.10%, attacked acc: 5.10%, dropped acc: 52.00%

Original prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Attacked prompt: b"A a tol for determining grammatical correktness, kreview the sentence and cantegorize its gKrammar as either 'acceptable' or 'unacceptable':"
Original acc: 54.90%, attacked acc: 13.20%, dropped acc: 41.70%

Original prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Attacked prompt: b"n the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorret oKnes:"
Original acc: 50.60%, attacked acc: 4.80%, dropped acc: 45.80%

## stresstest

Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable'  and false is not true :"
Original acc: 69.00%, attacked acc: 34.80%, dropped acc: 34.20%

Original prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 62.30%, attacked acc: 62.10%, dropped acc: 0.20%

Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable'  and false is not true :"
Original acc: 60.90%, attacked acc: 39.70%, dropped acc: 21.20%

Original prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Attacked prompt: b"Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable'  and false is not true :"
Original acc: 57.10%, attacked acc: 20.80%, dropped acc: 36.30%

Original prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Attacked prompt: b"As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable'  and false is not true :"
Original acc: 54.90%, attacked acc: 49.60%, dropped acc: 5.30%

Original prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Attacked prompt: b"In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones  and false is not true :"
Original acc: 50.70%, attacked acc: 29.00%, dropped acc: 21.70%

## textbugger

Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Evaluating the later sen\xf0\x9d\x9a\x9dence and determine if it is grammatically correct. Respo\xd5\xb8d with 'Acceptable' or 'Unacceptable':"
Original acc: 69.00%, attacked acc: 0.00%, dropped acc: 69.00%

Original prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Attacked prompt: b"A\xd5\xb8alyze the provided sentence and classi\xf0\x9d\x9a\x8fy its grammatical correctness as 'Acceptable' or 'Unacceptable':"
Original acc: 62.30%, attacked acc: 0.00%, dropped acc: 62.30%

Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Comes the provided sen\xf0\x9d\x9a\x9dence grammatically correct? Respo\xd5\xb8d with 'Acceptable' or 'Unacceptable':"
Original acc: 60.90%, attacked acc: 0.00%, dropped acc: 60.90%

Original prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Attacked prompt: b"Func\xf0\x9d\x9a\x9dioning as a \xc9\xa1rammar evaluaion tool, analyze the yielded sentence and decide if it is grammatically correc\xf0\x9d\x9a\x9d, responding with 'acceptable' or 'unacceptable':"
Original acc: 57.10%, attacked acc: 1.10%, dropped acc: 56.00%

Original prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Attacked prompt: b"A\xd1\x95 a tool for determining grammatical regularity, reviewing the penalties and categoriz\xd0\xb5 its grammar as ei\xf0\x9d\x9a\x9dher 'acceptable' or 'unacceptable':"
Original acc: 54.90%, attacked acc: 0.00%, dropped acc: 54.90%

Original prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Attacked prompt: b"In the role of a syntax analyzer, examine the grammar of the provided sentence and decie if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrec\xf0\x9d\x9a\x9d ones:"
Original acc: 50.70%, attacked acc: 0.00%, dropped acc: 50.70%
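
The attacked prompts are logged in the form of Python bytes literals, so non-ASCII character substitutions appear as `\x..` escapes rather than as readable characters. The sketch below, assuming the escapes are UTF-8 bytes, decodes a short fragment taken from one of the prompts above and names each non-ASCII character; `decode_logged_prompt` is a hypothetical helper.

```python
import ast
import unicodedata

def decode_logged_prompt(logged):
    """Turn a logged bytes literal (text like b"...") into a Unicode string."""
    raw = ast.literal_eval(logged)  # safely parse the bytes literal
    if not isinstance(raw, bytes):
        raise TypeError("expected a bytes literal")
    return raw.decode("utf-8")      # assumption: escapes are UTF-8 bytes

# Fragment of an attacked prompt from this section, kept as raw text.
logged = r'b"A\xd5\xb8alyze the provided sentence"'
text = decode_logged_prompt(logged)

# List the substituted look-alike characters.
for ch in text:
    if ord(ch) > 127:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
```

Decoding this way makes the homoglyph substitutions visible: the prompt looks almost unchanged to a reader even though the underlying code points differ.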

## textfooler

Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Assess the next sentence and decide if it is grammatically correct. Accommodate with 'Acceptable' or 'Unacceptable':"
Original acc: 69.00%, attacked acc: 14.00%, dropped acc: 55.00%

Original prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Browse the handed pity and hierarchy its grammatical correction as 'Acceptable' or 'Unacceptable':"
Original acc: 62.30%, attacked acc: 18.90%, dropped acc: 43.40%

Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
Attacked prompt: b"Does the provided penalty grammatically appropriate? Cater with 'Acceptable' or 'Unacceptable':"
Original acc: 60.90%, attacked acc: 26.30%, dropped acc: 34.60%

Original prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
Attacked prompt: b"Working as a language assessment apparatus, discuss the given sentence and opt if it is grammatically rectified, respond with 'acceptable' or 'unacceptable':"
Original acc: 57.10%, attacked acc: 2.70%, dropped acc: 54.40%

Original prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
Attacked prompt: b"Oj a toolbox for defines pronouns correcting, revisions the penalties and sorting its dictionary as either 'acceptable' or 'unacceptable':"
Original acc: 54.90%, attacked acc: 0.00%, dropped acc: 54.90%

Original prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
Attacked prompt: b"In the role of a grammatical profiler, study the language of the provided sentence and decided if it is correct, answer with 'acceptable' for grammatically corrective sentenced or 'unacceptable' for faked ones:"
Original acc: 50.70%, attacked acc: 2.90%, dropped acc: 47.80%

## translation

Language: chinese, acc: 41.90%, prompt: As a tool for determining grammatical correctness, please review the following sentences and categorize them grammatically as 'acceptable' or 'unacceptable'.
Language: chinese, acc: 53.80%, prompt: Please evaluate the grammar of the following sentences and mark them as 'acceptable' or 'unacceptable'.
Language: chinese, acc: 33.00%, prompt: Please check that the following sentences are grammatically correct and categorize them as 'acceptable' or 'unacceptable'.
Language: chinese, acc: 54.00%, prompt: Please determine whether the following sentences are 'acceptable' or 'unacceptable' based on grammatical correctness.
Language: chinese, acc: 26.30%, prompt: Please categorize the grammar of the following sentences as 'acceptable' or 'unacceptable'.
Language: chinese, acc: 36.20%, prompt: Please check the grammar of the following sentences and classify them as 'acceptable' or 'unacceptable'.
Language: chinese, acc: 49.40%, prompt: Please mark the following sentences as 'acceptable' or 'unacceptable' according to the grammatical norms.
Language: chinese, acc: 39.10%, prompt: Please assess the grammar of the following sentences and classify them as 'acceptable' or 'unacceptable'.
Language: chinese, acc: 50.30%, prompt: Please mark the following sentences as 'acceptable' or 'unacceptable' for grammatical accuracy.
Language: chinese, acc: 36.20%, prompt: Please check the grammar of the following sentences and classify them as 'acceptable' or 'unacceptable'.
Language: french, acc: 45.10%, prompt: As a tool for assessing semantic relevance, categorize the sentence as 'acceptable' or 'unacceptable' according to its meaning and context.
Language: french, acc: 46.80%, prompt: Determine whether the grammatical structure of the sentence is 'acceptable' or 'unacceptable' by carefully reviewing it.
Language: french, acc: 20.90%, prompt: Judge whether the sentence follows the syntactic rules of the language and classify it as 'acceptable' or 'unacceptable'.
Language: french, acc: 32.80%, prompt: Assess the relevance of the sentence by classifying it as 'acceptable' or 'unacceptable' according to the context in which it is used.
Language: french, acc: 41.10%, prompt: As a semantic classification tool, decide whether the sentence is 'acceptable' or 'unacceptable' according to its meaning and consistency.
Language: french, acc: 35.30%, prompt: Analyze the sentence carefully and determine whether it is grammatically 'acceptable' or 'unacceptable'.
Language: french, acc: 39.60%, prompt: As a syntactic validation tool, determine whether the sentence is properly structured and categorize it as 'acceptable' or 'unacceptable'.
Language: french, acc: 46.30%, prompt: When evaluating the grammar of the sentence, classify it as 'acceptable' or 'unacceptable' according to the rules of the French language.
Language: french, acc: 41.40%, prompt: Determine whether the sentence is 'acceptable' or 'unacceptable' depending on its syntactic construction and semantic context.
Language: french, acc: 39.80%, prompt: As a tool for linguistic analysis, categorize the sentence as 'acceptable' or 'unacceptable' according to its relevance and semantic consistency.
Language: arabic, acc: 44.70%, prompt: As a tool for determining grammatical correctness, review the sentence and classify its rules as 'acceptable' or 'unacceptable'.
Language: arabic, acc: 45.00%, prompt: Analyze the sentence carefully and classify its grammar between 'acceptable' or 'unacceptable'.
Language: arabic, acc: 39.30%, prompt: As a tool for determining grammatical correctness, classify the rules of the sentence between 'acceptable' or 'unacceptable'.
Language: arabic, acc: 36.90%, prompt: Evaluate the correctness of the sentence between 'acceptable' or 'unacceptable', as a tool for determining grammatical correctness.
Language: arabic, acc: 42.30%, prompt: As a tool for classifying valid sentences, determine whether  'acceptable' or 'unacceptable' is based on its grammatical rules.
Language: arabic, acc: 45.20%, prompt: Analyze the sentence carefully and classify its grammatical rules between 'acceptable' or 'unacceptable'.
Language: arabic, acc: 61.50%, prompt: Analyze the sentence and classify it between 'acceptable' or 'unacceptable' as a grammatical check tool.
Language: arabic, acc: 46.40%, prompt: As a classification tool for grammatical sentences, determine whether the sentence 'acceptable' or 'unacceptable' is based on its rules.
Language: arabic, acc: 38.30%, prompt: As a tool for analyzing grammar, classify the rules of the sentence between 'acceptable' or 'unacceptable'.
Language: arabic, acc: 55.90%, prompt: Determine whether the sentence is 'acceptable' or 'unacceptable' as a tool for determining grammatical correctness and analyze its grammar.
Language: spanish, acc: 41.30%, prompt: As a tool to determine grammatical correctness, review the sentence and categorize its grammar as 'acceptable' or 'unacceptable'.
Language: spanish, acc: 47.30%, prompt: Analyze the sentence carefully and classify its grammar as 'acceptable' or 'unacceptable'.
Language: spanish, acc: 6.40%, prompt: As a tool for determining grammatical correctness, it categorizes the grammatical rules of the sentence as 'acceptable' or 'unacceptable'.
Language: spanish, acc: 52.20%, prompt: Evaluate the grammatical correctness of the sentence and classify it as 'acceptable' or 'unacceptable' using a grammatical verification tool.
Language: spanish, acc: 49.20%, prompt: As a tool for classifying grammatically correct sentences, determine whether the sentence is 'acceptable' or 'unacceptable' based on its grammatical rules.
Language: spanish, acc: 52.90%, prompt: Analyse the sentence carefully and classify its grammar as 'acceptable' or 'unacceptable' using a grammatical revision tool.
Language: spanish, acc: 9.60%, prompt: As a tool for classifying grammatical sentences, it determines whether the sentence is 'acceptable' or 'unacceptable' using its grammatical structure.
Language: spanish, acc: 5.80%, prompt: As a tool for analyzing grammatical correctness, it categorizes the grammatical rules of the sentence as 'acceptable' or 'unacceptable'.
Language: spanish, acc: 40.70%, prompt: Determine whether the sentence is 'acceptable' or 'unacceptable' using a grammatical verification tool and sort it accordingly.
Language: spanish, acc: 57.70%, prompt: As a tool for determining grammatical correctness, evaluate the sentence and classify it as 'acceptable' or 'unacceptable' based on its grammatical rules.
Language: japanese, acc: 36.10%, prompt: As a tool to determine whether grammar is grammatically correct, look at the sentence and categorize grammar into the 'acceptable' or 'unacceptable' categories.
Language: japanese, acc: 38.60%, prompt: Please read the given sentence and categorize the grammar into the 'acceptable' or 'unacceptable' categories.
Language: japanese, acc: 38.20%, prompt: To determine the correctness of a sentence, categorize grammar into two categories: 'acceptable' or 'unacceptable'.
Language: japanese, acc: 23.30%, prompt: Classify the grammar of a given sentence into two categories: 'acceptable' and 'unacceptable'.
Language: japanese, acc: 42.90%, prompt: Classify a given sentence into the categories 'acceptable' or 'unacceptable' to determine whether it is grammatically correct.
Language: japanese, acc: 22.00%, prompt: To determine whether it is grammatically correct, categorize a given sentence into two categories: 'acceptable' or 'unacceptable'.
Language: japanese, acc: 38.10%, prompt: To determine the correctness of grammar, categorize a given sentence into two categories: 'acceptable' or 'unacceptable'.
Language: japanese, acc: 34.50%, prompt: Classify the grammar of a given sentence into two categories, 'acceptable' or 'unacceptable', and judge its accuracy.
Language: japanese, acc: 48.50%, prompt: To determine whether it is grammatically correct, divide a given sentence into two categories: 'acceptable' or 'unacceptable'.
Language: japanese, acc: 25.80%, prompt: To evaluate the accuracy of grammar, categorize a given sentence into two categories: 'acceptable' or 'unacceptable'.
Language: korean, acc: 33.10%, prompt: As a tool for judging grammatical correctness, please review the sentences and classify the grammar as 'acceptable' or 'unacceptable'.
Language: korean, acc: 33.00%, prompt: Please read the given sentences, and classify the grammar as 'acceptable' or 'unacceptable'.
Language: korean, acc: 43.00%, prompt: Please classify the sentences as 'acceptable' or 'unacceptable' to judge the grammatical appropriateness.
Language: korean, acc: 18.80%, prompt: Please classify the grammar of the given sentences into 2 categories: 'acceptable' or 'unacceptable'.
Language: korean, acc: 40.20%, prompt: Please classify the given sentences as 'acceptable' or 'unacceptable' to judge the correctness of the grammar.
Language: korean, acc: 34.90%, prompt: Please classify the given sentences as 'acceptable' or 'unacceptable' in order to judge their grammatical appropriateness.
Language: korean, acc: 46.50%, prompt: Please determine the correctness of your grammar by classifying sentences as 'acceptable' or 'unacceptable'.
Language: korean, acc: 48.00%, prompt: Classify the grammar of a given sentence as 'acceptable' or 'unacceptable', and judge its accuracy.
Language: korean, acc: 40.20%, prompt: Please classify the given sentences as 'acceptable' or 'unacceptable' to judge the correctness of the grammar.
Language: korean, acc: 37.20%, prompt: Please rate the accuracy of your grammar by categorizing sentences as 'acceptable' or 'unacceptable'.
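
The translated-prompt records above share one line layout, so per-language averages can be computed directly from the log. The following is a minimal sketch, assuming every record matches `Language: <name>, acc: <value>%, prompt: <text>`; the regular expression and the helper name `mean_acc_by_language` are illustrative and not part of the tool that produced this log.

```python
import re
from collections import defaultdict

# Assumed record layout, inferred from the lines above:
#   Language: <name>, acc: <float>%, prompt: <text>
RECORD = re.compile(r"^Language: (\w+), acc: ([\d.]+)%, prompt: (.*)$")


def mean_acc_by_language(lines):
    """Return {language: mean accuracy in percent} over the matching lines."""
    buckets = defaultdict(list)
    for line in lines:
        match = RECORD.match(line.strip())
        if match:  # silently skip headings, blank lines, other record types
            buckets[match.group(1)].append(float(match.group(2)))
    return {lang: sum(vals) / len(vals) for lang, vals in buckets.items()}


# Three records copied from the section above.
sample = [
    "Language: chinese, acc: 53.80%, prompt: Please evaluate the grammar of the following sentences and mark them as 'acceptable' or 'unacceptable'.",
    "Language: chinese, acc: 54.00%, prompt: Please determine whether the following sentences are 'acceptable' or 'unacceptable' based on grammatical correctness.",
    "Language: korean, acc: 33.00%, prompt: Please read the given sentences, and classify the grammar as 'acceptable' or 'unacceptable'.",
]

for lang, acc in sorted(mean_acc_by_language(sample).items()):
    print(f"{lang}: {acc:.2f}%")
```

Feeding the whole section through the same function would give one mean per language, which makes the spread between prompts within a language easier to compare.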

# mnli

## 10 prompts

Acc: 22.50%, prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Acc: 21.60%, prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Acc: 19.30%, prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Acc: 16.40%, prompt: Determine if the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Acc: 16.30%, prompt: Review the two statements and categorize their relationship as either 'entailment', 'neutral', or 'contradiction':
Acc: 13.30%, prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction':
Acc: 11.80%, prompt: Analyze the two provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Acc: 10.50%, prompt: Considering the two sentences, identify if their relationship is 'entailment', 'neutral', or 'contradiction':
Acc: 3.70%, prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction':
Acc: 3.70%, prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':

Acc: 29.20%, prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Acc: 28.40%, prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Acc: 25.00%, prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Acc: 19.90%, prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
Acc: 19.50%, prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
Acc: 18.70%, prompt: While performing entailment analysis, classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Acc: 18.30%, prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Acc: 17.40%, prompt: In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment', 'neutral', or 'contradiction':
Acc: 12.50%, prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment', 'neutral', or 'contradiction':
Acc: 11.40%, prompt: As a tool for determining entailment relationships, review the two statements and categorize their connection as either 'entailment', 'neutral', or 'contradiction':

## bertattack

Original prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Assess the connection between the former expressions and classify it as 'entailment', 'neutral', or 'contradiction':"
Original acc: 22.50%, attacked acc: 3.40%, dropped acc: 19.10%

Original prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Please classify the relationship between the two pronouns as 'entailment', 'neutral', or 'contradiction':"
Original acc: 21.60%, attacked acc: 3.60%, dropped acc: 18.00%

Original prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"question if the subject between the below sentences is 'entailment', 'neutral', or 'contradiction':"
Original acc: 19.30%, attacked acc: 0.30%, dropped acc: 19.00%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Functioning as an entailment evaluation aid, study the described discourse and determine if their relationship is 'entailment', 'neutral', or 'contradiction':"
Original acc: 29.20%, attacked acc: 3.30%, dropped acc: 25.90%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"using an entailment identification grammar, approach the connected between the following sentence and question with 'entailment', 'neutral', or 'contradiction':"
Original acc: 28.40%, attacked acc: 0.10%, dropped acc: 28.30%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"s as an entailment classtree, query whether the given pair of sentences denotes entailment, neutral, or contradiction. answer with 'entailment', 'neutral', or 'contradiction':"
Original acc: 25.00%, attacked acc: 0.00%, dropped acc: 25.00%

## checklist

Original prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction' rzu98kF2KF:"
Original acc: 22.50%, attacked acc: 15.40%, dropped acc: 7.10%

Original prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction' vTAjHynoIG:"
Original acc: 21.60%, attacked acc: 12.30%, dropped acc: 9.30%

Original prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction' w52rwgo0Av:"
Original acc: 19.30%, attacked acc: 14.80%, dropped acc: 4.50%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction' w52rwgo0Av:"
Original acc: 29.20%, attacked acc: 15.00%, dropped acc: 14.20%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction' 5yWbBXztUY:"
Original acc: 28.40%, attacked acc: 13.40%, dropped acc: 15.00%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction' vTAjHynoIG:"
Original acc: 25.00%, attacked acc: 7.90%, dropped acc: 17.10%

## deepwordbug

Original prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Assses the connection between the followinig sentences and classfiy it as 'entailment', 'neutral', or 'contradiction':"
Original acc: 22.40%, attacked acc: 14.00%, dropped acc: 8.40%

Original prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Plesae classify the relationship between the provide sentencBs as 'entailment', 'neutral', or 'contradiction':"
Original acc: 21.70%, attacked acc: 11.60%, dropped acc: 10.10%

Original prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"InAdicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':"
Original acc: 19.20%, attacked acc: 8.30%, dropped acc: 10.90%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"unctioning as an entailment evaluation tool, analyze the Bprovided sentenves and decBide if their relationship is 'entailment', 'neutral', or 'contradiction':"
Original acc: 29.30%, attacked acc: 13.90%, dropped acc: 15.40%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"As an entailment identification system, exaOmine the connection between the following senBences and erspond with 'entailment', 'neutral', or 'contradiction':"
Original acc: 28.50%, attacked acc: 17.70%, dropped acc: 10.80%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"orking as an entailment classifTer, identify whether the Bgiven pair of sentences Yisplays entailment, neutral, or contradiction. gRespond with 'entailment', 'neutral', or 'contradiction':"
Original acc: 25.10%, attacked acc: 2.40%, dropped acc: 22.70%

## stresstest

Original prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 22.50%, attacked acc: 24.80%, dropped acc: -2.30%

Original prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction'  and false is not true :"
Original acc: 21.60%, attacked acc: 28.30%, dropped acc: -6.70%

Original prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction'  and false is not true :"
Original acc: 19.30%, attacked acc: 26.40%, dropped acc: -7.10%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction'  and false is not true :"
Original acc: 29.20%, attacked acc: 30.80%, dropped acc: -1.60%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 28.40%, attacked acc: 33.10%, dropped acc: -4.70%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 25.00%, attacked acc: 29.50%, dropped acc: -4.50%
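
In the records above, `dropped acc` appears to be the original accuracy minus the attacked accuracy, so a negative value means accuracy rose under the attack (for example, 22.50 - 24.80 = -2.30). The sketch below parses a summary line and rechecks that arithmetic. It assumes the three-field layout shown in this log; the names `parse_summary` and `drop_is_consistent` are hypothetical, and the tolerance is an assumption to absorb rounding of the logged two-decimal values.

```python
import re

# Assumed summary layout, inferred from this log:
#   Original acc: <x>%, attacked acc: <y>%, dropped acc: <z>%
SUMMARY = re.compile(
    r"Original acc: (-?[\d.]+)%, attacked acc: (-?[\d.]+)%, dropped acc: (-?[\d.]+)%"
)


def parse_summary(line):
    """Return (original, attacked, dropped) as floats, or None if no match."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    original, attacked, dropped = (float(group) for group in match.groups())
    return original, attacked, dropped


def drop_is_consistent(original, attacked, dropped, tol=0.02):
    """Check dropped == original - attacked, allowing for two-decimal rounding."""
    return abs((original - attacked) - dropped) <= tol


# One summary line copied from the stresstest records above.
line = "Original acc: 21.60%, attacked acc: 28.30%, dropped acc: -6.70%"
original, attacked, dropped = parse_summary(line)
print(drop_is_consistent(original, attacked, dropped))
print("accuracy rose under attack" if dropped < 0 else "accuracy fell under attack")
```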

## textbugger

Original prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Assess the relationship between the later sen\xf0\x9d\x9a\x9dences and classify it as 'entailment', 'neutral', or 'contradiction':"
Original acc: 22.50%, attacked acc: 1.00%, dropped acc: 21.50%

Original prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"P\xe2\x85\xbcease classify the relationship between the provided sentencing as 'entailment', 'neutral', or 'contradiction':"
Original acc: 21.60%, attacked acc: 4.00%, dropped acc: 17.60%

Original prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Indica\xf0\x9d\x9a\x9de if the connection between the following sentence is 'entailment', 'neutral', or 'contradiction':"
Original acc: 19.30%, attacked acc: 0.50%, dropped acc: 18.80%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Func\xf0\x9d\x9a\x9dioning as an entailment evaluation \xf0\x9d\x9a\x9dool, analyzing the provided sen\xf0\x9d\x9a\x9dences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':"
Original acc: 29.20%, attacked acc: 1.80%, dropped acc: 27.40%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Because an entailment identification scheme, exa\xef\xbd\x8dine the link between the following sentences and \xe2\xb2\x85espond with 'entailment', 'neutral', or 'contradiction':"
Original acc: 28.40%, attacked acc: 3.70%, dropped acc: 24.70%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Wor\xf0\x9d\x92\x8cing as an entailment classifi\xd0\xb5r, identify whether the given p air of sen\xf0\x9d\x9a\x9dences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':"
Original acc: 25.00%, attacked acc: 2.70%, dropped acc: 22.30%
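
The attacked prompts in this section are logged as Python byte-string literals, so the substituted characters show up as `\x..` escapes rather than as the look-alike glyphs the model actually received. A minimal sketch for recovering the text follows, assuming the bytes are UTF-8; the helper names `decode_logged_prompt` and `fold_compat` are illustrative only. NFKC normalization folds Unicode compatibility characters (such as the mathematical monospace letters) back to ASCII, but it does not fold cross-script homoglyphs such as the Cyrillic `\xd0\xb5`.

```python
import ast
import unicodedata


def decode_logged_prompt(literal):
    """Turn a logged b"..." literal into a str, assuming UTF-8 encoded bytes."""
    raw = ast.literal_eval(literal)  # safely evaluates the bytes literal only
    if not isinstance(raw, bytes):
        raise ValueError("expected a bytes literal")
    return raw.decode("utf-8")


def fold_compat(text):
    """Fold Unicode compatibility characters to their plain equivalents."""
    return unicodedata.normalize("NFKC", text)


# An attacked prompt copied from the records above, kept as a raw string
# so the backslash escapes reach ast.literal_eval unchanged.
logged = r'''b"Indica\xf0\x9d\x9a\x9de if the connection between the following sentence is 'entailment', 'neutral', or 'contradiction':"'''

decoded = decode_logged_prompt(logged)
print(unicodedata.name(decoded[6]))  # name of the substituted character
print(fold_compat(decoded))
```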

## textfooler

Original prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Assess the correlation between the following verdicts and classify it as 'entailment', 'neutral', or 'contradiction':"
Original acc: 22.50%, attacked acc: 2.90%, dropped acc: 19.60%

Original prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Ask classify the relationship between the provided priors as 'entailment', 'neutral', or 'contradiction':"
Original acc: 21.60%, attacked acc: 2.00%, dropped acc: 19.60%

Original prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Outlined if the nexus between the following sentence is 'entailment', 'neutral', or 'contradiction':"
Original acc: 19.30%, attacked acc: 1.00%, dropped acc: 18.30%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Task as an entailment estimation utility, analyze the provided condemning and identify if their relationship is 'entailment', 'neutral', or 'contradiction':"
Original acc: 29.20%, attacked acc: 7.70%, dropped acc: 21.50%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Since an entailment detecting scheme, investigates the relation between the following sentences and answer with 'entailment', 'neutral', or 'contradiction':"
Original acc: 28.40%, attacked acc: 1.20%, dropped acc: 27.20%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
Attacked prompt: b"Conjunction as an entailment classifier, locate whether the given doublet of priors denotes entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':"
Original acc: 25.00%, attacked acc: 0.50%, dropped acc: 24.50%

# mrpc

## 10 prompts

Acc: 67.16%, prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Acc: 58.09%, prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Acc: 56.86%, prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Acc: 48.28%, prompt: Examine the pair of sentences and decide if their meanings are identical by answering with 'equivalent' or 'not_equivalent'.
Acc: 47.06%, prompt: Determine if the meanings of the following sentences are semantically equivalent by responding with 'equivalent' or 'not_equivalent'.
Acc: 43.14%, prompt: Do the meanings of these two statements align? Indicate your answer with 'equivalent' or 'not_equivalent'.
Acc: 38.24%, prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
Acc: 19.36%, prompt: Evaluate whether the two provided sentences convey the same meaning by answering 'equivalent' or 'not_equivalent'.
Acc: 19.12%, prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
Acc: 1.96%, prompt: Assess if the two given sentences have equivalent meanings by selecting 'equivalent' or 'not_equivalent'.

Acc: 53.43%, prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Acc: 52.21%, prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Acc: 39.46%, prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Acc: 35.54%, prompt: As a linguistic comparator, review the following pair of sentences and determine their semantic equivalence by choosing 'equivalent' or 'not_equivalent'.
Acc: 34.31%, prompt: As a semantic comparison expert, evaluate the given pair of sentences and determine if they are 'equivalent' or 'not_equivalent'.
Acc: 31.86%, prompt: As a language comparison expert, examine the given pair of sentences and decide if their meanings align, answering with 'equivalent' or 'not_equivalent'.
Acc: 18.63%, prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
Acc: 16.67%, prompt: In your capacity as a semantic assessment tool, evaluate the provided sentences and classify their meanings as 'equivalent' or 'not_equivalent'.
Acc: 14.71%, prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
Acc: 11.52%, prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.

## bertattack

Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Are the meanings of the next in of sentences the same? q with 'equivalent' or 'not_equivalent'. "
Original acc: 67.16%, attacked acc: 0.00%, dropped acc: 67.16%

Original prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Compare the equivalent sentences and confirm if they share the same social meaning by responding with 'equivalent' or 'not_equivalent'. "
Original acc: 58.09%, attacked acc: 0.00%, dropped acc: 58.09%

Original prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Do these two sentences have the same underlying meaning? object with 'equivalent' or 'not_equivalent'. "
Original acc: 56.86%, attacked acc: 0.00%, dropped acc: 56.86%

Original prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Attacked prompt: b"at the instance of a sentence comparative analyst, score the supplied sentences and indicate if they signal the same meaning by being 'equivalent' or 'not_equivalent'. "
Original acc: 53.43%, attacked acc: 1.47%, dropped acc: 51.96%

Original prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"[UNK] your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are different, responding with 'equivalent' or 'not_equivalent'. "
Original acc: 52.21%, attacked acc: 0.00%, dropped acc: 52.21%

Original prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"into the career of a textual comparison specialist, examine the given sentences and rank if they share the same meaning, matching with 'equivalent' or 'not_equivalent'. "
Original acc: 39.46%, attacked acc: 0.00%, dropped acc: 39.46%

## checklist

Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent' vTAjHynoIG. "
Original acc: 67.16%, attacked acc: 29.17%, dropped acc: 37.99%

Original prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent' AMsRIKZniY. "
Original acc: 58.09%, attacked acc: 12.50%, dropped acc: 45.59%

Original prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent' vTAjHynoIG. "
Original acc: 56.86%, attacked acc: 9.31%, dropped acc: 47.55%

Original prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent' Cev0s49fbe. "
Original acc: 53.43%, attacked acc: 0.25%, dropped acc: 53.19%

Original prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent' AMsRIKZniY. "
Original acc: 52.21%, attacked acc: 6.62%, dropped acc: 45.59%

Original prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent' MQnugHcaoy. "
Original acc: 39.46%, attacked acc: 5.64%, dropped acc: 33.82%
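
In every checklist record above, the instruction is unchanged apart from a ten-character alphanumeric token placed before the final period. The sketch below reproduces only that surface transformation. `random_token` and `append_token` are hypothetical helpers, not the attack code that produced this log, and the sketch makes no attempt to search for the most damaging token.

```python
import random
import string


def random_token(rng: random.Random, length: int = 10) -> str:
    """Return a random alphanumeric token shaped like 'vTAjHynoIG'."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def append_token(prompt: str, token: str) -> str:
    """Insert `token` before the final period, mirroring the logged format."""
    body = prompt.rstrip()
    if body.endswith("."):
        body = body[:-1]
    return f"{body} {token}. "


original = (
    "Do these two sentences have the same underlying meaning? "
    "Respond with 'equivalent' or 'not_equivalent'."
)
attacked = append_token(original, "vTAjHynoIG")
print(attacked)
```

Passing the logged token `vTAjHynoIG` reproduces the third attacked prompt in this section character for character, including the trailing space.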

## deepwordbug

Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"tre the meXanings of the Uollowing paOir of sentences the same? AnsQer with 'equivalent' or 'not_equivalent'. "
Original acc: 67.16%, attacked acc: 6.37%, dropped acc: 60.78%

Original prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"oCmpare the follAwing sentewces and determine if they share the same swmantic meaning by responding with 'equivalent' or 'not_equivalent'. "
Original acc: 57.84%, attacked acc: 0.00%, dropped acc: 57.84%

Original prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"DAo these otwo sentecnes have the same underlyCng mLeaning? Respod with 'equivalent' or 'not_equivalent'. "
Original acc: 57.35%, attacked acc: 2.21%, dropped acc: 55.15%

Original prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In the roOe of a sentence comparison anaeyst, assss the providde sentences and indicate if they cSonvey the same meaning by sDelecting 'equivalent' or 'not_equivalent'. "
Original acc: 52.94%, attacked acc: 4.17%, dropped acc: 48.77%

Original prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"IYn your capaciMy as a semantic comparison ool, examine the given sentences and deXide if their meanings are identRcal, rseponding with 'equivalent' or 'not_equivalent'. "
Original acc: 52.21%, attacked acc: 1.47%, dropped acc: 50.74%

Original prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Io the role of a textual comparison specialist, examine the givIen sentences and decide if they share the same meannig, renponding with 'equivalent' or 'not_equivalent'. "
Original acc: 39.46%, attacked acc: 5.15%, dropped acc: 34.31%
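
The deepwordbug prompts above show four kinds of character-level edit: adjacent swap (`oCmpare`), substitution (`follAwing`), insertion (`meXanings`), and deletion (`ool`). The sketch below implements these four edits as plain string functions. The function names are illustrative, and how the attack chooses which words and positions to perturb is not recorded in this log, so positions are passed in explicitly.

```python
def swap(word: str, i: int) -> str:
    """Swap the characters at positions i and i + 1."""
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def substitute(word: str, i: int, ch: str) -> str:
    """Replace the character at position i with ch."""
    return word[:i] + ch + word[i + 1:]


def insert(word: str, i: int, ch: str) -> str:
    """Insert ch before position i."""
    return word[:i] + ch + word[i:]


def delete(word: str, i: int) -> str:
    """Remove the character at position i."""
    return word[:i] + word[i + 1:]


print(swap("Compare", 0))               # oCmpare
print(substitute("following", 4, "A"))  # follAwing
print(insert("meanings", 2, "X"))       # meXanings
print(delete("tool", 0))                # ool
```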

## stresstest

Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 67.16%, attacked acc: 67.89%, dropped acc: -0.74%

Original prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 58.09%, attacked acc: 56.37%, dropped acc: 1.72%

Original prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 56.86%, attacked acc: 58.09%, dropped acc: -1.23%

Original prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 53.43%, attacked acc: 47.79%, dropped acc: 5.64%

Original prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 52.21%, attacked acc: 33.33%, dropped acc: 18.87%

Original prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 39.46%, attacked acc: 38.24%, dropped acc: 1.23%
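
Each stresstest prompt above is the original instruction with the tautology `and true is true` appended five times, followed by a detached period. The sketch below reproduces that string layout, assuming the spacing shown in the log. `append_distractor` is a hypothetical name, not the function used to generate these records.

```python
def append_distractor(prompt: str, phrase: str = "and true is true", times: int = 5) -> str:
    """Drop the final period, append `phrase` `times` times, then add ' . '."""
    body = prompt.rstrip()
    if body.endswith("."):
        body = body[:-1]
    # Each repetition is preceded by two spaces, as in the logged prompts.
    return body + f"  {phrase}" * times + " . "


original = (
    "Do these two sentences have the same underlying meaning? "
    "Respond with 'equivalent' or 'not_equivalent'."
)
print(append_distractor(original))
```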

## textbugger

Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Be the meanings of the \xf0\x9d\x9a\x8following pair of sentences the same? Responding with 'equivalent' or 'not_equivalent'. "
Original acc: 67.16%, attacked acc: 1.96%, dropped acc: 65.20%

Original prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Com\xd1\x80are the later sentences and determine if they share the same semantic meaning by respo\xd5\xb8ding with 'equivalent' or 'not_equivalent'. "
Original acc: 58.09%, attacked acc: 0.49%, dropped acc: 57.60%

Original prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Could these two sentences have the same underlying connotation? Cater with 'equivalent' or 'not_equivalent'. "
Original acc: 56.86%, attacked acc: 1.47%, dropped acc: 55.39%

Original prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Attacked prompt: b"I\xd5\xb8 the rloe of a sentence comparison analyst, asess the provided sentences and indicate if they convey the same meaning by selec\xf0\x9d\x9a\x9ding 'equivalent' or 'not_equivalent'. "
Original acc: 53.43%, attacked acc: 0.00%, dropped acc: 53.43%

Original prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"At your capacity as a seman\xf0\x9d\x9a\x9dic com\xd1\x80arison tool, examine the given sentences and decide if their meanings are identical, respo\xd5\xb8ding with 'equivalent' or 'not_equivalent'. "
Original acc: 52.21%, attacked acc: 0.00%, dropped acc: 52.21%

Original prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"I\xd5\xb8 the role of a textual comparison specialist, examine the given sen\xf0\x9d\x9a\x9dences and decide if they share the same meaning, respondi\xd5\xb8g with 'equivalent' or 'not_equivalent'. "
Original acc: 39.46%, attacked acc: 0.00%, dropped acc: 39.46%
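
The textbugger prompts above are logged as Python bytes literals, so the look-alike characters appear as `\x..` escapes rather than as visible glyphs. The sketch below decodes one logged literal and names each non-ASCII character. It assumes the bytes are UTF-8, which is consistent with the escape sequences shown. `decode_logged_prompt` and `non_ascii_chars` are illustrative helper names.

```python
import ast
import unicodedata


def decode_logged_prompt(literal: str) -> str:
    """Turn a logged b"..." literal into text, assuming UTF-8 bytes."""
    raw = ast.literal_eval(literal)
    return raw.decode("utf-8")


def non_ascii_chars(text: str) -> list:
    """Return (character, Unicode name) pairs for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if ord(ch) > 127]


logged = r'''b"Com\xd1\x80are the later sentences and determine if they share the same semantic meaning by respo\xd5\xb8ding with 'equivalent' or 'not_equivalent'. "'''
text = decode_logged_prompt(logged)
for ch, name in non_ascii_chars(text):
    print(repr(ch), name)
```

Here `\xd1\x80` decodes to a Cyrillic letter that resembles Latin "p", and `\xd5\xb8` to an Armenian letter that resembles Latin "n". In the first record, `\x8following` should be read as the byte `\x8f` followed by `ollowing`, since a `\x` escape consumes exactly two hex digits.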

## textfooler

Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Are the meanings of the following couple of sentences the same? Cope with 'equivalent' or 'not_equivalent'. "
Original acc: 67.16%, attacked acc: 0.49%, dropped acc: 66.67%

Original prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Compare the suite sentences and identify if they share the same semantic meanings by satisfy with 'equivalent' or 'not_equivalent'. "
Original acc: 58.09%, attacked acc: 1.23%, dropped acc: 56.86%

Original prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Do these two sentences have the same nucleus connotation? Reacts with 'equivalent' or 'not_equivalent'. "
Original acc: 56.86%, attacked acc: 1.23%, dropped acc: 55.64%

Original prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
Attacked prompt: b"About the role of a sentence comparison commentator, assess the presented judgements and indicate if they convey the same meaning by option 'equivalent' or 'not_equivalent'. "
Original acc: 53.43%, attacked acc: 2.21%, dropped acc: 51.23%

Original prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Per your powers as a semantic comparison paraphernalia, consider the given sentence and decide if their senses are identical, cope with 'equivalent' or 'not_equivalent'. "
Original acc: 52.21%, attacked acc: 6.13%, dropped acc: 46.08%

Original prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"About the role of a textual comparison specialist, examine the given judgements and decide if they exchanging the same meaning, responses with 'equivalent' or 'not_equivalent'. "
Original acc: 39.46%, attacked acc: 0.00%, dropped acc: 39.46%
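
The textfooler records above differ from their originals by a handful of word substitutions, which are easy to miss in long prompts. The sketch below lists the changed words with the standard-library `difflib` module. `word_changes` is a hypothetical helper for reading the log, and it is not how the attack selects its replacements.

```python
import difflib


def word_changes(original: str, attacked: str) -> list:
    """Return (original words, attacked words) for each span that differs."""
    a, b = original.split(), attacked.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


original = (
    "Are the meanings of the following pair of sentences the same? "
    "Answer with 'equivalent' or 'not_equivalent'."
)
attacked = (
    "Are the meanings of the following couple of sentences the same? "
    "Cope with 'equivalent' or 'not_equivalent'. "
)
for before, after in word_changes(original, attacked):
    print(f"{before!r} -> {after!r}")
```

For the first record in this section, the two reported changes are `pair` to `couple` and `Answer` to `Cope`.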

## translation

Language: chinese, acc: 17.16%, prompt: As an expert in semantic comparison, evaluates two given sentences to determine whether they are 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 16.18%, prompt: From the point of view of semantic comparison, determine whether this pair of sentences is 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 8.58%, prompt: Using the semantic comparison method, determine whether the following two statements are 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 1.96%, prompt: For the following two sentences, determine whether they are 'equivalent' or 'not_equivalent' based on semantic comparison.
Language: chinese, acc: 2.45%, prompt: As an expert in semantic comparison, please evaluate the following two sentences and determine if they are 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 1.96%, prompt: Using semantic comparison techniques, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 0.74%, prompt: Please determine whether the following two sentences are 'equivalent' or 'not_equivalent' according to the standard of semantic comparison.
Language: chinese, acc: 2.45%, prompt: As an expert in the field of semantic comparison, please evaluate the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 0.74%, prompt: Using semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 1.47%, prompt: Determine whether the following two sentences are 'equivalent' or 'not_equivalent' based on semantic comparison.
Language: french, acc: 18.87%, prompt: As an expert in semantic comparison, evaluate the following pair of sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: french, acc: 28.19%, prompt: Can you determine whether the following two sentences are 'equivalent' or 'not_equivalent' as a semantic comparison expert?
Language: french, acc: 6.13%, prompt: Using your expertise in semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.
Language: french, acc: 15.20%, prompt: As a semantic comparison specialist, assess the similarity between the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: french, acc: 4.90%, prompt: Are you able to determine whether the following two sentences are 'equivalent' or 'not_equivalent' as an expert in semantic comparison?
Language: french, acc: 20.34%, prompt: As a semantic comparison professional, evaluate the following pair of sentences and indicate whether they are 'equivalent' or 'not_equivalent'.
Language: french, acc: 15.93%, prompt: Can you determine whether the following two sentences have a 'equivalent' or 'not_equivalent' meaning as an expert in semantic comparison?
Language: french, acc: 29.90%, prompt: As an expert in semantic comparison, assess the similarity between the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: french, acc: 17.89%, prompt: Using your expertise in semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent' in terms of meaning.
Language: french, acc: 7.60%, prompt: As a semantic comparison professional, assess the similarity between the following two sentences and indicate whether they are 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 15.69%, prompt: As an expert in semantic comparison, evaluate the two given sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 8.33%, prompt: Based on my experience in semantic analysis, classify the following two sentences as 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 8.58%, prompt: As an expert in semantic comparison, analyze the following two sentences and classify them as 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 29.90%, prompt: Your task as an expert in semantic comparison is to evaluate the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 13.73%, prompt: As a semantic comparison specialist, analyze the two data statements and insert them into one of the following categories: 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 24.75%, prompt: Based on my experience in semantic analysis, classify the following two sentences between 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 12.01%, prompt: Your role as a semantic comparison specialist requires analyzing the two given sentences and determining whether they are 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 6.13%, prompt: As an experienced semantic analyst, classify the following two sentences as 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 6.37%, prompt: Your job as a semantic analyst evaluates the following two sentences as 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 14.22%, prompt: As a semantic analyst, determine whether the given sentences are 'equivalent' or 'not_equivalent' based on their relationship.
Language: spanish, acc: 15.44%, prompt: As an expert in semantic comparison, it evaluates the pair of sentences provided and determines whether they are 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 8.33%, prompt: Based on my experience in semantic analysis, classify the following two sentences as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 23.28%, prompt: As an expert in semantic comparison, analyze the two sentences given and classify them as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 19.61%, prompt: Your task as a semantic comparison specialist is to evaluate the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 26.96%, prompt: As an expert in semantic analysis, he makes a classification of the following two sentences based on their 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 9.80%, prompt: Based on your experience of semantic comparison, classify the next two sentences as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 9.07%, prompt: As a specialist in semantic analysis, you are given the task of analysing the two sentences given and classifying them as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 2.21%, prompt: As an expert in semantic comparison, he classifies the following two sentences into 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 8.58%, prompt: As a specialist in semantic analysis, evaluate the following two sentences and classify them as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 38.48%, prompt: Your task as an expert in semantic comparison is to analyze the two sentences provided and determine whether they are 'equivalent' or 'not_equivalent' based on their semantic relationship.
Language: japanese, acc: 16.18%, prompt: Evaluate whether a given pair of sentences is 'equivalent' or 'not_equivalent', depending on the context.
Language: japanese, acc: 16.67%, prompt: Use a semantic comparison to determine whether a given pair of sentences is 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 4.17%, prompt: Evaluate a given pair of sentences as 'equivalent' or 'not_equivalent' by determining whether they have the same semantic meaning.
Language: japanese, acc: 59.80%, prompt: Determine whether a given pair of sentences is synonyms and evaluate whether they are 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 30.15%, prompt: Determine whether a given pair of sentences is 'equivalent' or 'not_equivalent', and whether they are semantically identical.
Language: japanese, acc: 54.17%, prompt: Determinate whether a given pair of sentences has the same meaning and evaluate whether they are 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 9.80%, prompt: Evaluate whether a given pair of sentences is 'equivalent' or 'not_equivalent' by determining whether they are semantically identical.
Language: japanese, acc: 39.95%, prompt: Judge whether a given pair of sentences is equal and evaluate whether they are 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 51.23%, prompt: Determinate whether a given pair of sentences are semantically equal and evaluate whether they are 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 10.05%, prompt: Whether a given pair of sentences is 'equivalent' or 'not_equivalent' depends on the context.
Language: korean, acc: 25.00%, prompt: As a sentence comparator, evaluate the two sentences given to determine 'equivalent' or 'not_equivalent'.
Language: korean, acc: 9.56%, prompt: Compare two sentences to determine 'equivalent' or 'not_equivalent'. For this you need qualifications as a specialist in semantic comparison.
Language: korean, acc: 4.41%, prompt: It takes your knowledge as an expert in semantic comparison to determine that two sentences are 'equivalent' or 'not_equivalent'.
Language: korean, acc: 29.17%, prompt: As a specialist in semantic comparison, evaluate whether two given sentences are 'equivalent' or 'not_equivalent'.
Language: korean, acc: 38.48%, prompt: Analyze two sentences to determine 'equivalent' or 'not_equivalent'. For that you need the knowledge of a semantic comparison expert.
Language: korean, acc: 18.63%, prompt: As an expert in semantic comparison, decide whether two sentences are 'equivalent' or 'not_equivalent'.
Language: korean, acc: 9.07%, prompt: It takes the knowledge of an expert in semantic comparison to compare two sentences to judge 'equivalent' or 'not_equivalent'.
Language: korean, acc: 18.87%, prompt: Experience as an expert in semantic comparison is required to determine whether two given sentences are 'equivalent' or 'not_equivalent'.
Language: korean, acc: 23.28%, prompt: As an expert in semantic comparison, determine whether two sentences are 'equivalent' or 'not_equivalent'.
Language: korean, acc: 19.85%, prompt: Analyze two sentences to determine 'equivalent' or 'not_equivalent'. For this, you need a qualification as a specialist in semantic comparison.
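
The translation records above follow a fixed `Language: ..., acc: ...%, prompt: ...` layout, so per-language averages can be recomputed from the log itself. The sketch below parses such lines and averages the accuracies. The regular expression and `mean_accuracy_by_language` are assumptions about the layout as it appears here, not part of the tooling that wrote the log.

```python
import re
from collections import defaultdict
from statistics import mean

LINE_RE = re.compile(
    r"^Language: (?P<lang>\w+), acc: (?P<acc>-?\d+(?:\.\d+)?)%, prompt: (?P<prompt>.*)$"
)


def mean_accuracy_by_language(lines) -> dict:
    """Average the logged accuracy (in percent) for each language."""
    by_lang = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            by_lang[m["lang"]].append(float(m["acc"]))
    return {lang: mean(accs) for lang, accs in by_lang.items()}


# Four lines copied from the records above.
sample = [
    "Language: chinese, acc: 0.74%, prompt: Using semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.",
    "Language: chinese, acc: 1.47%, prompt: Determine whether the following two sentences are 'equivalent' or 'not_equivalent' based on semantic comparison.",
    "Language: korean, acc: 18.63%, prompt: As an expert in semantic comparison, decide whether two sentences are 'equivalent' or 'not_equivalent'.",
    "Language: korean, acc: 23.28%, prompt: As an expert in semantic comparison, determine whether two sentences are 'equivalent' or 'not_equivalent'.",
]
for lang, acc in mean_accuracy_by_language(sample).items():
    print(f"{lang}: {acc:.2f}%")
```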

# qnli

## 10 prompts

Acc: 37.90%, prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Acc: 30.90%, prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Acc: 30.50%, prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Acc: 30.10%, prompt: Consider the context and question, and indicate if the answer can be logically deduced from the context by responding with 'entailment' or 'not_entailment'.
Acc: 29.30%, prompt: Based on the information in the context, decide if the answer to the question is justified by choosing 'entailment' or 'not_entailment'.
Acc: 26.50%, prompt: Please assess if the answer to the question can be derived from the given context by selecting 'entailment' or 'not_entailment'.
Acc: 10.50%, prompt: Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment'.
Acc: 10.20%, prompt: Assess if the answer to the question can be logically concluded from the provided context by choosing 'entailment' or 'not_entailment'.
Acc: 10.00%, prompt: Examine the context and question, and determine if the context logically implies the answer by selecting 'entailment' or 'not_entailment'.
Acc: 7.60%, prompt: Analyze the context and question, and indicate if the context entails the answer by choosing 'entailment' or 'not_entailment'.

Acc: 38.40%, prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Acc: 37.30%, prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Acc: 35.30%, prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Acc: 33.90%, prompt: As a linguistic consultant, decide if the answer to the question is logically supported by the provided context and respond with 'entailment' or 'not_entailment'.
Acc: 27.90%, prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'.
Acc: 26.40%, prompt: As a textual analyst, examine if the given context logically implies the answer to the question and indicate your decision with 'entailment' or 'not_entailment'.
Acc: 23.40%, prompt: In the capacity of a language specialist, decide if the context presented contains enough information to infer the answer to the question and respond with 'entailment' or 'not_entailment'.
Acc: 20.10%, prompt: As a semantic interpreter, assess whether the provided context supports the answer to the given question and answer with 'entailment' or 'not_entailment'.
Acc: 16.60%, prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'.
Acc: 7.30%, prompt: In the capacity of a language evaluator, examine if the given context justifies the answer to the question and indicate your assessment with 'entailment' or 'not_entailment'.
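
The attack sections that follow for this task use the three highest-accuracy prompts from each of the two groups above. The sketch below shows one way to make that selection from the log text, assuming the groups are separated by a blank line as they are here. `top_k_per_group` is a hypothetical helper name.

```python
import re

ACC_RE = re.compile(r"^Acc: (?P<acc>\d+(?:\.\d+)?)%, prompt: (?P<prompt>.*)$")


def top_k_per_group(text: str, k: int = 3) -> list:
    """Split on blank lines and keep the k highest-accuracy prompts per group."""
    groups = []
    for block in re.split(r"\n\s*\n", text.strip()):
        rows = []
        for line in block.splitlines():
            m = ACC_RE.match(line.strip())
            if m:
                rows.append((float(m["acc"]), m["prompt"]))
        if rows:
            rows.sort(key=lambda row: row[0], reverse=True)
            groups.append(rows[:k])
    return groups


# Two lines from each group above, deliberately out of order.
sample = """\
Acc: 10.50%, prompt: Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment'.
Acc: 37.90%, prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.

Acc: 16.60%, prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'.
Acc: 35.30%, prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
"""
for group in top_k_per_group(sample, k=1):
    for acc, prompt in group:
        print(f"{acc:.2f}% {prompt}")
```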

## bertattack

Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"and whether the following context supports the answer to the question by responses with 'entailment' or 'not_entailment'. "
Original acc: 37.90%, attacked acc: 1.60%, dropped acc: 36.30%

Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"1 on the provided subject and question, decided if the information supports the response by responses with 'entailment' or 'not_entailment'. "
Original acc: 30.90%, attacked acc: 4.40%, dropped acc: 26.50%

Original prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Attacked prompt: b"Review the given context and question, and decide if the context requires enough information to let the answer by choices 'entailment' or 'not_entailment'. "
Original acc: 30.50%, attacked acc: 0.00%, dropped acc: 30.50%

Original prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Attacked prompt: b"example a textual in expert, analyzing if the answer to the question can be deduced from the language context and think 'entailment' or 'not_entailment'. "
Original acc: 38.40%, attacked acc: 0.40%, dropped acc: 38.00%

Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Attacked prompt: b"In your role as a linguistic investigator, tell if the conversation given outis the answer to the question and provide your consensus with 'entailment' or 'not_entailment'. "
Original acc: 37.30%, attacked acc: 0.60%, dropped acc: 36.70%

Original prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"As a language expert, assessed if the context citation enproduces the answers to the question and approach with 'entailment' or 'not_entailment'. "
Original acc: 35.30%, attacked acc: 0.40%, dropped acc: 34.90%
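
In each record, `dropped acc` is the original accuracy minus the attacked accuracy, for example 37.90% − 1.60% = 36.30%. The sketch below parses one accuracy line and recomputes the drop. Some records in this log differ from the recomputed value by 0.01, presumably because the drop was computed before rounding, so the check allows a small tolerance. `check_drop` and its tolerance are illustrative choices.

```python
import re

NUM = r"-?\d+(?:\.\d+)?"
ACC_LINE_RE = re.compile(
    rf"Original acc: (?P<orig>{NUM})%, "
    rf"attacked acc: (?P<att>{NUM})%, "
    rf"dropped acc: (?P<drop>{NUM})%"
)


def check_drop(line: str, tolerance: float = 0.02):
    """Return (recomputed drop, whether it matches the logged drop)."""
    m = ACC_LINE_RE.search(line)
    if m is None:
        raise ValueError(f"not an accuracy line: {line!r}")
    orig, att, drop = (float(m[name]) for name in ("orig", "att", "drop"))
    recomputed = orig - att
    return recomputed, abs(recomputed - drop) <= tolerance


line = "Original acc: 37.90%, attacked acc: 1.60%, dropped acc: 36.30%"
recomputed, consistent = check_drop(line)
print(f"{recomputed:.2f} {consistent}")
```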

## checklist

Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment' rzu98kF2KF. "
Original acc: 37.90%, attacked acc: 3.40%, dropped acc: 34.50%

Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment' 9vGXgnbWB8. "
Original acc: 30.90%, attacked acc: 3.20%, dropped acc: 27.70%

Original prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Attacked prompt: b"Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment' 9vGXgnbWB8. "
Original acc: 30.50%, attacked acc: 2.90%, dropped acc: 27.60%

Original prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Attacked prompt: b"As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment' 9vGXgnbWB8. "
Original acc: 38.40%, attacked acc: 10.80%, dropped acc: 27.60%

Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Attacked prompt: b"In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment' SvAp8RlOFn. "
Original acc: 37.30%, attacked acc: 16.10%, dropped acc: 21.20%

Original prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment' RcRneWo6Iv. "
Original acc: 35.30%, attacked acc: 5.20%, dropped acc: 30.10%
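
To compare attacks, the per-record drops can be averaged per section. The sketch below groups `dropped acc` values by the enclosing `# task` and `## attack` headings and takes the mean. It assumes the heading layout used in this log, and `mean_drop_by_section` is a hypothetical helper.

```python
import re
from collections import defaultdict
from statistics import mean

DROP_RE = re.compile(r"dropped acc: (-?\d+(?:\.\d+)?)%")


def mean_drop_by_section(text: str) -> dict:
    """Average 'dropped acc' for each (task, attack) pair of headings."""
    task, attack = None, None
    drops = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("## "):
            attack = line[3:].strip()
        elif line.startswith("# "):
            task, attack = line[2:].strip(), None
        else:
            m = DROP_RE.search(line)
            if m and attack is not None:
                drops[(task, attack)].append(float(m.group(1)))
    return {key: mean(values) for key, values in drops.items()}


# Accuracy lines copied from this task's checklist and deepwordbug sections.
sample = """\
# qnli
## checklist
Original acc: 37.90%, attacked acc: 3.40%, dropped acc: 34.50%
Original acc: 30.90%, attacked acc: 3.20%, dropped acc: 27.70%
## deepwordbug
Original acc: 38.10%, attacked acc: 6.90%, dropped acc: 31.20%
"""
for (task, attack), drop in mean_drop_by_section(sample).items():
    print(f"{task}/{attack}: {drop:.2f}")
```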

## deepwordbug

Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"EvaluaDe whether the igven context suppors the ansewr to the quHstion by respoding with 'entailment' or 'not_entailment'. "
Original acc: 38.10%, attacked acc: 6.90%, dropped acc: 31.20%

Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Based on the povided ocntext and quetion, decide if the inofrmation supports the answer by responding with 'entailment' or 'not_entailment'. "
Original acc: 30.80%, attacked acc: 12.90%, dropped acc: 17.90%

Original prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Attacked prompt: b"Rview the gven cnotext and question, and edcide if the context containZs enough information to Vsupport the ainswer by selecting 'entailment' or 'not_entailment'. "
Original acc: 30.50%, attacked acc: 4.70%, dropped acc: 25.80%

Original prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Attacked prompt: b"fs a textual inference xpert, anaRyze if the answer to the question can be deduced from the provided cotnext and yselect 'entailment' or 'not_entailment'. "
Original acc: 38.50%, attacked acc: 6.20%, dropped acc: 32.30%

Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Attacked prompt: b"In your role as a lignuistic investigator, determine if the cnotext given entavils the anwser to the question and proide your conclusion with 'entailment' or 'not_entailment'. "
Original acc: 37.30%, attacked acc: 5.20%, dropped acc: 32.10%

Original prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"As a language epxert, arssess if the gien contet entaxls the ansGer to the questAon and resNpond with 'entailment' or 'not_entailment'. "
Original acc: 35.30%, attacked acc: 2.10%, dropped acc: 33.20%

## stresstest

Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 37.90%, attacked acc: 19.80%, dropped acc: 18.10%

Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 30.90%, attacked acc: 16.20%, dropped acc: 14.70%

Original prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Attacked prompt: b"Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 30.50%, attacked acc: 33.70%, dropped acc: -3.20%

Original prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Attacked prompt: b"As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 38.40%, attacked acc: 29.50%, dropped acc: 8.90%

Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Attacked prompt: b"In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'  and true is true . "
Original acc: 37.30%, attacked acc: 28.00%, dropped acc: 9.30%

Original prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 35.30%, attacked acc: 19.80%, dropped acc: 15.50%
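
Note on reading the accuracy lines: in the entries above, `dropped acc` appears to be `Original acc` minus `attacked acc`, so a negative value (such as -3.20% in this section) means the attacked prompt scored higher than the original. The sketch below parses one such line and checks that relationship. The names `parse_acc_line` and `drop_is_consistent` are hypothetical helpers for illustration and are not part of whatever tooling produced this log.

```python
import re

# Hypothetical helper, not part of the original tooling: matches lines of the
# form "Original acc: X%, attacked acc: Y%, dropped acc: Z%".
ACC_LINE = re.compile(
    r"Original acc: (-?\d+\.\d+)%, "
    r"attacked acc: (-?\d+\.\d+)%, "
    r"dropped acc: (-?\d+\.\d+)%"
)


def parse_acc_line(line):
    """Return (original, attacked, dropped) as floats, in percent."""
    match = ACC_LINE.search(line)
    if match is None:
        raise ValueError(f"not an accuracy line: {line!r}")
    original, attacked, dropped = (float(g) for g in match.groups())
    return original, attacked, dropped


def drop_is_consistent(line, tol=0.05):
    """Check the assumed relation: dropped == original - attacked (within tol)."""
    original, attacked, dropped = parse_acc_line(line)
    return abs((original - attacked) - dropped) <= tol


# Line copied from the stresstest section above.
line = "Original acc: 30.50%, attacked acc: 33.70%, dropped acc: -3.20%"
print(parse_acc_line(line))
print(drop_is_consistent(line))
```

The tolerance only absorbs rounding in the two-decimal percentages; it is an arbitrary choice for this sketch.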

## textbugger

Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"E\xd1\xb5aluate whether the give\xd5\xb8 con\xf0\x9d\x9a\x9dext supports the answer to the question by responding with 'entailment' or 'not_entailment'. "
Original acc: 37.90%, attacked acc: 0.50%, dropped acc: 37.40%

Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Ba\xd1\x95ed on the provided context and question, deciding if the particulars suppor\xf0\x9d\x9a\x9ds the answer by responds with 'entailment' or 'not_entailment'. "
Original acc: 30.90%, attacked acc: 0.40%, dropped acc: 30.50%

Original prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Attacked prompt: b"Revisited the afforded cont\xd0\xb5xt and question, and decide if the context contains enough information to support the answer by selec\xf0\x9d\x9a\x9ding 'entailment' or 'not_entailment'. "
Original acc: 30.50%, attacked acc: 0.00%, dropped acc: 30.50%

Original prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Attacked prompt: b"A\xd1\x95 a textual deduced expe\xe2\xb2\x85t, analyzes if the replying to the question can be deduced from the provided context and selec\xf0\x9d\x9a\x9d 'entailment' or 'not_entailment'. "
Original acc: 38.40%, attacked acc: 0.10%, dropped acc: 38.30%

Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Attacked prompt: b"In your \xe2\xb2\x85ole as a linguistic researcher, determine if the con\xf0\x9d\x9a\x9dext yielded entai\xe2\x85\xbcs the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. "
Original acc: 37.30%, attacked acc: 0.30%, dropped acc: 37.00%

Original prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"A\xd1\x95 a linguistic ex\xd1\x80ert, assessing if the given con\xf0\x9d\x9a\x9dext implies the answer to the question and responding with 'entailment' or 'not_entailment'. "
Original acc: 35.30%, attacked acc: 0.10%, dropped acc: 35.20%
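
Note on the escaped bytes: the attacked prompts are logged in a form that looks like a Python bytes literal, so the `\x..` sequences in this section are UTF-8 bytes of the look-alike characters the attack substituted for ASCII letters. A minimal sketch for recovering the text, assuming the logged value is a valid bytes literal encoded as UTF-8; `decode_logged_prompt` and `non_ascii_chars` are hypothetical names, and the sample is a shortened excerpt of the first attacked prompt in this section.

```python
import ast
import unicodedata


def decode_logged_prompt(literal):
    """Turn a logged b"..." literal into a str, assuming UTF-8 content."""
    raw = ast.literal_eval(literal)  # safely evaluates the bytes literal
    if not isinstance(raw, bytes):
        raise TypeError("expected a bytes literal")
    return raw.decode("utf-8")


def non_ascii_chars(text):
    """List (character, Unicode name) pairs for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if ord(ch) > 127]


# Shortened excerpt of an attacked prompt from the textbugger section.
logged = r'b"E\xd1\xb5aluate whether the give\xd5\xb8 con\xf0\x9d\x9a\x9dext supports the answer"'

decoded = decode_logged_prompt(logged)
print(decoded)
for ch, name in non_ascii_chars(decoded):
    print(f"U+{ord(ch):04X} {name}")
```

Listing the non-ASCII characters makes the substitutions visible even when the decoded prompt renders almost identically to the original.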

## textfooler

Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Weighed whether the given context assistance the reaction to the question by answered with 'entailment' or 'not_entailment'. "
Original acc: 37.90%, attacked acc: 1.00%, dropped acc: 36.90%

Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
Attacked prompt: b"Based on the presented framework and question, decide if the info supports the reacting by meet with 'entailment' or 'not_entailment'. "
Original acc: 30.90%, attacked acc: 1.60%, dropped acc: 29.30%

Original prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
Attacked prompt: b"Reconsidered the given context and question, and ascertain if the context comprises enough information to favour the answer by choices 'entailment' or 'not_entailment'. "
Original acc: 30.50%, attacked acc: 0.00%, dropped acc: 30.50%

Original prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
Attacked prompt: b"Because a textual inference practitioner, analysed if the reaction to the subject can be deduced from the awarded context and click 'entailment' or 'not_entailment'. "
Original acc: 38.40%, attacked acc: 1.20%, dropped acc: 37.20%

Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
Attacked prompt: b"Among your features as a linguistic fellows, define if the ambient yielded translates the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. "
Original acc: 37.30%, attacked acc: 0.90%, dropped acc: 36.40%

Original prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"As a language expert, weighed if the delivered framework translates the answer to the question and respondent with 'entailment' or 'not_entailment'. "
Original acc: 35.30%, attacked acc: 0.60%, dropped acc: 34.70%

## translation

Language: chinese, acc: 29.30%, prompt: Determine whether the answer is implied or not based on the context. Answer 'entailment' or 'not_entailment'.
Language: chinese, acc: 12.80%, prompt: As a language expert, determine whether the text contains the answer and choose 'entailment' or 'not_entailment'.
Language: chinese, acc: 19.30%, prompt: Determine whether the text implies an answer, and select 'entailment' or 'not_entailment'.
Language: chinese, acc: 35.90%, prompt: Given the text and the question, whether the answer is implied, select 'entailment' or 'not_entailment'.
Language: chinese, acc: 16.60%, prompt: Determine whether the text contains the answer, depending on the context. Select 'entailment' or 'not_entailment'.
Language: chinese, acc: 25.80%, prompt: As a language expert, determine whether a text can contain an answer, and say 'entailment' or 'not_entailment'.
Language: chinese, acc: 40.30%, prompt: Please determine whether the text implies an answer. Answer 'entailment' or 'not_entailment'.
Language: chinese, acc: 20.00%, prompt: Please select 'entailment' or 'not_entailment' based on the text and the question.
Language: chinese, acc: 33.90%, prompt: Assess whether the answer is implied based on the context. Answer 'entailment' or 'not_entailment'.
Language: chinese, acc: 40.90%, prompt: Please determine whether the text contains the answer and answer 'entailment' or 'not_entailment'.
Language: french, acc: 43.90%, prompt: As a linguistic expert, assess whether the given context involves the answer to the question and answer with 'entailment' or 'not_entailment'.
Language: french, acc: 2.40%, prompt: Determine whether the information provided in the context necessarily leads to the answer to the question asked and indicate 'entailment' or 'not_entailment'.
Language: french, acc: 11.50%, prompt: Analyze the text to determine if the answer to the question is implied in the context and specify 'entailment' or 'not_entailment'.
Language: french, acc: 40.00%, prompt: Based on the given context, decide whether the answer to the question is necessarily involved and mark 'entailment' or 'not_entailment'.
Language: french, acc: 14.40%, prompt: Evaluate whether the answer to the question can be deduced from the given context and mark 'entailment' or 'not_entailment'.
Language: french, acc: 23.20%, prompt: Discern whether the context provided directly involves the answer to the question and indicate 'entailment' or 'not_entailment'.
Language: french, acc: 26.60%, prompt: Determine if the context contains enough information to involve the answer to the question and mark 'entailment' or 'not_entailment'.
Language: french, acc: 11.20%, prompt: Assess whether the context provided necessarily leads to the answer to the question and answer with 'entailment' or 'not_entailment'.
Language: french, acc: 17.10%, prompt: Analyze the text to determine if the answer to the question is involved in the context and indicate 'entailment' or 'not_entailment'.
Language: french, acc: 30.90%, prompt: Based on the given context, decide whether the answer to the question is necessarily inferred and mark 'entailment' or 'not_entailment'.
Language: arabic, acc: 32.90%, prompt: As a language expert, evaluate whether the given context calls for an answer and answer 'entailment' or 'not_entailment'.
Language: arabic, acc: 8.70%, prompt: Judge the relationship between the text and the question and answer 'entailment' or 'not_entailment', depending on your language experience.
Language: arabic, acc: 12.30%, prompt: Does the context given indicate the answer to the question? Evaluate and answer 'entailment' or 'not_entailment'.
Language: arabic, acc: 44.40%, prompt: Based on your linguistic knowledge, does the text relate to the question? Answer 'entailment' or 'not_entailment'.
Language: arabic, acc: 5.10%, prompt: As a language expert, determine how the text relates to the question and answer 'entailment' or 'not_entailment'.
Language: arabic, acc: 27.40%, prompt: Does the text support the answer to the question? Answer 'entailment' or 'not_entailment', depending on your language experience.
Language: arabic, acc: 6.90%, prompt: Check the text link to the question and answer 'entailment' or 'not_entailment', depending on your language skills.
Language: arabic, acc: 34.30%, prompt: As a language expert, is there a link between the text and the question? Answer 'entailment' or 'not_entailment'.
Language: arabic, acc: 7.50%, prompt: Based on your language experience, does context help to answer the question? Evaluate and answer 'entailment' or 'not_entailment'.
Language: arabic, acc: 20.80%, prompt: Does the text give a clear answer to the question? Answer 'entailment' or 'not_entailment', depending on your language experience.
Language: spanish, acc: 39.40%, prompt: As a language expert, evaluate whether the given context implies the answer to the question and answer with 'entailment' or 'not_entailment'.
Language: spanish, acc: 27.40%, prompt: Determine whether the information given in the text necessarily implies the veracity of the hypothesis and answer 'entailment' or 'not_entailment'.
Language: spanish, acc: 4.40%, prompt: Analyzes whether the information presented in the paragraph leads to the conclusion of the question and labels the answer as 'entailment' or 'not_entailment'.
Language: spanish, acc: 5.20%, prompt: Indicates whether the information provided in the text is sufficient to conclude the statement and labels the response as 'entailment' or 'not_entailment'.
Language: spanish, acc: 24.80%, prompt: As an expert on the subject, judge whether the information provided in the text justifies the claim and classify the answer as 'entailment' or 'not_entailment'.
Language: spanish, acc: 10.90%, prompt: Evaluates whether the information in the paragraph necessarily supports the conclusion of the hypothesis and responds 'entailment' or 'not_entailment'.
Language: spanish, acc: 4.90%, prompt: Determines whether the information presented in the text logically implies the answer to the question and labels the answer as 'entailment' or 'not_entailment'.
Language: spanish, acc: 5.10%, prompt: Analyzes whether the information provided in the paragraph necessarily leads to the veracity of the hypothesis and classifies the response as 'entailment' or 'not_entailment'.
Language: spanish, acc: 27.00%, prompt: As an expert on the subject, evaluate whether the information presented in the text supports the claim and respond 'entailment' or 'not_entailment'.
Language: spanish, acc: 5.30%, prompt: Indicates whether the information provided in the paragraph necessarily implies the answer to the question and labels the answer as 'entailment' or 'not_entailment'.
Language: japanese, acc: 42.60%, prompt: Rate whether the answer to the question is derived from the given context and answer with 'entailment' or 'not_entailment'.
Language: japanese, acc: 18.10%, prompt: Please answer 'entailment' or 'not_entailment' for the given context and question.
Language: japanese, acc: 42.20%, prompt: Decide whether the answer to the question is derived from the given context and answer 'entailment' or 'not_entailment'.
Language: japanese, acc: 18.90%, prompt: Compare the question with the given context and give the answer 'entailment' or 'not_entailment'.
Language: japanese, acc: 28.90%, prompt: Determinate whether the given context contains the answer to the question and answer with 'entailment' or 'not_entailment'.
Language: japanese, acc: 29.80%, prompt: Estimate the answer of the question from the context and give the answer 'entailment' or 'not_entailment'.
Language: japanese, acc: 22.90%, prompt: Determinate whether the given context is relevant to the question and answer with 'entailment' or 'not_entailment'.
Language: japanese, acc: 27.40%, prompt: Determine whether the given context is relevant to the question and answer with 'entailment' or 'not_entailment'.
Language: japanese, acc: 27.90%, prompt: Determinate whether the given context contains the answer to the question and answer 'entailment' or 'not_entailment'.
Language: japanese, acc: 18.00%, prompt: Answer with 'entailment' or 'not_entailment', inferring from the given context.
Language: korean, acc: 39.60%, prompt: Determine if a given sentence necessarily implies the meaning of another sentence and answer 'entailment' or 'not_entailment'.
Language: korean, acc: 35.10%, prompt: By understanding the relations between sentences, judge whether a given sentence necessarily refers to another sentence and answer with 'entailment' or 'not_entailment'.
Language: korean, acc: 34.30%, prompt: Evaluate whether a given text necessarily indicates the meaning of another text and respond with 'entailment' or 'not_entailment'.
Language: korean, acc: 40.00%, prompt: Understand the relations of a sentence, to determine whether a given sentence necessarily includes other sentences and answer with 'entailment' or 'not_entailment'.
Language: korean, acc: 42.30%, prompt: Judge whether a given content necessarily implies the meaning of another content and answer with 'entailment' or 'not_entailment'.
Language: korean, acc: 31.70%, prompt: Grasp the relations between sentences, determine if a given sentence necessarily contains the meaning of another sentence and respond with 'entailment' or 'not_entailment'.
Language: korean, acc: 42.50%, prompt: Evaluate whether a given text necessarily refers to another text and answer with 'entailment' or 'not_entailment'.
Language: korean, acc: 29.30%, prompt: By comparing the meaning of the sentences, to determine if a given sentence necessarily implies another sentence and answer 'entailment' or 'not_entailment'.
Language: korean, acc: 44.50%, prompt: Evaluate whether the contents given necessarily refer to other contents and answer with 'entailment' or 'not_entailment'.
Language: korean, acc: 35.50%, prompt: By analyzing the relations between sentences, determine if a given sentence does not necessarily include other sentences and answer with 'entailment' or 'not_entailment'.
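
Note on comparing languages: accuracy varies widely between prompts of the same language in the list above, so a per-language summary can be easier to read than individual lines. The sketch below groups `Language: ..., acc: ...%, prompt: ...` lines by language and averages them. The helper name `mean_acc_by_language` is hypothetical, and only four lines copied from the list above are used as sample input, so the printed means are not summaries of the full section.

```python
import re
from collections import defaultdict

# Hypothetical helper, not part of the original tooling: matches lines of the
# form "Language: <name>, acc: <value>%, prompt: <text>".
LANG_LINE = re.compile(r"Language: (\w+), acc: (\d+\.\d+)%, prompt: (.*)")


def mean_acc_by_language(lines):
    """Return {language: mean accuracy in percent} for the matching lines."""
    grouped = defaultdict(list)
    for line in lines:
        match = LANG_LINE.match(line.strip())
        if match:  # silently skip headings, blanks, and other line types
            grouped[match.group(1)].append(float(match.group(2)))
    return {lang: sum(accs) / len(accs) for lang, accs in grouped.items()}


# Four lines copied from the translation section above (sample only).
sample = [
    "Language: french, acc: 43.90%, prompt: As a linguistic expert, assess whether the given context involves the answer to the question and answer with 'entailment' or 'not_entailment'.",
    "Language: french, acc: 2.40%, prompt: Determine whether the information provided in the context necessarily leads to the answer to the question asked and indicate 'entailment' or 'not_entailment'.",
    "Language: korean, acc: 39.60%, prompt: Determine if a given sentence necessarily implies the meaning of another sentence and answer 'entailment' or 'not_entailment'.",
    "Language: korean, acc: 35.10%, prompt: By understanding the relations between sentences, judge whether a given sentence necessarily refers to another sentence and answer with 'entailment' or 'not_entailment'.",
]

for lang, mean in mean_acc_by_language(sample).items():
    print(f"{lang}: {mean:.2f}%")
```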

# qqp

## 10 prompts

Acc: 30.00%, prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Acc: 24.10%, prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Acc: 23.90%, prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Acc: 22.60%, prompt: Assess whether the following statements are identical in meaning by answering 'equivalent' or 'not_equivalent'.
Acc: 22.40%, prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'.
Acc: 20.60%, prompt: Do the following expressions mean the same thing? Provide your answer as 'equivalent' or 'not_equivalent'.
Acc: 20.40%, prompt: Examine the following expressions and tell me if they are alike in meaning by using 'equivalent' or 'not_equivalent'.
Acc: 20.10%, prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'.
Acc: 18.30%, prompt: Are the following two questions equivalent or not? Answer me with "equivalent" or "not_equivalent".
Acc: 15.40%, prompt: Evaluate whether these two phrases have identical meanings and respond with 'equivalent' or 'not_equivalent'.
Acc: 6.00%, prompt: Are the meanings of these two phrases the same? Reply with 'equivalent' or 'not_equivalent'.

Acc: 21.10%, prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Acc: 16.90%, prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Acc: 12.50%, prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Acc: 11.60%, prompt: Functioning as a question similarity evaluation tool, analyze the given questions and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'.
Acc: 11.30%, prompt: Working as a question classifier, identify whether the given questions share the same meaning, and respond with 'equivalent' for equivalent questions or 'not_equivalent' for different ones.
Acc: 10.80%, prompt: In the capacity of a question assessment system, indicate if the meaning of the provided questions is the same, responding with 'equivalent' or 'not_equivalent'.
Acc: 9.70%, prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'.
Acc: 9.10%, prompt: As a question equivalence detection system, examine the provided questions and respond with 'equivalent' if they are the same in meaning, or 'not_equivalent' if they are different.
Acc: 7.60%, prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Acc: 5.50%, prompt: In your role as a question comparison tool, assess the following pair of questions and classify them as 'equivalent' or 'not_equivalent'.

## bertattack

Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"please indicate if the cross over of items share the same meaning by responding with 'equivalent' or 'not_equivalent'. "
Original acc: 30.00%, attacked acc: 1.50%, dropped acc: 28.50%

Original prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"is if the given set of and have the same connotation by asking with 'equivalent' or 'not_equivalent'. "
Original acc: 24.10%, attacked acc: 2.00%, dropped acc: 22.10%

Original prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"can these two symbols be considered equal in meaning? object with 'equivalent' or 'not_equivalent'. "
Original acc: 23.90%, attacked acc: 0.80%, dropped acc: 23.10%

Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Attacked prompt: b"the an is for of comparison analyze, assess the questions and determine if their content is the same, responding with 'equivalent' for corresponding words or 'not_equivalent' for different questions. "
Original acc: 20.90%, attacked acc: 1.20%, dropped acc: 19.70%

Original prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Attacked prompt: b"into the role of a subject similarity analyzer, review the meaning of the drawn lines and consider if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different topics. "
Original acc: 16.80%, attacked acc: 1.00%, dropped acc: 15.80%

Original prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Attacked prompt: b"Acting as a general equivalence instrument, consider if the provided topics are equivalent in meaning, answering with 'equivalent' for familiar stimuli or 'not_equivalent' for dissimilar ones. "
Original acc: 12.30%, attacked acc: 0.40%, dropped acc: 11.90%

## checklist

Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent' 6a4Yn3RGVc. "
Original acc: 30.00%, attacked acc: 9.20%, dropped acc: 20.80%

Original prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent' 6a4Yn3RGVc. "
Original acc: 24.20%, attacked acc: 4.90%, dropped acc: 19.30%

Original prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent' 6a4Yn3RGVc. "
Original acc: 23.70%, attacked acc: 2.10%, dropped acc: 21.60%

Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Attacked prompt: b"As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions 5XeflW1ZJc. "
Original acc: 20.90%, attacked acc: 1.20%, dropped acc: 19.70%

Original prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Attacked prompt: b"In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions w52rwgo0Av. "
Original acc: 17.00%, attacked acc: 10.10%, dropped acc: 6.90%

Original prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Attacked prompt: b"Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones 9vGXgnbWB8. "
Original acc: 12.30%, attacked acc: 5.30%, dropped acc: 7.00%

## deepwordbug

Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Please indicatJ if the folllwing paWr of setnences share the same maning by respondiWng with 'equivalent' or 'not_equivalent'. "
Original acc: 30.00%, attacked acc: 2.20%, dropped acc: 27.80%

Original prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"tAnalyze if the giCen est of sentUnces have the same connotatioq by answering with 'equivalent' or 'not_equivalent'. "
Original acc: 24.10%, attacked acc: 2.00%, dropped acc: 22.10%

Original prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Cna these wo Ktatements be considered equad in meaning? UAnswer with 'equivalent' or 'not_equivalent'. "
Original acc: 23.90%, attacked acc: 9.20%, dropped acc: 14.70%

Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Attacked prompt: b"s an istrument for question comparison evaluation, cosnider the questions and determine if their Teaning is the same, rysponding with 'equivalent' for similr euestions or 'not_equivalent' for different questidons. "
Original acc: 21.10%, attacked acc: 2.60%, dropped acc: 18.50%

Original prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Attacked prompt: b"IZn the role of a questoion similarity naalyzer, examin the meJaning of the provided questions and Uecide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for dSfferent questioXns. "
Original acc: 16.90%, attacked acc: 0.70%, dropped acc: 16.20%

Original prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Attacked prompt: b"Acting as a question equivalence instrument, determine if the Oprovided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones. "
Original acc: 12.50%, attacked acc: 7.40%, dropped acc: 5.10%

## stresstest

Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 30.00%, attacked acc: 30.40%, dropped acc: -0.40%

Original prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'  and false is not true . "
Original acc: 24.10%, attacked acc: 30.10%, dropped acc: -6.00%

Original prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 23.90%, attacked acc: 27.70%, dropped acc: -3.80%

Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Attacked prompt: b"As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 21.10%, attacked acc: 21.50%, dropped acc: -0.40%

Original prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Attacked prompt: b"In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 16.90%, attacked acc: 22.40%, dropped acc: -5.50%

Original prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Attacked prompt: b"Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 12.50%, attacked acc: 22.20%, dropped acc: -9.70%
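
In the records above, `dropped acc` appears to be the original accuracy minus the attacked accuracy, so a negative value means the perturbed prompt scored higher. A minimal sketch for parsing these lines and checking that relationship follows; the names `ACC_LINE`, `parse_acc_line`, `drop_is_consistent`, and the tolerance `tol` are hypothetical helpers, not part of the tool that produced this log.

```python
import re

# Matches the three-number summary line used throughout this log.
ACC_LINE = re.compile(
    r"Original acc: (?P<orig>-?\d+(?:\.\d+)?)%, "
    r"attacked acc: (?P<att>-?\d+(?:\.\d+)?)%, "
    r"dropped acc: (?P<drop>-?\d+(?:\.\d+)?)%"
)


def parse_acc_line(line):
    """Return (original, attacked, dropped) as floats, or None if no match."""
    m = ACC_LINE.search(line)
    if m is None:
        return None
    return float(m["orig"]), float(m["att"]), float(m["drop"])


def drop_is_consistent(line, tol=0.02):
    """Check dropped == original - attacked, within a rounding tolerance.

    The tolerance is an assumption: the log prints two decimals, so the
    reported drop can differ from the difference of the rounded values.
    """
    parsed = parse_acc_line(line)
    if parsed is None:
        raise ValueError("not an accuracy summary line")
    orig, att, drop = parsed
    return abs((orig - att) - drop) <= tol
```

A line that fails `drop_is_consistent` is worth inspecting by hand before it is aggregated.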

## textbugger

Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Please indicate if the later pair of penalties share the same meaning by responding with 'equivalent' or 'not_equivalent'. "
Original acc: 30.00%, attacked acc: 2.00%, dropped acc: 28.00%

Original prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Analyze if the given setting of sente\xd5\xb8ces have the same connotation by replying with 'equivalent' or 'not_equivalent'. "
Original acc: 24.10%, attacked acc: 1.50%, dropped acc: 22.60%

Original prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Possible these two statemen\xf0\x9d\x9a\x9ds be considered equal in meaing? Answer with 'equivalent' or 'not_equivalent'. "
Original acc: 23.90%, attacked acc: 0.20%, dropped acc: 23.70%

Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Attacked prompt: b"Because an instrument for problem comparison evaluation, examining the subjects and defining if their \xef\xbd\x8deaning is the same, \xe2\xb2\x85esponding with 'equivalent' for similar questions or 'not_equivalent' for different questions. "
Original acc: 21.10%, attacked acc: 0.40%, dropped acc: 20.70%

Original prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Attacked prompt: b"I\xd5\xb8 the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different ques\xf0\x9d\x9a\x9dions. "
Original acc: 16.90%, attacked acc: 0.00%, dropped acc: 16.90%

Original prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Attacked prompt: b"Caretaker as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for \xd1\x95imilar ques\xf0\x9d\x9a\x9dions or 'not_equivalent' for dissimilar ones. "
Original acc: 12.50%, attacked acc: 0.00%, dropped acc: 12.50%
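
The attacked prompts are logged as Python bytes literals, so substituted characters show up as escape sequences such as `\xd5\xb8`. A small sketch, assuming the literals are UTF-8 encoded, decodes a logged line back to text and lists its non-ASCII characters; `decode_attacked_prompt` and `non_ascii_chars` are hypothetical helper names.

```python
import ast
import unicodedata

PREFIX = "Attacked prompt: "


def decode_attacked_prompt(line):
    """Turn a logged 'Attacked prompt: b"..."' line into a str.

    Assumes the bytes literal is valid UTF-8.
    """
    if not line.startswith(PREFIX):
        raise ValueError("not an attacked-prompt line")
    raw = ast.literal_eval(line[len(PREFIX):].strip())
    if not isinstance(raw, bytes):
        raise ValueError("expected a bytes literal")
    return raw.decode("utf-8")


def non_ascii_chars(text):
    """Return (character, code point, Unicode name) for each non-ASCII char."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]
```

Applied to the textbugger records above, this makes the look-alike substitutions visible as named code points rather than raw byte escapes.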

## textfooler

Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Please specify if the henceforth matched of verdicts share the same significance by responding with 'equivalent' or 'not_equivalent'. "
Original acc: 30.00%, attacked acc: 0.70%, dropped acc: 29.30%

Original prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Analyzes if the submitted established of chastisement possess the same subtext by responding with 'equivalent' or 'not_equivalent'. "
Original acc: 24.10%, attacked acc: 2.20%, dropped acc: 21.90%

Original prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'.
Attacked prompt: b"Kan these two languages happen revisited equitable in signify? Meet with 'equivalent' or 'not_equivalent'. "
Original acc: 23.90%, attacked acc: 0.00%, dropped acc: 23.90%

Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions.
Attacked prompt: b"Iike an instrument for question comparison valuation, exams the questions and detects if their meaning is the same, reacting with 'equivalent' for similar questions or 'not_equivalent' for different themes. "
Original acc: 21.10%, attacked acc: 1.80%, dropped acc: 19.30%

Original prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions.
Attacked prompt: b"Onto the role of a question similarity analyzer, consider the significance of the envisaged matters and identify if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for many matters. "
Original acc: 16.90%, attacked acc: 0.10%, dropped acc: 16.80%

Original prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones.
Attacked prompt: b"Acting as a question equivalence utilities, assess if the presented queries are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for different ones. "
Original acc: 12.50%, attacked acc: 1.40%, dropped acc: 11.10%

## translation

Language: chinese, acc: 2.90%, prompt: As a tool for determining the equivalence of problems, the problems are examined and their similarity is classified as 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 9.40%, prompt: Please compare the similarity of two sentences and put them into the category of 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 4.50%, prompt: Two sentences are classified as 'equivalent' or 'not_equivalent' for their similarity.
Language: chinese, acc: 1.70%, prompt: You can determine how similar the questions are by comparing them and categorizing them as 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 6.80%, prompt: Using the method of contrast, the similarity of these problems is divided into two categories: 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 3.60%, prompt: By comparing these issues, you can classify them as 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 10.20%, prompt: To determine whether the questions are similar, put them into the category of 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 17.80%, prompt: Divide the similarity of these questions into 'equivalent' or 'not_equivalent' categories.
Language: chinese, acc: 2.10%, prompt: Using the similarity assessment tool, these questions were classified as 'equivalent' or 'not_equivalent'.
Language: chinese, acc: 1.50%, prompt: By analyzing the similarity of these problems, they are divided into categories of 'equivalent' or 'not_equivalent'.
Language: french, acc: 3.60%, prompt: As a tool to determine the equivalence of questions, review the questions and rank their similarity as 'equivalent' or 'not_equivalent'.
Language: french, acc: 15.20%, prompt: Please compare the similarity of two sentences and classify them as 'equivalent' or 'not_equivalent'.
Language: french, acc: 7.00%, prompt: Based on the similarity of two sentences, classify them as 'equivalent' or 'not_equivalent'.
Language: french, acc: 2.10%, prompt: You can determine the similarity between these questions by comparing them and classifying them as 'equivalent' or 'not_equivalent'.
Language: french, acc: 16.10%, prompt: Use a comparative method to divide the similarity of these questions into two categories: 'equivalent' or 'not_equivalent'.
Language: french, acc: 3.80%, prompt: By comparing these questions, you can classify them as 'equivalent' or 'not_equivalent'.
Language: french, acc: 13.40%, prompt: Determine whether these questions are similar or not, and then classify them as 'equivalent' or 'not_equivalent'.
Language: french, acc: 20.80%, prompt: Divide the similarity of these questions into two categories: 'equivalent' or 'not_equivalent'.
Language: french, acc: 6.80%, prompt: Use a similarity assessment tool to classify these questions as 'equivalent' or 'not_equivalent'.
Language: french, acc: 5.60%, prompt: By analyzing the similarity of these questions, you can divide them into two categories: 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 10.10%, prompt: As a tool for determining an equation of questions, review the questions and classify their similarity as either 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 6.60%, prompt: When using questions in the classification domain, please classify the similarity between the questions as 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 4.60%, prompt: To determine an equation of questions, you must review the questions and classify their similarity as 'equivalent' or 'not_equivalent'.
Language: arabic, acc: 6.50%, prompt: Questions can be classified as 'equivalent' or 'not_equivalent' when used to identify classifications.
Language: arabic, acc: 4.20%, prompt: Classification of question similarity as 'equivalent' or 'not_equivalent' is used as a tool to determine the classification of questions.
Language: arabic, acc: 18.10%, prompt: Classify the similarity of the questions as 'equivalent' or 'not_equivalent' to determine the equation of the questions.
Language: arabic, acc: 7.20%, prompt: Identifying the similarity of questions and classifying them as 'equivalent' or 'not_equivalent' is an important tool in determining the classification of questions.
Language: arabic, acc: 3.60%, prompt: When classifying questions, their similarity can be classified as 'equivalent' or 'not_equivalent' to determine the correct classification.
Language: arabic, acc: 4.10%, prompt: The similarity of questions should be classified as 'equivalent' or 'not_equivalent' when used to determine the equation of questions.
Language: arabic, acc: 3.90%, prompt: Identifying the similarity of questions and classifying them as 'equivalent' or 'not_equivalent' helps to correctly classify questions.
Language: spanish, acc: 2.10%, prompt: As a tool to determine the equivalence of questions, it reviews the questions and classifies their similarity as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 7.70%, prompt: Evaluate the similarity between questions and classify them as 'equivalent' or 'not_equivalent' to determine their equivalence.
Language: spanish, acc: 19.00%, prompt: Determine whether two questions are 'equivalent' or 'not_equivalent' based on similarity and characteristics.
Language: spanish, acc: 2.30%, prompt: Classifies the similarity between questions as 'equivalent' or 'not_equivalent' to determine their equivalence.
Language: spanish, acc: 6.10%, prompt: Review the questions and rate them as 'equivalent' or 'not_equivalent' based on their similarity and content.
Language: spanish, acc: 1.50%, prompt: As part of the classification task of questions, it determines their equivalence by categorizing their similarity as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 8.20%, prompt: Analyze the similarity between questions and classify them as 'equivalent' or 'not_equivalent' to determine their equivalence.
Language: spanish, acc: 2.50%, prompt: As a method of identifying the equivalence of questions, it categorizes their similarity as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 1.30%, prompt: To determine the equivalence between questions, check their similarity and classify them as 'equivalent' or 'not_equivalent'.
Language: spanish, acc: 18.10%, prompt: Classify the similarity between questions as 'equivalent' or 'not_equivalent' to determine whether they are equivalent or not.
Language: japanese, acc: 6.70%, prompt: As a tool to determine the equivalence of the question, review the question and categorize its similarities into 'equivalent' or 'not_equivalent' categories.
Language: japanese, acc: 0.50%, prompt: Work on text sorting tasks labeled 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 4.70%, prompt: For text classification tasks, use the labels 'equivalent' or 'not_equivalent' to determine the equivalence of statements.
Language: japanese, acc: 6.30%, prompt: In the MRPC dataset, use the labels 'equivalent' or 'not_equivalent' to classify the equivalence of statements.
Language: japanese, acc: 6.00%, prompt: As a tool for determining equivalence, check sentences and categorize them into 'equivalent' or 'not_equivalent' categories.
Language: japanese, acc: 11.30%, prompt: Use the labels 'equivalent' or 'not_equivalent' to determine the equivalence of statements in text classification tasks.
Language: japanese, acc: 8.30%, prompt: In the text classification task of the MRPC data set, classify the equivalence of statements with labels of 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 5.10%, prompt: As a tool to determine the equivalence of statements, categorize statements into 'equivalent' or 'not_equivalent' categories.
Language: japanese, acc: 10.40%, prompt: In a text classification task, classify the equivalence of statements using labels of 'equivalent' or 'not_equivalent'.
Language: japanese, acc: 6.60%, prompt: Do a text classification task to determine the equivalence of statements, labeled 'equivalent' or 'not_equivalent'.
Language: korean, acc: 11.20%, prompt: Classify two given sentences as 'equivalent' or 'not_equivalent' by discriminating whether they have the same meaning.
Language: korean, acc: 8.80%, prompt: Determine sentence equivalence by judging the similarity of two sentences with 'equivalent' or 'not_equivalent'.
Language: korean, acc: 4.30%, prompt: Classify the similarity of sentences as 'equivalent' or 'not_equivalent' by judging whether two sentences have the same meaning.
Language: korean, acc: 12.70%, prompt: Determine if two given sentences are equivalent to each other, and classify their similarity as 'equivalent' or 'not_equivalent'.
Language: korean, acc: 13.20%, prompt: Compare two given sentences to determine sentence equivalence, and classify their similarities as 'equivalent' or 'not_equivalent'.
Language: korean, acc: 8.60%, prompt: Classify sentence equivalence as 'equivalent' or 'not_equivalent' by judging whether two sentences have the same meaning to each other.
Language: korean, acc: 12.00%, prompt: Determine if two sentences have the same meaning, and classify their similarities as 'equivalent' or 'not_equivalent'.
Language: korean, acc: 12.00%, prompt: Compare two given sentences to determine their equivalence, and classify their similarities as 'equivalent' or 'not_equivalent'.
Language: korean, acc: 9.40%, prompt: Review two sentences to evaluate sentence equivalence, and classify their similarities as 'equivalent' or 'not_equivalent'.
Language: korean, acc: 9.90%, prompt: Judge whether two sentences have the same meaning to each other, and determine the sentence equivalence with 'equivalent' or 'not_equivalent'.
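
The translation records above share one line shape (`Language: <name>, acc: <value>%, prompt: <text>`), which makes a per-language summary straightforward. The following is a rough sketch only; `LANG_LINE` and `mean_acc_by_language` are hypothetical names, and averaging is just one possible way to summarize these rows.

```python
import re
from collections import defaultdict

# Matches one translated-prompt record; the prompt text itself is ignored.
LANG_LINE = re.compile(
    r"^Language: (?P<lang>\w+), acc: (?P<acc>\d+(?:\.\d+)?)%, prompt: "
)


def mean_acc_by_language(lines):
    """Average the logged accuracy (in percent) for each language."""
    buckets = defaultdict(list)
    for line in lines:
        m = LANG_LINE.match(line)
        if m:  # headings, blank lines, and other records are skipped
            buckets[m["lang"]].append(float(m["acc"]))
    return {lang: sum(vals) / len(vals) for lang, vals in buckets.items()}
```

Passing the lines of one `## translation` section yields a dictionary keyed by language name.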

# rte

## 10 prompts

Acc: 47.65%, prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Acc: 45.13%, prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Acc: 41.16%, prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Acc: 40.43%, prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Acc: 37.91%, prompt: Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'.
Acc: 32.85%, prompt: Review the two statements and categorize their relationship as either 'entailment' or 'not_entailment'.
Acc: 28.16%, prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'.
Acc: 27.80%, prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'.
Acc: 18.05%, prompt: Analyze the two provided sentences and decide if their relationship is 'entailment' or 'not_entailment'.
Acc: 10.47%, prompt: Considering the two sentences, identify if their relationship is 'entailment' or 'not_entailment'.
Acc: 5.42%, prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word.

Acc: 51.26%, prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Acc: 49.82%, prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Acc: 48.01%, prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Acc: 44.04%, prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Acc: 44.04%, prompt: In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment' or 'not_entailment':
Acc: 43.32%, prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Acc: 42.96%, prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Acc: 27.08%, prompt: As a tool for determining entailment relationships, review the two statements and categorize their connection as either 'entailment' or 'not_entailment':
Acc: 24.91%, prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment':
Acc: 23.10%, prompt: While performing entailment analysis, classify the relationship between the provided sentences as 'entailment' or 'not_entailment':

## bertattack

Original prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Attacked prompt: b"Please approach the relationship between the included sentences as 'entailment' or 'not_entailment'."
Original acc: 47.65%, attacked acc: 0.00%, dropped acc: 47.65%

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"i the relationship between the given phrases represent entailment or not_entailment? answer with 'entailment' or 'not_entailment'."
Original acc: 45.13%, attacked acc: 3.25%, dropped acc: 41.88%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"be if the provided pair of sentences displays entailment or not_entailment. approach with 'entailment' or 'not_entailment'."
Original acc: 41.16%, attacked acc: 1.81%, dropped acc: 39.35%

Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Attacked prompt: b"you the role of a semantic relationship analyzeto, study the connection between the and statements and feel if they exhibit entailment or not_entailment. approach with 'entailment' or 'not_entailment':"
Original acc: 51.26%, attacked acc: 3.25%, dropped acc: 48.01%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Attacked prompt: b"j as an entailment classifier, remember do the given pair of sentence displays entailment or not_entailment. responded with 'entailment' or 'not_entailment':"
Original acc: 49.82%, attacked acc: 2.17%, dropped acc: 47.65%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"she an entailment identification system, presented the connection between the following sentences and answered with 'entailment' or 'not_entailment':"
Original acc: 48.01%, attacked acc: 1.44%, dropped acc: 46.57%

## checklist

Original prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Attacked prompt: b"Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment' vTAjHynoIG."
Original acc: 47.65%, attacked acc: 23.47%, dropped acc: 24.19%

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment' 9vGXgnbWB8."
Original acc: 44.77%, attacked acc: 31.05%, dropped acc: 13.72%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment' vTAjHynoIG."
Original acc: 41.16%, attacked acc: 1.81%, dropped acc: 39.35%

Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Attacked prompt: b"In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment' mzjVogkRhM:"
Original acc: 51.26%, attacked acc: 20.94%, dropped acc: 30.32%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment' vTAjHynoIG:"
Original acc: 49.82%, attacked acc: 13.36%, dropped acc: 36.46%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment' RcRneWo6Iv:"
Original acc: 48.38%, attacked acc: 25.27%, dropped acc: 23.10%

## deepwordbug

Original prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Attacked prompt: b"Please classiyf the relationship between the provided sentencCs as 'entailment' or 'not_entailment'."
Original acc: 47.65%, attacked acc: 13.72%, dropped acc: 33.94%

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"DoeGs the relationship between the give sentence represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'."
Original acc: 45.13%, attacked acc: 5.42%, dropped acc: 39.71%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"DetermiCne if the givePn pir of sentcnces disWlays entailment or not_entailment. Rtspond with 'entailment' or 'not_entailment'."
Original acc: 41.16%, attacked acc: 1.08%, dropped acc: 40.07%

Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Attacked prompt: b"nI the rolQ of a semRantic relatiVonship analyzer, examine the coHnection between the igven sentences and deciEde if they exhibit entailment or not_entailment. AnswVr with 'entailment' or 'not_entailment':"
Original acc: 51.26%, attacked acc: 3.61%, dropped acc: 47.65%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Attacked prompt: b"orking as an entailment clasifier, idenStify whethre the givec ptair of sentInces displays entailment or not_entailment. ReKspond with 'entailment' or 'not_entailment':"
Original acc: 49.82%, attacked acc: 1.08%, dropped acc: 48.74%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"sA an entailment identification ysstem, examien the lconnection between the follUowing sentences and frespond with 'entailment' or 'not_entailment':"
Original acc: 48.01%, attacked acc: 8.30%, dropped acc: 39.71%

## stresstest

Original prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Attacked prompt: b"Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true ."
Original acc: 47.65%, attacked acc: 23.47%, dropped acc: 24.19%

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'  and false is not true ."
Original acc: 45.13%, attacked acc: 36.46%, dropped acc: 8.66%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'  and false is not true ."
Original acc: 41.16%, attacked acc: 31.77%, dropped acc: 9.39%

Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Attacked prompt: b"In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment'  and false is not true :"
Original acc: 51.26%, attacked acc: 41.88%, dropped acc: 9.39%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 49.82%, attacked acc: 47.29%, dropped acc: 2.53%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 48.01%, attacked acc: 40.07%, dropped acc: 7.94%

## textbugger

Original prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Attacked prompt: b"P\xe2\x85\xbcease \xcf\xb2lassify the relationship between the provided sentences as 'entailment' or 'not_entailment'."
Original acc: 47.65%, attacked acc: 6.14%, dropped acc: 41.52%

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Ai the relationship between the given sentence represent entailment or not_entailment? Answer with 'entailment' or 'not_entailment'."
Original acc: 45.13%, attacked acc: 1.08%, dropped acc: 44.04%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determines if the awarded \xd1\x80air of sentence displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'."
Original acc: 41.16%, attacked acc: 3.25%, dropped acc: 37.91%

Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Attacked prompt: b"Onto the role of a semantic relationship offing, e\xc3\x97amine the connection between the given se\xd5\xb8tences and dec\xd1\x96de if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':"
Original acc: 51.26%, attacked acc: 0.36%, dropped acc: 50.90%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Wo\xe2\xb2\x85king as an entailment classifier, detect whether the conferred pair of \xd1\x95entences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':"
Original acc: 49.82%, attacked acc: 3.97%, dropped acc: 45.85%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Oj an entailment identification system, examine the connection between the following sentences and resp\xd0\xbend with 'entailment' or 'not_entailment':"
Original acc: 48.01%, attacked acc: 15.88%, dropped acc: 32.13%
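
The attacked prompts in this log are stored as Python byte-string literals, so substituted look-alike characters appear as `\x..` escapes. The sketch below is one possible way to decode such an entry and list its non-ASCII characters. It assumes the bytes are UTF-8; the helper names `decode_logged_prompt` and `non_ascii_chars` are hypothetical and not part of any tool that produced this log.

```python
import ast
import unicodedata

# Attacked prompt copied verbatim from the textbugger entry directly above.
LOGGED = r'''b"Oj an entailment identification system, examine the connection between the following sentences and resp\xd0\xbend with 'entailment' or 'not_entailment':"'''


def decode_logged_prompt(literal: str) -> str:
    """Turn a logged b"..." literal into text (assumes UTF-8 encoding)."""
    raw = ast.literal_eval(literal)  # safely evaluates the bytes literal
    return raw.decode("utf-8")


def non_ascii_chars(text: str):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


prompt = decode_logged_prompt(LOGGED)
for index, ch, name in non_ascii_chars(prompt):
    print(index, repr(ch), name)
```

For this entry the only non-ASCII character is a Cyrillic "о" standing in for the Latin "o" in "respond".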

## textfooler

Original prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Attacked prompt: b"Invite classed the coitus between the foreseen priors as 'entailment' or 'not_entailment'."
Original acc: 47.65%, attacked acc: 0.72%, dropped acc: 46.93%

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Ca the relationship between the given sentences represent entailment or not_entailment? Answer with 'entailment' or 'not_entailment'."
Original acc: 45.13%, attacked acc: 4.69%, dropped acc: 40.43%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Establishes if the gave pair of phrases reflecting entailment or not_entailment. Reply with 'entailment' or 'not_entailment'."
Original acc: 41.16%, attacked acc: 5.78%, dropped acc: 35.38%

Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Attacked prompt: b"Onto the feature of a semantic link profiler, scrutinize the liaison between the offered chastisement and determining if they demonstrate entailment or not_entailment. Answer with 'entailment' or 'not_entailment':"
Original acc: 51.26%, attacked acc: 2.89%, dropped acc: 48.38%

Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Employment as an entailment classifier, ascertain whether the submitted pair of sentencing exhibits entailment or not_entailment. Respond with 'entailment' or 'not_entailment':"
Original acc: 49.82%, attacked acc: 2.17%, dropped acc: 47.65%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Because an entailment characterization regimes, considering the login between the following sentence and meet with 'entailment' or 'not_entailment':"
Original acc: 48.01%, attacked acc: 0.00%, dropped acc: 48.01%

## translation

Language: chinese, acc: 43.32%, prompt: In the light of an implication analysis tool, evaluate the relationship between the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 41.52%, prompt: From the perspective of an implication analysis tool, determine whether there is an implication relationship in the following sentences by classifying them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 36.10%, prompt: Please use an implication analysis tool to determine whether an implication relationship exists in the following sentences by classifying them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 28.88%, prompt: Please evaluate the relation of the following sentences as 'entailment' or 'not_entailment' for the purpose of determining implication relation.
Language: chinese, acc: 40.43%, prompt: Please use the implication analysis tool to evaluate the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 32.85%, prompt: For the purpose of determining implicative relations, analyze the relations of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 42.60%, prompt: Please use the implication analysis tool to determine the relationship of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 28.88%, prompt: Please use the implication judgment tool to assess the relevance of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 20.22%, prompt: Please, with implication analysis as the main task, determine the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 35.38%, prompt: Using the implication judgment as a criterion, analyze the relation of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: french, acc: 40.07%, prompt: As an engagement analysis tool, evaluate the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'.
Language: french, acc: 31.77%, prompt: Determine whether the given sentences involve one another or not as an implication analysis tool. Classify them accordingly as 'entailment' or 'not_entailment'.
Language: french, acc: 38.99%, prompt: Using implication analysis, evaluate whether the sentences provided have a logical relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 35.74%, prompt: As an engagement assessment tool, determine whether the sentences provided have a logical relationship and classify them as 'entailment' or 'not_entailment'.
Language: french, acc: 26.35%, prompt: As an implication classification tool, analyze the sentences provided to determine if there is a logical relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 25.63%, prompt: Using implication analysis, determine whether the given sentences have a cause-effect relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 34.66%, prompt: Evaluate the relationship between the given sentences using implication analysis and rank them accordingly as 'entailment' or 'not_entailment'.
Language: french, acc: 24.55%, prompt: As an engagement detection tool, determine whether the given sentences have a logical relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 13.36%, prompt: Using implication analysis, evaluate whether the sentences provided have a cause-effect relationship and rank them accordingly as 'entailment' or 'not_entailment'.
Language: french, acc: 14.80%, prompt: Determine whether the given sentences have a cause-effect relationship as an engagement analysis tool and categorize them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 36.82%, prompt: In your role as a tool for reasoning analysis, evaluate the relationship between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 45.85%, prompt: Can you determine whether this sentence is inferred from the other sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 32.49%, prompt: Using the tool of reasoning analysis, analyze the relationship between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 36.10%, prompt: Does this sentence represent a conclusion from the previous sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 32.85%, prompt: As a tool of reasoning analysis, evaluate the relationship of given sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 36.46%, prompt: Can this sentence be inferred from the previous sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 32.49%, prompt: Using a tool to analyze a conclusion, analyze the relationship between the two sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 22.74%, prompt: Is this a conclusion from the next sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 23.83%, prompt: As part of your task in analyzing a conclusion, evaluate the relationship between the two sentences and classify them as 'entailment' or 'not_entailment' based on their relationship.
Language: arabic, acc: 24.55%, prompt: Are you following this sentence directly from the previous one? Classify it as 'entailment' or 'not_entailment'.
Language: spanish, acc: 40.79%, prompt: In your role as an implication analysis tool, evaluate the relationship between the given phrases and classify them as 'entailment' or 'not_entailment'.
Language: spanish, acc: 32.13%, prompt: Determine whether the second sentence necessarily implies the first and label the relation as 'entailment', or as 'not_entailment' if not.
Language: spanish, acc: 16.61%, prompt: Classifies the relationship between these two sentences as 'entailment' if one necessarily implies the other, or as 'not_entailment' if not.
Language: spanish, acc: 27.08%, prompt: Evaluates whether the information in the second sentence is implied in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 37.55%, prompt: Given a couple of phrases, label their relationship as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 40.79%, prompt: Analyzes the relationship between the phrases and classifies them as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 46.93%, prompt: Given two sentences, determine whether the second sentence is a necessary consequence of the first and label the relation as 'entailment', or as 'not_entailment' if not.
Language: spanish, acc: 24.55%, prompt: Evaluates whether the information presented in the second sentence is implicit in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 20.58%, prompt: Classifies the relationship between the given phrases as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 26.35%, prompt: Determines whether the information provided in the second sentence is necessarily inferable from the first and labels the relationship as 'entailment', or as 'not_entailment' if not.
Language: japanese, acc: 27.80%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 27.80%, prompt: Evaluate the semantic relationship of the sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 40.79%, prompt: Please judge the relationship between the given sentences and classify them as 'entailment' or 'not_entailment'.
Language: japanese, acc: 24.91%, prompt: Examine the paraphrases of a given sentence and classify them 'entailment' or 'not_entailment'.
Language: japanese, acc: 18.05%, prompt: Rate the similarity of a given sentence and categorize it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 18.41%, prompt: Determinate the semantic connections of a given sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 22.02%, prompt: Examine the semantic match of a given sentence and categorize it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 5.42%, prompt: Classify it as 'entailment' or 'not_entailment' based on the content of the sentence.
Language: japanese, acc: 27.80%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 24.19%, prompt: Judge the semantic connections of a given sentence and categorize it as 'entailment' or 'not_entailment'.
Language: korean, acc: 37.91%, prompt: Evaluate the relationship between any two sentences given to you and classify you as 'entailment' or 'not_entailment'.
Language: korean, acc: 27.44%, prompt: Analyze the semantic deductive relations between sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 28.16%, prompt: Evaluate the logical relevance between sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 35.02%, prompt: Evaluate the interaction of two given sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 15.88%, prompt: Please check whether there is a semantic match between those two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 31.77%, prompt: Compare information between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 15.88%, prompt: Please analyse the correlation between those two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 20.22%, prompt: Evaluate the different meanings between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 20.58%, prompt: Compare the semantic structure of the two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 29.24%, prompt: Evaluate the interactions between sentences and classify them as 'entailment' or 'not_entailment'.
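
The translation rows above share one layout (`Language: ..., acc: ...%, prompt: ...`), so they can be summarised per language. The following is a minimal sketch that averages accuracy by language; the regular expression and the function name `mean_acc_by_language` are assumptions made for illustration, and `SAMPLE` holds only three rows copied from the list above rather than the full set.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed row layout, inferred from the entries in this log.
ROW = re.compile(
    r"^Language: (?P<lang>\w+), acc: (?P<acc>\d+(?:\.\d+)?)%, prompt: (?P<prompt>.*)$"
)

# Three rows copied from the translation list above (not the complete data).
SAMPLE = """\
Language: japanese, acc: 5.42%, prompt: Classify it as 'entailment' or 'not_entailment' based on the content of the sentence.
Language: korean, acc: 20.58%, prompt: Compare the semantic structure of the two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 29.24%, prompt: Evaluate the interactions between sentences and classify them as 'entailment' or 'not_entailment'.
"""


def mean_acc_by_language(text: str) -> dict:
    """Group accuracies by language and return the mean for each language."""
    by_lang = defaultdict(list)
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # skip headings, blank lines and other row types
            by_lang[match["lang"]].append(float(match["acc"]))
    return {lang: mean(values) for lang, values in by_lang.items()}


for lang, acc in mean_acc_by_language(SAMPLE).items():
    print(f"{lang}: {acc:.2f}%")
```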

# sst2

## 10 prompts

Acc: 53.90%, prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Acc: 48.17%, prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Acc: 48.05%, prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Acc: 43.92%, prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
Acc: 42.43%, prompt: Considering the given phrase, would you say it carries a 'positive' or 'negative' connotation?
Acc: 33.37%, prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
Acc: 32.34%, prompt: Assess the mood of the following quote and determine if it's 'positive' or 'negative':
Acc: 23.85%, prompt: Given the context of this text, indicate if the emotion conveyed is 'positive' or 'negative':
Acc: 17.89%, prompt: Evaluate the sentiment of the given text and classify it as 'positive' or 'negative':
Acc: 13.42%, prompt: Determine the overall sentiment of this sentence, categorizing it as 'positive' or 'negative':

Acc: 42.55%, prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Acc: 40.94%, prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Acc: 37.27%, prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Acc: 36.35%, prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
Acc: 29.24%, prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
Acc: 25.00%, prompt: Serving as a sentiment evaluation model, determine if the given statement is 'positive' or 'negative'. Classify:
Acc: 18.92%, prompt: Acting as a sentiment evaluator, identify if the given sentence is 'positive' or 'negative'. Classify:
Acc: 12.73%, prompt: In the capacity of a sentiment classifier, decide whether the given quote is 'positive' or 'negative'. Classify:
Acc: 3.10%, prompt: Emulating a sentiment classification system, indicate whether the provided text is 'positive' or 'negative'. Classify:
Acc: 2.98%, prompt: Taking on the role of an emotion classifier, specify if the provided phrase is 'positive' or 'negative'. Classify:

## bertattack

Original prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Attacked prompt: b"need refer the affect tone of this passage: 'positive' or 'negative'? "
Original acc: 53.90%, attacked acc: 3.21%, dropped acc: 50.69%

Original prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Attacked prompt: b"After examination the given expression, type its emotion as of 'positive' or 'negative': "
Original acc: 48.17%, attacked acc: 0.57%, dropped acc: 47.59%

Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Attacked prompt: b"we the tone of this statement and comment with either 'positive' or 'negative': "
Original acc: 48.05%, attacked acc: 4.47%, dropped acc: 43.58%

Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Attacked prompt: b"Working as a sentiment analyzer, ranks indicates if the below text is 'positive' or 'negative'. ranking: "
Original acc: 42.55%, attacked acc: 3.10%, dropped acc: 39.45%

Original prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Attacked prompt: b"Functioning as a flag identification tool, measure if the following expression is 'positive' or 'negative'. classifications: "
Original acc: 40.94%, attacked acc: 7.00%, dropped acc: 33.94%

Original prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Attacked prompt: b"As an emotion indicating, score if the provided reply conveys a 'positive' or 'negative' sentiment. classify: "
Original acc: 37.27%, attacked acc: 0.46%, dropped acc: 36.81%

## checklist

Original prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Attacked prompt: b"Please identify the emotional tone of this passage: 'positive' or 'negative' 9vGXgnbWB8? "
Original acc: 53.90%, attacked acc: 0.00%, dropped acc: 53.90%

Original prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Attacked prompt: b"After examining the following expression, label its emotion as either 'positive' or 'negative' PDCGfz6DL9: "
Original acc: 48.17%, attacked acc: 30.96%, dropped acc: 17.20%

Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Attacked prompt: b"Analyze the tone of this statement and respond with either 'positive' or 'negative' 9vGXgnbWB8: "
Original acc: 48.05%, attacked acc: 0.69%, dropped acc: 47.36%

Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Attacked prompt: b"Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify zq0DcZ5dnI: "
Original acc: 42.55%, attacked acc: 4.70%, dropped acc: 37.84%

Original prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Attacked prompt: b"Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify EAB4KP2NVY: "
Original acc: 40.94%, attacked acc: 21.22%, dropped acc: 19.72%

Original prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Attacked prompt: b"As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify d6ZQ3u0GBQ: "
Original acc: 37.27%, attacked acc: 7.91%, dropped acc: 29.36%

## deepwordbug

Original prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Attacked prompt: b"Plase Sidentify the emotiLnal tone of this pasage: 'positive' or 'negative'? "
Original acc: 54.01%, attacked acc: 3.10%, dropped acc: 50.92%

Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Attacked prompt: b"Analyze the otne of this stateXent and respond with eitJer 'positive' or 'negative': "
Original acc: 48.05%, attacked acc: 8.60%, dropped acc: 39.45%

Original prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Attacked prompt: b"kfter eximining the followinM expression, labe its emolion as egther 'positive' or 'negative': "
Original acc: 48.05%, attacked acc: 0.80%, dropped acc: 47.25%

Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Attacked prompt: b"Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. CQlassify: "
Original acc: 42.66%, attacked acc: 39.11%, dropped acc: 3.56%

Original prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Attacked prompt: b"Functionnig as a sentiment identification tool, assess if the flolowing xepression is 'positive' or 'negative'. ClUssify: "
Original acc: 41.06%, attacked acc: 27.18%, dropped acc: 13.88%

Original prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Attacked prompt: b"As an emoHtion Betector, determine if the provided passage convey a 'positive' or 'negative' sentiment. Classify: "
Original acc: 37.50%, attacked acc: 29.13%, dropped acc: 8.37%

## stresstest

Original prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Attacked prompt: b"Please identify the emotional tone of this passage: 'positive' or 'negative'  and true is true  and true is true  and true is true  and true is true  and true is true ? "
Original acc: 53.90%, attacked acc: 17.89%, dropped acc: 36.01%

Original prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Attacked prompt: b"After examining the following expression, label its emotion as either 'positive' or 'negative'  and false is not true : "
Original acc: 48.17%, attacked acc: 34.86%, dropped acc: 13.30%

Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Attacked prompt: b"Analyze the tone of this statement and respond with either 'positive' or 'negative'  and true is true  and true is true  and true is true  and true is true  and true is true : "
Original acc: 48.05%, attacked acc: 43.35%, dropped acc: 4.70%

Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Attacked prompt: b"Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify  and false is not true : "
Original acc: 42.55%, attacked acc: 42.66%, dropped acc: -0.11%

Original prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Attacked prompt: b"Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify  and false is not true : "
Original acc: 40.94%, attacked acc: 36.47%, dropped acc: 4.47%

Original prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Attacked prompt: b"As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify  and true is true  and true is true  and true is true  and true is true  and true is true : "
Original acc: 37.27%, attacked acc: 40.83%, dropped acc: -3.56%
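
In every entry the `dropped acc` figure matches original accuracy minus attacked accuracy, so a negative value (as in the entry directly above) means the attacked prompt scored higher than the original. The sketch below recomputes the drop from a logged line. The function name `check_drop` and the 0.015 tolerance are assumptions: the tolerance allows for each figure having been rounded to two decimals separately, which the log itself does not confirm.

```python
import re

# Assumed layout of the accuracy lines, inferred from this log.
ACC_LINE = re.compile(
    r"Original acc: (?P<orig>-?\d+(?:\.\d+)?)%, "
    r"attacked acc: (?P<att>-?\d+(?:\.\d+)?)%, "
    r"dropped acc: (?P<drop>-?\d+(?:\.\d+)?)%"
)


def check_drop(line: str, tolerance: float = 0.015):
    """Return (recomputed drop, whether it agrees with the logged drop)."""
    match = ACC_LINE.search(line)
    if match is None:
        raise ValueError(f"not an accuracy line: {line!r}")
    orig, att, drop = (float(match.group(k)) for k in ("orig", "att", "drop"))
    recomputed = orig - att  # positive: the attack hurt; negative: it helped
    return recomputed, abs(recomputed - drop) <= tolerance


# Two lines copied from this log: one negative drop, one off by 0.01 due to rounding.
for logged in (
    "Original acc: 37.27%, attacked acc: 40.83%, dropped acc: -3.56%",
    "Original acc: 47.65%, attacked acc: 6.14%, dropped acc: 41.52%",
):
    value, consistent = check_drop(logged)
    print(f"{value:.2f} consistent={consistent}")
```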

## textbugger

Original prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Attacked prompt: b"Please i\xd4\x81entify the emotional tone of this p\xc9\x91ssage: 'positive' or 'negative'? "
Original acc: 54.01%, attacked acc: 11.12%, dropped acc: 42.89%

Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Attacked prompt: b"Analyzes the tone of this pronouncements and answer with either 'positive' or 'negative': "
Original acc: 48.05%, attacked acc: 18.81%, dropped acc: 29.24%

Original prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Attacked prompt: b"After examining the following expression, label its emo\xf0\x9d\x9a\x9dion as neither 'positive' or 'negative': "
Original acc: 48.05%, attacked acc: 2.06%, dropped acc: 45.99%

Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Attacked prompt: b"Working as a sentiment analyzer, invited indicate if the later text is 'positive' or 'negative'. Classi\xf0\x9d\x9a\x8fy: "
Original acc: 42.66%, attacked acc: 7.91%, dropped acc: 34.75%

Original prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Attacked prompt: b"Functioning as a \xd1\x95entiment identification tools, evaluating if the foll\xd0\xbewing expressions is 'positive' or 'negative'. Classi\xf0\x9d\x9a\x8fy: "
Original acc: 41.06%, attacked acc: 3.33%, dropped acc: 37.73%

Original prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Attacked prompt: b"Because an emotion detector, determine if the povided passgae convey a 'positive' or 'negative' ambience. Classi\xf0\x9d\x9a\x8fy: "
Original acc: 37.50%, attacked acc: 1.03%, dropped acc: 36.47%

## textfooler

Original prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
Attacked prompt: b"Invited specify the affective tone of this pass: 'positive' or 'negative'? "
Original acc: 53.90%, attacked acc: 5.85%, dropped acc: 48.05%

Original prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
Attacked prompt: b"After investigate the later demonstrations, label its sentiment as either 'positive' or 'negative': "
Original acc: 48.17%, attacked acc: 11.47%, dropped acc: 36.70%

Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
Attacked prompt: b"Analyzes the ringtone of this declarations and answers with either 'positive' or 'negative': "
Original acc: 48.05%, attacked acc: 7.91%, dropped acc: 40.14%

Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
Attacked prompt: b"Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Ranked: "
Original acc: 42.55%, attacked acc: 3.21%, dropped acc: 39.33%

Original prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
Attacked prompt: b"Operated as a sense id utility, estimate if the subsequently expressions is 'positive' or 'negative'. Classifies: "
Original acc: 40.94%, attacked acc: 4.82%, dropped acc: 36.12%

Original prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
Attacked prompt: b"As an emotion sensor, identifying if the provided cross conveyed a 'positive' or 'negative' sens. Classify: "
Original acc: 37.27%, attacked acc: 12.96%, dropped acc: 24.31%

## translation

Language: chinese, acc: 36.58%, prompt: Answer whether the statement is 'positive' or 'negative' based on sentiment analysis.
Language: chinese, acc: 17.55%, prompt: As an emotion analysis tool, determine whether the emotion in the text is 'positive' or 'negative'.
Language: chinese, acc: 38.76%, prompt: Categorize the statement as 'positive' or 'negative', based on its emotional bent.
Language: chinese, acc: 11.12%, prompt: Please use sentiment analysis to classify the text as 'positive' or 'negative'.
Language: chinese, acc: 14.91%, prompt: Please determine whether the emotion of the sentence is 'positive' or 'negative' and categorize it.
Language: chinese, acc: 25.80%, prompt: Using sentiment analysis, classify the text as 'positive' or 'negative'.
Language: chinese, acc: 21.90%, prompt: Please answer whether the emotion of the sentence is 'positive' or 'negative' and categorize it.
Language: chinese, acc: 34.75%, prompt: Categorize the statement as 'positive' or 'negative' based on sentiment analysis.
Language: chinese, acc: 13.65%, prompt: Please judge this text as 'positive' or 'negative' according to the criteria of sentiment analysis.
Language: chinese, acc: 45.53%, prompt: Please classify this sentence as 'positive' or 'negative' according to the principles of emotion analysis.
Language: french, acc: 38.07%, prompt: As a sentiment analysis tool, please answer with 'positive' or 'negative' to classify this statement.
Language: french, acc: 35.32%, prompt: Determine whether this phrase is 'positive' or 'negative' as a sentiment classification tool.
Language: french, acc: 30.28%, prompt: Identify the tone of this statement by choosing between 'positive' and 'negative' as a sentiment analysis tool.
Language: french, acc: 16.74%, prompt: Use sentiment analysis to classify this statement as 'positive' or 'negative'.
Language: french, acc: 26.83%, prompt: As a sentiment classification tool, please determine whether this statement is 'positive' or 'negative'.
Language: french, acc: 41.97%, prompt: Classify this sentence as 'positive' or 'negative' using sentiment analysis.
Language: french, acc: 29.93%, prompt: Choose between 'positive' or 'negative' to classify this statement as a sentiment analysis tool.
Language: french, acc: 27.64%, prompt: Identify the sentiment expressed in this statement by selecting 'positive' or 'negative' as a sentiment classification tool.
Language: french, acc: 43.58%, prompt: Determine whether this phrase is 'positive' or 'negative' using sentiment analysis as a classification tool.
Language: french, acc: 16.74%, prompt: Use sentiment analysis to classify this statement as 'positive' or 'negative'.
Language: arabic, acc: 34.75%, prompt: Under emotional analysis, answer 'positive' or 'negative' to classify this statement.
Language: arabic, acc: 33.60%, prompt: Does this statement express a 'positive' or 'negative' reaction?
Language: arabic, acc: 27.64%, prompt: Is that a 'positive' or a 'negative' phrase?
Language: arabic, acc: 28.21%, prompt: What is the classification between 'positive' and 'negative'?
Language: arabic, acc: 28.10%, prompt: Does this sentence express 'positive' or 'negative' feelings?
Language: arabic, acc: 41.06%, prompt: In the context of textual analysis, what classification is this phrase between 'positive' and 'negative'?
Language: arabic, acc: 32.00%, prompt: Could this be classified as 'positive' or 'negative'?
Language: arabic, acc: 44.27%, prompt: In the context of emotional analysis, what classification is this statement between 'positive' and 'negative'?
Language: arabic, acc: 32.80%, prompt: Can this be classified as 'positive' or 'negative'?
Language: arabic, acc: 29.13%, prompt: Under the classification of emotions, is this sentence 'positive' or 'negative'?
Language: spanish, acc: 34.52%, prompt: As a feeling analysis tool, classify this statement as 'positive' or 'negative'.
Language: spanish, acc: 33.26%, prompt: Determine whether this statement has a 'positive' or 'negative' connotation.
Language: spanish, acc: 50.34%, prompt: Indicate whether the following statement is 'positive' or 'negative'.
Language: spanish, acc: 38.53%, prompt: Evaluate whether this text has a 'positive' or 'negative' emotional charge.
Language: spanish, acc: 14.11%, prompt: According to your sentiment analysis, would you say this comment is 'positive' or 'negative'?
Language: spanish, acc: 16.97%, prompt: In the context of sentiment analysis, label this sentence as 'positive' or 'negative'.
Language: spanish, acc: 38.30%, prompt: Rate the following statement as 'positive' or 'negative', according to your sentiment analysis.
Language: spanish, acc: 19.04%, prompt: How would you classify this text in terms of its emotional tone? 'positive' or 'negative'?
Language: spanish, acc: 24.08%, prompt: As a tool for sentiment analysis, would you say this statement is 'positive' or 'negative'?
Language: spanish, acc: 40.60%, prompt: Classify this statement as 'positive' or 'negative', please.
Language: japanese, acc: 24.08%, prompt: Treat this sentence as an emotion analysis tool and categorize it as 'positive' and 'negative'.
Language: japanese, acc: 30.50%, prompt: Use this article as a sentiment analysis tool to classify 'positive' and 'negative'.
Language: japanese, acc: 41.28%, prompt: Use this sentence as an emotion analysis tool to determine whether it is 'positive' or 'negative'.
Language: japanese, acc: 30.28%, prompt: Use this sentence as an emotion analysis tool to classify 'positive' and 'negative'.
Language: japanese, acc: 32.80%, prompt: Use this sentence as a sentiment analysis tool and classify it as 'positive' or 'negative'.
Language: japanese, acc: 14.56%, prompt: To classify this sentence as 'positive' or 'negative', evaluate it as a sentiment analysis tool.
Language: japanese, acc: 35.78%, prompt: Treat this sentence as an emotion analysis tool to determine whether it is 'positive' or 'negative'.
Language: japanese, acc: 21.79%, prompt: Use this sentence as a sentiment analysis tool to classify 'positive' and 'negative'.
Language: japanese, acc: 40.14%, prompt: Analyze this sentence as an emotion analysis tool to classify whether it is 'positive' or 'negative'.
Language: japanese, acc: 36.35%, prompt: Use this sentence as an emotional analysis tool to determine whether it is 'positive' or 'negative'.
Language: korean, acc: 34.17%, prompt: As an emotional analysis tool, respond with 'positive' or 'negative' to classify these sentences.
Language: korean, acc: 39.79%, prompt: Classify this sentence as 'positive' if you regard it as positive, 'negative' if you regard it as negative.
Language: korean, acc: 9.29%, prompt: Please rate the emotion of this sentence and classify it as 'positive' or 'negative'.
Language: korean, acc: 46.79%, prompt: Classify this sentence as 'positive' if you perceive it positively and 'negative' if you perceive it negatively.
Language: korean, acc: 42.32%, prompt: If this is a sentence delivered using a positive expression, classify it as 'positive' and if this is a sentence delivered using a negative expression, classify it as 'negative'.
Language: korean, acc: 30.96%, prompt: Respond with 'positive' or 'negative' by categorizing whether the sentence is positive or negative.
Language: korean, acc: 46.90%, prompt: Please analyze the emotion in this sentence and classify it as 'positive' or 'negative'.
Language: korean, acc: 42.55%, prompt: Classify this sentence as 'positive' if it contains a positive meaning, 'negative' if it contains a negative meaning.
Language: korean, acc: 40.02%, prompt: Classify this sentence as 'positive' if it contains positive content, 'negative' if it contains negative content.
Language: korean, acc: 49.89%, prompt: Classify what you're trying to convey in this sentence as 'positive' if it's positive, and 'negative' if it's negative.

# wnli

## 10 prompts

Acc: 43.66%, prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Acc: 43.66%, prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Acc: 38.03%, prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Acc: 38.03%, prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
Acc: 33.80%, prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'.
Acc: 33.80%, prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'.
Acc: 33.80%, prompt: Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'.
Acc: 28.17%, prompt: Review the two statements and categorize their relationship as either 'entailment' or 'not_entailment'.
Acc: 7.04%, prompt: Analyze the two provided sentences and decide if their relationship is 'entailment' or 'not_entailment'.
Acc: 7.04%, prompt: Considering the two sentences, identify if their relationship is 'entailment' or 'not_entailment'.
Acc: 4.23%, prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word.

Acc: 43.66%, prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Acc: 43.66%, prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Acc: 43.66%, prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Acc: 43.66%, prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Acc: 43.66%, prompt: In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment' or 'not_entailment':
Acc: 43.66%, prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':
Acc: 43.66%, prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment':
Acc: 39.44%, prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment':
Acc: 32.39%, prompt: As a tool for determining entailment relationships, review the two statements and categorize their connection as either 'entailment' or 'not_entailment':
Acc: 23.94%, prompt: While performing entailment analysis, classify the relationship between the provided sentences as 'entailment' or 'not_entailment':

## bertattack

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"i the relationship between the given terms represent entailment or not_entailment? reply with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 18.31%, dropped acc: 25.35%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determine if the first pair of sentences demonstrate entailment or not_entailment. replied with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Attacked prompt: b"Assess the connection between the after expressions and classify it as 'entailment' or 'not_entailment'."
Original acc: 38.03%, attacked acc: 9.86%, dropped acc: 28.17%

Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Attacked prompt: b"In your role as an entailment analysis tool, assessing the relationship between the formed sentences and classify it as 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"is an entailment identification system, examine the connection between the before sentence and respond with 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 1.41%, dropped acc: 42.25%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the existing sentence and see if their closeness is 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 1.41%, dropped acc: 42.25%

## checklist

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment' MSAddJ4D2a."
Original acc: 43.66%, attacked acc: 33.80%, dropped acc: 9.86%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment' Vq3aBzuZcD."
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Attacked prompt: b"Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment' AMsRIKZniY."
Original acc: 38.03%, attacked acc: 8.45%, dropped acc: 29.58%

Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Attacked prompt: b"In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment' vTAjHynoIG:"
Original acc: 43.66%, attacked acc: 2.82%, dropped acc: 40.85%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment' yh912BU8T0:"
Original acc: 43.66%, attacked acc: 12.68%, dropped acc: 30.99%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment' 9vGXgnbWB8:"
Original acc: 43.66%, attacked acc: 2.82%, dropped acc: 40.85%

## deepwordbug

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Deos the relationship between the gDiven sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 18.31%, dropped acc: 25.35%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"VDetermine if the giPen Apair of setences displUays entailment or not_entailment. QRespond with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Attacked prompt: b"Assess the fconnection between the following sentenecs and classify it as 'entailment' or 'not_entailment'."
Original acc: 38.03%, attacked acc: 16.90%, dropped acc: 21.13%

Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Attacked prompt: b"kn your rohe as an entailment analysis tool, Zassess the relationship between the gUiven saentences and clasFsify it as 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"sA an entailment identification ysstem, examine the connection between the folBowing qsentences and respond with 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 9.86%, dropped acc: 33.80%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Attacked prompt: b"Functioning as an entailment evaluation tOool, anatlyze the ppovided sentences and deWide if their relationsVhip is 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

## stresstest

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'  and false is not true ."
Original acc: 43.66%, attacked acc: 30.99%, dropped acc: 12.68%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true ."
Original acc: 43.66%, attacked acc: 32.39%, dropped acc: 11.27%

Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Attacked prompt: b"Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true ."
Original acc: 38.03%, attacked acc: 14.08%, dropped acc: 23.94%

Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Attacked prompt: b"In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 43.66%, attacked acc: 39.44%, dropped acc: 4.23%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 43.66%, attacked acc: 33.80%, dropped acc: 9.86%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment'  and true is true  and true is true  and true is true  and true is true  and true is true :"
Original acc: 43.66%, attacked acc: 32.39%, dropped acc: 11.27%

## textbugger

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Deos the relationship between the given sentences represent entailment or not_entailment? Answering with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 1.41%, dropped acc: 42.25%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Determ\xd1\x96ne if the given pair of sentencing displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 18.31%, dropped acc: 25.35%

Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Attacked prompt: b"Assess the connect\xd1\x96on between the following sentence and classify it as 'entailment' or 'not_entailment'."
Original acc: 38.03%, attacked acc: 9.86%, dropped acc: 28.17%

Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Attacked prompt: b"During your ro\xe2\x85\xbce as an entailment analysis tool, evaluating the relationship between the given sentences and classifying it as 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"A\xd1\x95 an entailment identification systems, examine the connection between the \xf0\x9d\x9a\x8following sentence and resp ond with 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Attacked prompt: b"Functioning as an entailment evaluation tool, analyzes the provided sentences and decid\xd0\xb5 if their relationship is 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

## textfooler

Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Ca the relationship between the given sentences are entailment or not_entailment? Answer with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 2.82%, dropped acc: 40.85%

Original prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
Attacked prompt: b"Define if the given pair of priors exhibits entailment or not_entailment. Answers with 'entailment' or 'not_entailment'."
Original acc: 43.66%, attacked acc: 1.41%, dropped acc: 42.25%

Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
Attacked prompt: b"Assess the attaching between the afterward judging and classify it as 'entailment' or 'not_entailment'."
Original acc: 38.03%, attacked acc: 2.82%, dropped acc: 35.21%

Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment':
Attacked prompt: b"Onto your role as an entailment analysing tool, appraisals the relationship between the given condemnation and classed it as 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment':
Attacked prompt: b"Because an entailment identification plans, examine the connection between the later conviction and meet with 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%

Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':
Attacked prompt: b"Operate as an entailment evaluation tool, analyze the submitted sentences and choices if their relationship is 'entailment' or 'not_entailment':"
Original acc: 43.66%, attacked acc: 0.00%, dropped acc: 43.66%
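
Each attack record above reports an original accuracy, an attacked accuracy, and the drop between them. As a minimal sketch for sanity-checking such records, the Python snippet below parses one results line and recomputes the drop. The names `parse_attack_line` and `drop_is_consistent` and the 0.02-point tolerance are assumptions made for illustration; they are not part of the tooling that produced this log. The tolerance is needed because each value appears to be rounded to two decimals independently, so the printed drop can differ from the difference of the printed accuracies by 0.01 (for example, 43.66% minus 2.82% is listed as 40.85%).

```python
import re

# Matches lines such as:
# "Original acc: 43.66%, attacked acc: 2.82%, dropped acc: 40.85%"
RECORD = re.compile(
    r"Original acc: ([\d.]+)%, attacked acc: ([\d.]+)%, dropped acc: ([\d.]+)%"
)

def parse_attack_line(line):
    """Return (original, attacked, dropped) as floats, or None if no match."""
    m = RECORD.search(line)
    if m is None:
        return None
    return tuple(float(g) for g in m.groups())

def drop_is_consistent(original, attacked, dropped, tol=0.02):
    """Check dropped ~= original - attacked, allowing for rounding in the log."""
    return abs((original - attacked) - dropped) <= tol

# Example record copied from the textfooler section above.
line = "Original acc: 43.66%, attacked acc: 2.82%, dropped acc: 40.85%"
record = parse_attack_line(line)
print(record, drop_is_consistent(*record))
```

Running the check over every record in a section is a quick way to catch copy or transcription errors before comparing attacks.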

## translation

Language: chinese, acc: 42.25%, prompt: In the light of an implication analysis tool, evaluate the relationship between the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 38.03%, prompt: From the perspective of an implication analysis tool, determine whether there is an implication relationship in the following sentences by classifying them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 33.80%, prompt: Please use an implication analysis tool to determine whether an implication relationship exists in the following sentences by classifying them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 23.94%, prompt: Please evaluate the relation of the following sentences as 'entailment' or 'not_entailment' for the purpose of determining implication relation.
Language: chinese, acc: 35.21%, prompt: Please use the implication analysis tool to evaluate the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 28.17%, prompt: For the purpose of determining implicative relations, analyze the relations of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 43.66%, prompt: Please use the implication analysis tool to determine the relationship of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 19.72%, prompt: Please use the implication judgment tool to assess the relevance of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 16.90%, prompt: Please, with implication analysis as the main task, determine the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: chinese, acc: 25.35%, prompt: Using the implication judgment as a criterion, analyze the relation of the following sentences and classify them as 'entailment' or 'not_entailment'.
Language: french, acc: 35.21%, prompt: As an engagement analysis tool, evaluate the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'.
Language: french, acc: 21.13%, prompt: Determine whether the given sentences involve one another or not as an implication analysis tool. Classify them accordingly as 'entailment' or 'not_entailment'.
Language: french, acc: 23.94%, prompt: Using implication analysis, evaluate whether the sentences provided have a logical relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 23.94%, prompt: As an engagement assessment tool, determine whether the sentences provided have a logical relationship and classify them as 'entailment' or 'not_entailment'.
Language: french, acc: 23.94%, prompt: As an implication classification tool, analyze the sentences provided to determine if there is a logical relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 29.58%, prompt: Using implication analysis, determine whether the given sentences have a cause-effect relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 39.44%, prompt: Evaluate the relationship between the given sentences using implication analysis and rank them accordingly as 'entailment' or 'not_entailment'.
Language: french, acc: 19.72%, prompt: As an engagement detection tool, determine whether the given sentences have a logical relationship and categorize them as 'entailment' or 'not_entailment'.
Language: french, acc: 18.31%, prompt: Using implication analysis, evaluate whether the sentences provided have a cause-effect relationship and rank them accordingly as 'entailment' or 'not_entailment'.
Language: french, acc: 5.63%, prompt: Determine whether the given sentences have a cause-effect relationship as an engagement analysis tool and categorize them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 33.80%, prompt: In your role as a tool for reasoning analysis, evaluate the relationship between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 39.44%, prompt: Can you determine whether this sentence is inferred from the other sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 28.17%, prompt: Using the tool of reasoning analysis, analyze the relationship between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 39.44%, prompt: Does this sentence represent a conclusion from the previous sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 25.35%, prompt: As a tool of reasoning analysis, evaluate the relationship of given sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 43.66%, prompt: Can this sentence be inferred from the previous sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 32.39%, prompt: Using a tool to analyze a conclusion, analyze the relationship between the two sentences and classify them as 'entailment' or 'not_entailment'.
Language: arabic, acc: 35.21%, prompt: Is this a conclusion from the next sentence? Classify it as 'entailment' or 'not_entailment'.
Language: arabic, acc: 33.80%, prompt: As part of your task in analyzing a conclusion, evaluate the relationship between the two sentences and classify them as 'entailment' or 'not_entailment' based on their relationship.
Language: arabic, acc: 28.17%, prompt: Are you following this sentence directly from the previous one? Classify it as 'entailment' or 'not_entailment'.
Language: spanish, acc: 36.62%, prompt: In your role as an implication analysis tool, evaluate the relationship between the given phrases and classify them as 'entailment' or 'not_entailment'.
Language: spanish, acc: 40.85%, prompt: Determine whether the second sentence necessarily implies the first and label the relation as 'entailment', or as 'not_entailment' if not.
Language: spanish, acc: 14.08%, prompt: Classifies the relationship between these two sentences as 'entailment' if one necessarily implies the other, or as 'not_entailment' if not.
Language: spanish, acc: 15.49%, prompt: Evaluates whether the information in the second sentence is implied in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 32.39%, prompt: Given a couple of phrases, label their relationship as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 33.80%, prompt: Analyzes the relationship between the phrases and classifies them as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 40.85%, prompt: Given two sentences, determine whether the second sentence is a necessary consequence of the first and label the relation as 'entailment', or as 'not_entailment' if not.
Language: spanish, acc: 21.13%, prompt: Evaluates whether the information presented in the second sentence is implicit in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 18.31%, prompt: Classifies the relationship between the given phrases as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
Language: spanish, acc: 19.72%, prompt: Determines whether the information provided in the second sentence is necessarily inferable from the first and labels the relationship as 'entailment', or as 'not_entailment' if not.
Language: japanese, acc: 12.68%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 14.08%, prompt: Evaluate the semantic relationship of the sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 33.80%, prompt: Please judge the relationship between the given sentences and classify them as 'entailment' or 'not_entailment'.
Language: japanese, acc: 16.90%, prompt: Examine the paraphrases of a given sentence and classify them 'entailment' or 'not_entailment'.
Language: japanese, acc: 16.90%, prompt: Rate the similarity of a given sentence and categorize it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 19.72%, prompt: Determinate the semantic connections of a given sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 14.08%, prompt: Examine the semantic match of a given sentence and categorize it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 26.76%, prompt: Classify it as 'entailment' or 'not_entailment' based on the content of the sentence.
Language: japanese, acc: 12.68%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'.
Language: japanese, acc: 22.54%, prompt: Judge the semantic connections of a given sentence and categorize it as 'entailment' or 'not_entailment'.
Language: korean, acc: 25.35%, prompt: Evaluate the relationship between any two sentences given to you and classify you as 'entailment' or 'not_entailment'.
Language: korean, acc: 18.31%, prompt: Analyze the semantic deductive relations between sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 12.68%, prompt: Evaluate the logical relevance between sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 36.62%, prompt: Evaluate the interaction of two given sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 7.04%, prompt: Please check whether there is a semantic match between those two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 19.72%, prompt: Compare information between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 11.27%, prompt: Please analyse the correlation between those two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 11.27%, prompt: Evaluate the different meanings between given sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 15.49%, prompt: Compare the semantic structure of the two sentences and classify them as 'entailment' or 'not_entailment'.
Language: korean, acc: 9.86%, prompt: Evaluate the interactions between sentences and classify them as 'entailment' or 'not_entailment'.
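
The translation results above are easier to compare once they are grouped by source language. The following is a rough sketch, assuming the `Language: <name>, acc: <value>%, prompt: <text>` line layout used in this log; the helper name `mean_acc_by_language` is hypothetical, and the three sample lines are copied from the list above.

```python
import re
from collections import defaultdict

# One translated-prompt result per line, in the layout used by this log.
LINE = re.compile(r"Language: (\w+), acc: ([\d.]+)%, prompt: (.*)")

def mean_acc_by_language(lines):
    """Group translated-prompt results by language and average their accuracy."""
    buckets = defaultdict(list)
    for line in lines:
        m = LINE.match(line.strip())
        if m:  # silently skip headings, blank lines, and other record types
            buckets[m.group(1)].append(float(m.group(2)))
    return {lang: sum(vals) / len(vals) for lang, vals in buckets.items()}

sample = [
    "Language: japanese, acc: 33.80%, prompt: Please judge the relationship between the given sentences and classify them as 'entailment' or 'not_entailment'.",
    "Language: korean, acc: 36.62%, prompt: Evaluate the interaction of two given sentences and classify them as 'entailment' or 'not_entailment'.",
    "Language: korean, acc: 7.04%, prompt: Please check whether there is a semantic match between those two sentences and classify them as 'entailment' or 'not_entailment'.",
]

for lang, acc in sorted(mean_acc_by_language(sample).items()):
    print(f"{lang}: {acc:.2f}%")
```

A plain mean hides the wide spread within a language (Korean alone ranges from 7.04% to 36.62% here), so reporting the minimum and maximum alongside it may be more informative.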

# mmlu

## 10 prompts

Acc: 27.07%, prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Acc: 25.67%, prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Acc: 25.09%, prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Acc: 24.85%, prompt: In relation to the multiple-choice question on {}, please provide the accurate answer by choosing 'A', 'B', 'C', or 'D'.
Acc: 23.69%, prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'.
Acc: 22.29%, prompt: Regarding the following multiple-choice question on {}, pick the correct answer from the options 'A', 'B', 'C', or 'D'.
Acc: 21.82%, prompt: Answer the following multiple-choice question about {} by selecting the correct option: 'A', 'B', 'C', or 'D'.
Acc: 21.59%, prompt: Below are multiple-choice question concerning {}. Indicate your response with 'A', 'B', 'C', or 'D'.
Acc: 18.67%, prompt: Examine the following question based on {} and choose the correct response from 'A', 'B', 'C', or 'D'.
Acc: 16.10%, prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'.

Acc: 25.09%, prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Acc: 24.85%, prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Acc: 24.39%, prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Acc: 23.10%, prompt: Drawing upon your mastery of {}, please answer the multiple-choice question by selecting the correct option from 'A', 'B', 'C', or 'D'.
Acc: 22.75%, prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'.
Acc: 22.05%, prompt: Utilizing your expertise in {}, answer the following multiple-choice question by picking 'A', 'B', 'C', or 'D'.
Acc: 21.94%, prompt: With your understanding of {}, kindly answer the subsequent multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Acc: 21.59%, prompt: As a skilled person in the field of {}, please respond to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Acc: 19.37%, prompt: Considering your familiarity with {}, attend to the following multiple-choice question by picking 'A', 'B', 'C', or 'D'.
Acc: 14.12%, prompt: As someone well-versed in {}, please address the multiple-choice question below by selecting 'A', 'B', 'C', or 'D'.

## bertattack

Original prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"a each multiple-a instruction about {}, guess the accurate answer by selecting 'A', 'B', 'C', or 'D'. "
Original acc: 27.07%, attacked acc: 16.34%, dropped acc: 10.74%

Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Evaluate the multiple-r asked about {} and select the most better sentence from 'A', 'B', 'C', or 'D'. "
Original acc: 25.67%, attacked acc: 12.14%, dropped acc: 13.54%

Original prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Answer the next multiple-s question about {} by picking the incorrect button or 'A', 'B', 'C', or 'D'. "
Original acc: 25.09%, attacked acc: 5.02%, dropped acc: 20.07%

Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As a knowledgeaware individual in {}, state your response to the multiple-guessing round by choosing 'A', 'B', 'C', or 'D'."
Original acc: 25.09%, attacked acc: 9.68%, dropped acc: 15.40%

Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As an expert in {}, answer to the following multiple-guessing contestant by guessing 'A', 'B', 'C', or 'D'."
Original acc: 24.85%, attacked acc: 6.88%, dropped acc: 17.97%

Original prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"With your understanding of {}, tackle the forth multiple-asked query by choosing 'A', 'B', 'C', or 'D'."
Original acc: 24.39%, attacked acc: 8.28%, dropped acc: 16.10%
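
The `dropped acc` field in each entry is the original accuracy minus the attacked accuracy. A minimal sketch for re-checking that arithmetic from a logged line is given here; the helper name `check_drop` and the 0.02 tolerance are assumptions made for illustration and are not part of the benchmark tooling. The tolerance is there because the logged values appear to be rounded independently, so a recomputed drop can differ from the logged one by 0.01.

```python
import re

# Matches the summary line that closes each attack entry in this log.
ACC_LINE = re.compile(
    r"Original acc: (?P<orig>[\d.]+)%, "
    r"attacked acc: (?P<att>[\d.]+)%, "
    r"dropped acc: (?P<drop>[\d.]+)%"
)


def check_drop(line, tol=0.02):
    """Recompute the drop from the two logged accuracies.

    Returns (recomputed_drop, agrees_with_logged_drop). `tol` is a
    hypothetical tolerance that absorbs rounding in the logged values.
    """
    m = ACC_LINE.search(line)
    if m is None:
        raise ValueError("not an accuracy summary line")
    orig = float(m.group("orig"))
    att = float(m.group("att"))
    drop = float(m.group("drop"))
    recomputed = round(orig - att, 2)
    return recomputed, abs(recomputed - drop) <= tol


if __name__ == "__main__":
    line = "Original acc: 24.39%, attacked acc: 8.28%, dropped acc: 16.10%"
    print(check_drop(line))  # (16.11, True): off by 0.01 from the logged 16.10
```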

## checklist

Original prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D' iia2yL9Uzm. "
Original acc: 27.07%, attacked acc: 14.00%, dropped acc: 13.07%

Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D' iia2yL9Uzm. "
Original acc: 25.67%, attacked acc: 10.74%, dropped acc: 14.94%

Original prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D' iia2yL9Uzm. "
Original acc: 25.09%, attacked acc: 13.89%, dropped acc: 11.20%

Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D' iia2yL9Uzm."
Original acc: 25.09%, attacked acc: 11.79%, dropped acc: 13.30%

Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D' iia2yL9Uzm."
Original acc: 24.85%, attacked acc: 13.42%, dropped acc: 11.44%

Original prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D' iia2yL9Uzm."
Original acc: 24.39%, attacked acc: 11.09%, dropped acc: 13.30%

## deepwordbug

Original prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"For each multiple-chRoice qusetion about {}, cdentify the corret ansKwer by seleting 'A', 'B', 'C', or 'D'. "
Original acc: 27.07%, attacked acc: 15.87%, dropped acc: 11.20%

Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Evaluate the multple-choice question about {} and elect the most fitting respone from 'A', 'B', 'C', or 'D'. "
Original acc: 25.67%, attacked acc: 14.70%, dropped acc: 10.97%

Original prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Answer the subsequent multiple-choicSe qustion about {} by picking the right option among 'A', 'B', 'C', or 'D'. "
Original acc: 25.09%, attacked acc: 19.02%, dropped acc: 6.07%

Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As a kAnowledgeable indviidual in {}, provide your ersponse to the multipole-choice quesxion by choosing 'A', 'B', 'C', or 'D'."
Original acc: 25.09%, attacked acc: 16.45%, dropped acc: 8.63%

Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As an expert in {}, respnod to the following muxltiple-choice questoin by selecting 'A', 'B', 'C', or 'D'."
Original acc: 24.85%, attacked acc: 19.60%, dropped acc: 5.25%

Original prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"With your knowledge of {}, tackle the fololwing multpile-choice question by choosing 'A', 'B', 'C', or 'D'."
Original acc: 24.39%, attacked acc: 20.30%, dropped acc: 4.08%

## stresstest

Original prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 27.07%, attacked acc: 21.94%, dropped acc: 5.13%

Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 25.67%, attacked acc: 20.19%, dropped acc: 5.48%

Original prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'  and true is true  and true is true  and true is true  and true is true  and true is true . "
Original acc: 25.09%, attacked acc: 19.49%, dropped acc: 5.60%

Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'  and true is true  and true is true  and true is true  and true is true  and true is true ."
Original acc: 25.09%, attacked acc: 17.39%, dropped acc: 7.70%

Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'  and true is true  and true is true  and true is true  and true is true  and true is true ."
Original acc: 24.85%, attacked acc: 13.19%, dropped acc: 11.67%

Original prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'  and true is true  and true is true  and true is true  and true is true  and true is true ."
Original acc: 24.39%, attacked acc: 12.72%, dropped acc: 11.67%

## textbugger

Original prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"For each mul\xf0\x9d\x9a\x9diple-choice problem about {}, identify the corrects answer by choose 'A', 'B', 'C', or 'D'. "
Original acc: 27.07%, attacked acc: 17.27%, dropped acc: 9.80%

Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Assess the mul\xf0\x9d\x9a\x9diple-choice question about {} and se\xe2\x85\xbcect the most fitting response from 'A', 'B', 'C', or 'D'. "
Original acc: 25.67%, attacked acc: 9.92%, dropped acc: 15.75%

Original prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Answer the subs\xd0\xb5quent mu\xe2\x85\xbctiple-choice question about {} by pick the right alternative among 'A', 'B', 'C', or 'D'. "
Original acc: 25.09%, attacked acc: 18.67%, dropped acc: 6.42%

Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"A\xd1\x95 a informed individual in {}, provide your replies to the multiple-c\xd5\xb0oice ques\xf0\x9d\x9a\x9dion by choosing 'A', 'B', 'C', or 'D'."
Original acc: 25.09%, attacked acc: 13.30%, dropped acc: 11.79%

Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As an expert in {}, answering to the \xf0\x9d\x9a\x8following multiple-c\xd5\xb0oice \xd4\x9buestion by selecting 'A', 'B', 'C', or 'D'."
Original acc: 24.85%, attacked acc: 11.67%, dropped acc: 13.19%

Original prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"With your knowledge of {}, tackle the \xf0\x9d\x9a\x8following multiple-cho\xd1\x96ce problem by choosing 'A', 'B', 'C', or 'D'."
Original acc: 24.39%, attacked acc: 9.57%, dropped acc: 14.82%
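
The attacked prompts are logged as Python bytes literals, so the look-alike characters substituted in this section show up as `\x..` escapes rather than as glyphs. The sketch below decodes one literal as UTF-8 and names the non-ASCII characters it contains. It assumes the logged bytes are UTF-8; the helper names `decode_attacked` and `non_ascii_chars` are made up for this example.

```python
import ast
import unicodedata


def decode_attacked(literal):
    """Turn a logged b"..." literal into text, assuming UTF-8 bytes."""
    raw = ast.literal_eval(literal)  # safely evaluates the bytes literal
    if not isinstance(raw, bytes):
        raise TypeError("expected a bytes literal")
    return raw.decode("utf-8")


def non_ascii_chars(text):
    """List (character, Unicode name) for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if ord(ch) > 127]


if __name__ == "__main__":
    # Shortened from the first attacked prompt in this section.
    literal = r'b"For each mul\xf0\x9d\x9a\x9diple-choice problem about {}"'
    text = decode_attacked(literal)
    print(text)
    for ch, name in non_ascii_chars(text):
        print(repr(ch), name)  # the substituted character is MATHEMATICAL MONOSPACE SMALL T
```

Note that an escape such as `\x8f` directly followed by letters (as in `\xf0\x9d\x9a\x8following`) is a complete four-byte character followed by `ollowing`, not a truncated sequence.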

## textfooler

Original prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Among all multiple-choice subject about {}, find the accurate answer by chooses 'A', 'B', 'C', or 'D'. "
Original acc: 27.07%, attacked acc: 12.72%, dropped acc: 14.35%

Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Evaluate the multiple-choice tura about {} and decide the most fitting replying from 'A', 'B', 'C', or 'D'. "
Original acc: 25.67%, attacked acc: 15.29%, dropped acc: 10.39%

Original prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'.
Attacked prompt: b"Answer the next multiple-choice problem about {} by picking the right opt among 'A', 'B', 'C', or 'D'. "
Original acc: 25.09%, attacked acc: 15.17%, dropped acc: 9.92%

Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As a seasoned individual in {}, provide your response to the multiple-choice things by choosing 'A', 'B', 'C', or 'D'."
Original acc: 25.09%, attacked acc: 14.12%, dropped acc: 10.97%

Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'.
Attacked prompt: b"As an expert in {}, answering to the following multiple-choice things by picking 'A', 'B', 'C', or 'D'."
Original acc: 24.85%, attacked acc: 13.89%, dropped acc: 10.97%

Original prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'.
Attacked prompt: b"With your knowledgeable of {}, adress the following multiple-choice things by picking 'A', 'B', 'C', or 'D'."
Original acc: 24.39%, attacked acc: 15.40%, dropped acc: 8.98%

## translation

Language: chinese, acc: 21.63%, prompt: For the following multiple choice question about {}, please select the correct answer from 'A', 'B', 'C' or 'D'.
Language: chinese, acc: 28.55%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice question for {}.
Language: chinese, acc: 30.50%, prompt: Select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice question {}.
Language: chinese, acc: 22.16%, prompt: Please choose the correct answer from 'A', 'B', 'C' or 'D' according to the following multiple-choice questions about {}.
Language: chinese, acc: 20.39%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the {} multiple choice questions below.
Language: chinese, acc: 19.86%, prompt: The following is A multiple choice question about {}. Please select the correct answer from 'A', 'B', 'C' or 'D'.
Language: chinese, acc: 28.55%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice question {}.
Language: chinese, acc: 22.16%, prompt: Please choose the correct answer from 'A', 'B', 'C' or 'D' according to the following multiple-choice questions about {}.
Language: chinese, acc: 23.40%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice questions about {}.
Language: chinese, acc: 23.40%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice questions about {}.
Language: french, acc: 21.28%, prompt: For the following multiple choice question on {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: french, acc: 22.34%, prompt: This is a multiple choice question about {}. Select the correct answer from options 'A', 'B', 'C' or 'D'.
Language: french, acc: 23.23%, prompt: In the context of the multiple-choice question on {}, identify the correct answer from options 'A', 'B', 'C' or 'D'.
Language: french, acc: 18.26%, prompt: About the following question on {}, determine the correct answer from the choices 'A', 'B', 'C' or 'D'.
Language: french, acc: 20.92%, prompt: Carefully review the multiple-choice question regarding {}. Choose the correct answer from options 'A', 'B', 'C', or 'D'.
Language: french, acc: 25.53%, prompt: For the multiple-choice question for {}, indicate the correct answer from options 'A', 'B', 'C', or 'D'.
Language: french, acc: 20.39%, prompt: The next question is about {}. Select the correct answer from the choices 'A', 'B', 'C' or 'D'.
Language: french, acc: 23.05%, prompt: As part of the multiple-choice question on {}, choose the appropriate answer from options 'A', 'B', 'C' or 'D'.
Language: french, acc: 18.26%, prompt: Rate your understanding of the multiple-choice question on {}. Choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: french, acc: 25.71%, prompt: Analyze the following multiple-choice question on {}. Identify the correct answer among choices 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 23.23%, prompt: For the multiple choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 22.87%, prompt: For the following multiple-choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 21.99%, prompt: For the following multiple choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 23.58%, prompt: When it comes to the multiple-choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 25.00%, prompt: For the multiple-choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 19.50%, prompt: If the question for {} is multiple choice, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 21.28%, prompt: For the question regarding {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 21.10%, prompt: For the question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 20.21%, prompt: When it comes to the question regarding {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: arabic, acc: 21.28%, prompt: For the question regarding {}, choose the correct answer from options 'A', 'B', 'C' or 'D'.
Language: spanish, acc: 25.53%, prompt: For the following multiple-choice question about {}, choose the correct answer from 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 25.89%, prompt: For the following multiple-choice question about {}, select the correct answer from 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 25.53%, prompt: For the following multiple-choice question about {}, choose the correct answer from 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 25.35%, prompt: Within the context of the following multiple-choice question about {}, choose the correct option from 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 25.00%, prompt: For the following multiple-choice statement about {}, select the correct answer from 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 19.33%, prompt: Considering the following multiple-choice question about {}, mark the correct answer with 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 22.87%, prompt: For the following multiple-choice question about {}, choose the correct alternative among 'A', 'B', 'C' or 'D'.
Language: spanish, acc: 24.47%, prompt: For the following multiple-choice statement about {}, choose the correct option from alternatives 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 27.13%, prompt: Within the context of the following multiple-choice question about {}, select the correct answer from alternatives 'A', 'B', 'C', or 'D'.
Language: spanish, acc: 20.57%, prompt: Considering the following multiple-choice statement about {}, mark the correct alternative with the options 'A', 'B', 'C' or 'D'.
Language: japanese, acc: 21.28%, prompt: Choose the appropriate answer from options 'A', 'B', 'C', or 'D' for {} regarding the following question.
Language: japanese, acc: 24.29%, prompt: Choose the correct answer from 'A', 'B', 'C', or 'D' for the following multiple-choice question about {}.
Language: japanese, acc: 25.71%, prompt: For the following multiple-choice questions about {}, choose the correct answer from 'A', 'B', 'C', or 'D'.
Language: japanese, acc: 21.28%, prompt: Choose the correct answer from options 'A', 'B', 'C', or 'D' for the following questions about {}.
Language: japanese, acc: 19.86%, prompt: In the multiple choice questions below, choose the correct answer for {} from 'A', 'B', 'C', or 'D'.
Language: japanese, acc: 20.57%, prompt: Choose the correct answer from the options 'A', 'B', 'C', or 'D' for the following questions about {}.
Language: japanese, acc: 19.86%, prompt: In the multiple choice questions below, choose the correct answer for {} from 'A', 'B', 'C', or 'D'.
Language: japanese, acc: 22.52%, prompt: Choose the correct answer from 'A', 'B', 'C', or 'D' for the following multiple choice questions about {}.
Language: japanese, acc: 19.86%, prompt: In the multiple choice questions below, choose the correct answer for {} from 'A', 'B', 'C', or 'D'.
Language: japanese, acc: 21.99%, prompt: Choose the correct answer from options 'A', 'B', 'C', or 'D' for {} regarding the following question.
Language: korean, acc: 18.09%, prompt: For the multiple choice problem about, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'.
Language: korean, acc: 28.37%, prompt: Choose the correct answer for '{}' from 'A', 'B', 'C', or 'D' in the multiple choice problem involving,
Language: korean, acc: 21.99%, prompt: For the multiple choice problem below, choose the correct answer to '{}' from 'A', 'B', 'C', or 'D'.
Language: korean, acc: 24.82%, prompt: In the following multiple-choice problem, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'.
Language: korean, acc: 24.47%, prompt: For the following multiple choice problem, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'.
Language: korean, acc: 36.52%, prompt: Solve multiple choice problems about: Which of 'A', 'B', 'C', or 'D' is the correct answer for '{}'.
Language: korean, acc: 19.68%, prompt: Choose the correct answer to the multiple-choice question below. Is '{}' an 'A', 'B', 'C', or 'D'.
Language: korean, acc: 23.40%, prompt: Solve the following multiple-choice problem. Choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'.
Language: korean, acc: 26.42%, prompt: Choose the correct answer to the following multiple choice problem: Is '{}' 'A', 'B', 'C', or 'D'.
Language: korean, acc: 31.74%, prompt: Solve multiple-choice problems about: Please select 'A', 'B', 'C', or 'D' for the correct answer to '{}'.
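
Because the translation results are listed one prompt per line, a per-language summary has to be computed from the log. The following sketch parses lines of the form shown above and averages the accuracy per language. The function name `mean_acc_by_language` is hypothetical, and the unweighted mean is just one possible summary, not one the log itself reports.

```python
import re
from collections import defaultdict

# Matches lines such as:
# Language: korean, acc: 18.09%, prompt: For the multiple choice problem ...
LANG_LINE = re.compile(r"^Language: (\w+), acc: ([\d.]+)%, prompt: (.*)$")


def mean_acc_by_language(lines):
    """Return {language: mean accuracy in percent} for matching lines."""
    buckets = defaultdict(list)
    for line in lines:
        m = LANG_LINE.match(line.strip())
        if m:  # headings, blank lines and other entries are skipped
            buckets[m.group(1)].append(float(m.group(2)))
    return {lang: sum(vals) / len(vals) for lang, vals in buckets.items()}


if __name__ == "__main__":
    sample = [
        "Language: korean, acc: 18.09%, prompt: For the multiple choice problem about, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'.",
        "Language: korean, acc: 28.37%, prompt: Choose the correct answer for '{}' from 'A', 'B', 'C', or 'D' in the multiple choice problem involving,",
    ]
    for lang, mean in mean_acc_by_language(sample).items():
        print(f"{lang}: {mean:.2f}%")
```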

# squad_v2

## 10 prompts

## bertattack

## checklist

## deepwordbug

## stresstest

## textbugger

## textfooler

# un_multi

## 10 prompts

## bertattack

## checklist

## deepwordbug

## stresstest

## textbugger

## textfooler

# iwslt

## 10 prompts

## bertattack

## checklist

## deepwordbug

## stresstest

## textbugger

## textfooler

# math

## 10 prompts

## bertattack

## checklist

## deepwordbug

## stresstest

## textbugger

## textfooler