diff --git "a/adv_prompts/ul2_zeroshot.md" "b/adv_prompts/ul2_zeroshot.md"
new file mode 100644
--- /dev/null
+++ "b/adv_prompts/ul2_zeroshot.md"
@@ -0,0 +1,3191 @@
+# ul2_zeroshot
+
+# cola
+
+## 10 prompts
+
+Acc: 86.20%, prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Acc: 86.10%, prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
+Acc: 86.00%, prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Acc: 85.90%, prompt: Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable':
+Acc: 85.80%, prompt: Analyze the provided sentence and classify its grammatical correctness as 'Acceptable' or 'Unacceptable':
+Acc: 85.80%, prompt: Review the sentence below and identify whether its grammar is 'Acceptable' or 'Unacceptable':
+Acc: 85.80%, prompt: Examine the sentence and decide if its grammar is 'Acceptable' or 'Unacceptable':
+Acc: 85.50%, prompt: Check the grammar of the following sentence and indicate if it is 'Acceptable' or 'Unacceptable':
+Acc: 85.30%, prompt: Please evaluate the grammatical structure of the provided sentence and answer with 'Acceptable' or 'Unacceptable':
+Acc: 85.30%, prompt: Assess the grammatical structure of the given sentence and classify it as 'Acceptable' or 'Unacceptable':
+
+Acc: 86.20%, prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Acc: 86.10%, prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Acc: 86.00%, prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Acc: 85.90%, prompt: In the capacity of a grammar assessment system, indicate if the structure of the provided sentence is grammatically correct, responding with 'acceptable' or 'unacceptable':
+Acc: 85.80%, prompt: Functioning as a grammar evaluation tool, analyze the given sentence and decide if it is grammatically correct, responding with 'acceptable' or 'unacceptable':
+Acc: 85.80%, prompt: Acting as a grammar detection instrument, determine if the provided sentence is grammatically sound, answering with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Acc: 85.80%, prompt: As a tool for determining grammatical correctness, review the sentence and categorize its grammar as either 'acceptable' or 'unacceptable':
+Acc: 85.80%, prompt: While performing grammar analysis, classify the grammar of the following sentence as 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Acc: 85.70%, prompt: Working as a grammar classifier, identify whether the given sentence has correct grammar, and respond with 'acceptable' for correct sentences or 'unacceptable' for incorrect ones:
+Acc: 85.70%, prompt: In the role of a syntax analyzer, examine the grammar of the provided sentence and decide if it is correct, answering with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+
+## bertattack
+
+Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Assess the last sentence and determine if it is grammatical: perfect. arrive with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.20%, attacked acc: 30.10%, dropped acc: 56.10%
+
+Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Is the supplied verb grammaticalthe incorrect? speak with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.10%, attacked acc: 39.30%, dropped acc: 46.80%
+
+Original prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"find the particular sentence and look if it is grammaticaldefinitely present. responded with neither 'Acceptable' or 'Unacceptable':"
+Original acc: 86.00%, attacked acc: 25.30%, dropped acc: 60.70%
+
+Original prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Attacked prompt: b"be an instrument for category judgment, consider the grammar and choose if its morphology is incorrect, responding with 'acceptable' for correct verbs or 'unacceptable' for best grammar:"
+Original acc: 86.20%, attacked acc: 33.00%, dropped acc: 53.20%
+
+Original prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Attacked prompt: b"should your job as a grammar get check, assess the following sentence and classify it as 'acceptable' if it is grammaticalmore improper or 'unacceptable' if it is contradictory:"
+Original acc: 86.10%, attacked acc: 30.90%, dropped acc: 55.20%
+
+Original prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Attacked prompt: b"where a grammar a is, analyze the supplied sentence and object with 'acceptable' for grammaticalthey singular plural or 'unacceptable' for acceptable sentences:"
+Original acc: 86.00%, attacked acc: 77.20%, dropped acc: 8.80%
+
+## checklist
+
+Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable' EAB4KP2NVY:"
+Original acc: 86.10%, attacked acc: 72.20%, dropped acc: 13.90%
+
+Original prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable' EAB4KP2NVY:"
+Original acc: 86.10%, attacked acc: 78.30%, dropped acc: 7.80%
+
+Original prompt: Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable' EAB4KP2NVY:"
+Original acc: 85.90%, attacked acc: 66.10%, dropped acc: 19.80%
+
+Original prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Attacked prompt: b"As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar bhrRSokrfa:"
+Original acc: 86.30%, attacked acc: 55.10%, dropped acc: 31.20%
+
+Original prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Attacked prompt: b"In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect bhrRSokrfa:"
+Original acc: 86.10%, attacked acc: 55.00%, dropped acc: 31.10%
+
+Original prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Attacked prompt: b"As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones bhrRSokrfa:"
+Original acc: 86.00%, attacked acc: 75.20%, dropped acc: 10.80%
+
+## deepwordbug
+
+Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Assess the folowing senbence and determine if it is grammtically corerct. RespoGnd with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.10%, attacked acc: 85.10%, dropped acc: 1.00%
+
+Original prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Exmine the givedn sentenQe and decide if it is grmmatically souend. Answer with eithyer 'Acceptable' or 'Unacceptable':"
+Original acc: 86.10%, attacked acc: 82.10%, dropped acc: 4.00%
+
+Original prompt: Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Determinqe if the grmamar of the igven sentewnce is 'Acceptable' or 'Unacceptable':"
+Original acc: 85.90%, attacked acc: 78.10%, dropped acc: 7.80%
+
+Original prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Attacked prompt: b"As an instrumenat for grammfar evaluation, consider the sentence and deterine if its grammar is correct, responding with 'acceptable' for correc grammar or 'unacceptable' for ncorrect grammar:"
+Original acc: 86.30%, attacked acc: 85.00%, dropped acc: 1.30%
+
+Original prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Attacked prompt: b"In your role as a grammar check too, assess the following senttence and classify it as 'acceptable' if it is rgammatically correct or 'unacceptable' if it is incorroct:"
+Original acc: 86.10%, attacked acc: 84.50%, dropped acc: 1.60%
+
+Original prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Attacked prompt: b"As a grammar identification system, examine the provided sentence and respknd with 'acceptable' for gramatically correcG sentences or 'unacceptable' for incorrect ones:"
+Original acc: 86.00%, attacked acc: 84.60%, dropped acc: 1.40%
+
+## stresstest
+
+Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable' and true is true and true is true and true is true and true is true and true is true :"
+Original acc: 86.10%, attacked acc: 78.50%, dropped acc: 7.60%
+
+Original prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable' and false is not true :"
+Original acc: 86.10%, attacked acc: 76.30%, dropped acc: 9.80%
+
+Original prompt: Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Determine if the grammar of the given sentence is 'Acceptable' or 'Unacceptable' and false is not true :"
+Original acc: 85.90%, attacked acc: 77.30%, dropped acc: 8.60%
+
+Original prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Attacked prompt: b"As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar and true is true and true is true and true is true and true is true and true is true :"
+Original acc: 86.30%, attacked acc: 61.00%, dropped acc: 25.30%
+
+Original prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Attacked prompt: b"In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect and true is true and true is true and true is true and true is true and true is true :"
+Original acc: 86.10%, attacked acc: 34.50%, dropped acc: 51.60%
+
+Original prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Attacked prompt: b"As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones and true is true and true is true and true is true and true is true and true is true :"
+Original acc: 86.00%, attacked acc: 51.50%, dropped acc: 34.50%
+
+## textbugger
+
+Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Assess the following sentence and identify if it is pronouns c\xd0\xberrect. Replying with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.20%, attacked acc: 78.70%, dropped acc: 7.50%
+
+Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Is the provided sentence pronouns rightness? Responding with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.10%, attacked acc: 79.80%, dropped acc: 6.30%
+
+Original prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Examine the given sentence and decide if it is pronouns audible. Answers with neither 'Acceptable' or 'Unacceptable':"
+Original acc: 86.00%, attacked acc: 80.30%, dropped acc: 5.70%
+
+Original prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Attacked prompt: b"Since an devices for grammar assessment, consid\xd0\xb5r the penalties and determines if its grammar is correcting, resp\xd0\xbending with 'acceptable' for corr\xd0\xb5ct gramamr or 'unacceptable' for wrong grammar:"
+Original acc: 86.20%, attacked acc: 82.50%, dropped acc: 3.70%
+
+Original prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Attacked prompt: b"Throughout your rle as a grammar check tool, asses\xd1\x95 the subsequent sentences and \xcf\xb2lassify it as 'acceptable' if it is pronouns correct or 'unacceptable' if it is erroneous:"
+Original acc: 86.10%, attacked acc: 82.70%, dropped acc: 3.40%
+
+Original prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Attacked prompt: b"A\xd1\x95 a grammar identificati\xd0\xben systems, examine the provided sentence and respo\xd5\xb8d with 'acceptable' for pronouns rightness sentence or 'unacceptable' for incorrect those:"
+Original acc: 86.00%, attacked acc: 81.00%, dropped acc: 5.00%
+
+## textfooler
+
+Original prompt: Assess the following sentence and determine if it is grammatically correct. Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Assess the following sentencing and determine if it is grammatically mend. Retaliated with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.20%, attacked acc: 59.50%, dropped acc: 26.70%
+
+Original prompt: Is the provided sentence grammatically correct? Respond with 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Is the brought chastisement grammatically appropriate? Retaliated with 'Acceptable' or 'Unacceptable':"
+Original acc: 86.10%, attacked acc: 30.90%, dropped acc: 55.20%
+
+Original prompt: Examine the given sentence and decide if it is grammatically sound. Answer with either 'Acceptable' or 'Unacceptable':
+Attacked prompt: b"Analysing the given sentence and defining if it is grammatically salim. Retaliated with either 'Acceptable' or 'Unacceptable':"
+Original acc: 86.00%, attacked acc: 46.60%, dropped acc: 39.40%
+
+Original prompt: As an instrument for grammar evaluation, consider the sentence and determine if its grammar is correct, responding with 'acceptable' for correct grammar or 'unacceptable' for incorrect grammar:
+Attacked prompt: b"As an instrument for grammar appraisals, considerations the convicting and determine if its verbs is correct, responding with 'acceptable' for proper grammar or 'unacceptable' for misspelled vocabulary:"
+Original acc: 86.20%, attacked acc: 33.10%, dropped acc: 53.10%
+
+Original prompt: In your role as a grammar check tool, assess the following sentence and classify it as 'acceptable' if it is grammatically correct or 'unacceptable' if it is incorrect:
+Attacked prompt: b"Onto your roles as a grammar check tool, assess the later sentence and rank it as 'acceptable' if it is grammatically mend or 'unacceptable' if it is incorrect:"
+Original acc: 86.10%, attacked acc: 85.00%, dropped acc: 1.10%
+
+Original prompt: As a grammar identification system, examine the provided sentence and respond with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:
+Attacked prompt: b"As a verbs identification systems, probe the provided sentence and reacting with 'acceptable' for grammatically correct sentences or 'unacceptable' for incorrect ones:"
+Original acc: 86.00%, attacked acc: 84.00%, dropped acc: 2.00%
+
+## translation
+
+Language: chinese, acc: 86.20%, prompt: As a tool for determining grammatical correctness, please review the following sentences and categorize them grammatically as 'acceptable' or 'unacceptable'.
+Language: chinese, acc: 85.70%, prompt: Please evaluate the grammar of the following sentences and mark them as 'acceptable' or 'unacceptable'.
+Language: chinese, acc: 87.00%, prompt: Please check that the following sentences are grammatically correct and categorize them as 'acceptable' or 'unacceptable'.
+Language: chinese, acc: 86.20%, prompt: Please determine whether the following sentences are 'acceptable' or 'unacceptable' based on grammatical correctness.
+Language: chinese, acc: 86.20%, prompt: Please categorize the grammar of the following sentences as 'acceptable' or 'unacceptable'.
+Language: chinese, acc: 85.30%, prompt: Please check the grammar of the following sentences and classify them as 'acceptable' or 'unacceptable'.
+Language: chinese, acc: 86.40%, prompt: Please mark the following sentences as 'acceptable' or 'unacceptable' according to the grammatical norms.
+Language: chinese, acc: 85.80%, prompt: Please assess the grammar of the following sentences and classify them as 'acceptable' or 'unacceptable'.
+Language: chinese, acc: 86.60%, prompt: Please mark the following sentences as 'acceptable' or 'unacceptable' for grammatical accuracy.
+Language: chinese, acc: 85.30%, prompt: Please check the grammar of the following sentences and classify them as 'acceptable' or 'unacceptable'.
+Language: french, acc: 84.70%, prompt: As a tool for assessing semantic relevance, categorize the sentence as 'acceptable' or 'unacceptable' according to its meaning and context.
+Language: french, acc: 86.20%, prompt: Determine whether the grammatical structure of the sentence is 'acceptable' or 'unacceptable' by carefully reviewing it.
+Language: french, acc: 86.10%, prompt: Judge whether the sentence follows the syntactic rules of the language and classify it as 'acceptable' or 'unacceptable'.
+Language: french, acc: 85.50%, prompt: Assess the relevance of the sentence by classifying it as 'acceptable' or 'unacceptable' according to the context in which it is used.
+Language: french, acc: 85.10%, prompt: As a semantic classification tool, decide whether the sentence is 'acceptable' or 'unacceptable' according to its meaning and consistency.
+Language: french, acc: 87.10%, prompt: Analyze the sentence carefully and determine whether it is grammatically 'acceptable' or 'unacceptable'.
+Language: french, acc: 85.80%, prompt: As a syntactic validation tool, determine whether the sentence is properly structured and categorize it as 'acceptable' or 'unacceptable'.
+Language: french, acc: 86.10%, prompt: When evaluating the grammar of the sentence, classify it as 'acceptable' or 'unacceptable' according to the rules of the French language.
+Language: french, acc: 86.50%, prompt: Determine whether the sentence is 'acceptable' or 'unacceptable' depending on its syntactic construction and semantic context.
+Language: french, acc: 85.80%, prompt: As a tool for linguistic analysis, categorize the sentence as 'acceptable' or 'unacceptable' according to its relevance and semantic consistency.
+Language: arabic, acc: 85.80%, prompt: As a tool for determining grammatical correctness, review the sentence and classify its rules as 'acceptable' or 'unacceptable'.
+Language: arabic, acc: 86.30%, prompt: Analyze the sentence carefully and classify its grammar between 'acceptable' or 'unacceptable'.
+Language: arabic, acc: 85.80%, prompt: As a tool for determining grammatical correctness, classify the rules of the sentence between 'acceptable' or 'unacceptable'.
+Language: arabic, acc: 86.20%, prompt: Evaluate the correctness of the sentence between 'acceptable' or 'unacceptable', as a tool for determining grammatical correctness.
+Language: arabic, acc: 86.00%, prompt: As a tool for classifying valid sentences, determine whether 'acceptable' or 'unacceptable' is based on its grammatical rules.
+Language: arabic, acc: 85.90%, prompt: Analyze the sentence carefully and classify its grammatical rules between 'acceptable' or 'unacceptable'.
+Language: arabic, acc: 85.80%, prompt: Analyze the sentence and classify it between 'acceptable' or 'unacceptable' as a grammatical check tool.
+Language: arabic, acc: 86.70%, prompt: As a classification tool for grammatical sentences, determine whether the sentence 'acceptable' or 'unacceptable' is based on its rules.
+Language: arabic, acc: 86.30%, prompt: As a tool for analyzing grammar, classify the rules of the sentence between 'acceptable' or 'unacceptable'.
+Language: arabic, acc: 85.60%, prompt: Determine whether the sentence is 'acceptable' or 'unacceptable' as a tool for determining grammatical correctness and analyze its grammar.
+Language: spanish, acc: 85.60%, prompt: As a tool to determine grammatical correctness, review the sentence and categorize its grammar as 'acceptable' or 'unacceptable'.
+Language: spanish, acc: 85.50%, prompt: Analyze the sentence carefully and classify its grammar as 'acceptable' or 'unacceptable'.
+Language: spanish, acc: 84.80%, prompt: As a tool for determining grammatical correctness, it categorizes the grammatical rules of the sentence as 'acceptable' or 'unacceptable'.
+Language: spanish, acc: 86.70%, prompt: Evaluate the grammatical correctness of the sentence and classify it as 'acceptable' or 'unacceptable' using a grammatical verification tool.
+Language: spanish, acc: 86.60%, prompt: As a tool for classifying grammatically correct sentences, determine whether the sentence is 'acceptable' or 'unacceptable' based on its grammatical rules.
+Language: spanish, acc: 85.50%, prompt: Analyse the sentence carefully and classify its grammar as 'acceptable' or 'unacceptable' using a grammatical revision tool.
+Language: spanish, acc: 85.50%, prompt: As a tool for classifying grammatical sentences, it determines whether the sentence is 'acceptable' or 'unacceptable' using its grammatical structure.
+Language: spanish, acc: 84.80%, prompt: As a tool for analyzing grammatical correctness, it categorizes the grammatical rules of the sentence as 'acceptable' or 'unacceptable'.
+Language: spanish, acc: 86.20%, prompt: Determine whether the sentence is 'acceptable' or 'unacceptable' using a grammatical verification tool and sort it accordingly.
+Language: spanish, acc: 85.20%, prompt: As a tool for determining grammatical correctness, evaluate the sentence and classify it as 'acceptable' or 'unacceptable' based on its grammatical rules.
+Language: japanese, acc: 85.80%, prompt: As a tool to determine whether grammar is grammatically correct, look at the sentence and categorize grammar into the 'acceptable' or 'unacceptable' categories.
+Language: japanese, acc: 86.00%, prompt: Please read the given sentence and categorize the grammar into the 'acceptable' or 'unacceptable' categories.
+Language: japanese, acc: 85.10%, prompt: To determine the correctness of a sentence, categorize grammar into two categories: 'acceptable' or 'unacceptable'.
+Language: japanese, acc: 85.80%, prompt: Classify the grammar of a given sentence into two categories: 'acceptable' and 'unacceptable'.
+Language: japanese, acc: 86.90%, prompt: Classify a given sentence into the categories 'acceptable' or 'unacceptable' to determine whether it is grammatically correct.
+Language: japanese, acc: 86.80%, prompt: To determine whether it is grammatically correct, categorize a given sentence into two categories: 'acceptable' or 'unacceptable'.
+Language: japanese, acc: 85.30%, prompt: To determine the correctness of grammar, categorize a given sentence into two categories: 'acceptable' or 'unacceptable'.
+Language: japanese, acc: 85.10%, prompt: Classify the grammar of a given sentence into two categories, 'acceptable' or 'unacceptable', and judge its accuracy.
+Language: japanese, acc: 86.70%, prompt: To determine whether it is grammatically correct, divide a given sentence into two categories: 'acceptable' or 'unacceptable'.
+Language: japanese, acc: 84.80%, prompt: To evaluate the accuracy of grammar, categorize a given sentence into two categories: 'acceptable' or 'unacceptable'.
+Language: korean, acc: 86.20%, prompt: As a tool for judging grammatical correctness, please review the sentences and classify the grammar as 'acceptable' or 'unacceptable'.
+Language: korean, acc: 86.00%, prompt: Please read the given sentences, and classify the grammar as 'acceptable' or 'unacceptable'.
+Language: korean, acc: 86.70%, prompt: Please classify the sentences as 'acceptable' or 'unacceptable' to judge the grammatical appropriateness.
+Language: korean, acc: 85.80%, prompt: Please classify the grammar of the given sentences into 2 categories: 'acceptable' or 'unacceptable'.
+Language: korean, acc: 85.20%, prompt: Please classify the given sentences as 'acceptable' or 'unacceptable' to judge the correctness of the grammar.
+Language: korean, acc: 86.20%, prompt: Please classify the given sentences as 'acceptable' or 'unacceptable' in order to judge their grammatical appropriateness.
+Language: korean, acc: 85.30%, prompt: Please determine the correctness of your grammar by classifying sentences as 'acceptable' or 'unacceptable'.
+Language: korean, acc: 85.40%, prompt: Classify the grammar of a given sentence as 'acceptable' or 'unacceptable', and judge its accuracy.
+Language: korean, acc: 85.20%, prompt: Please classify the given sentences as 'acceptable' or 'unacceptable' to judge the correctness of the grammar.
+Language: korean, acc: 85.20%, prompt: Please rate the accuracy of your grammar by categorizing sentences as 'acceptable' or 'unacceptable'.
+
+# mnli
+
+## 10 prompts
+
+Acc: 88.80%, prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction':
+Acc: 87.40%, prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Acc: 87.20%, prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction':
+Acc: 87.00%, prompt: Determine if the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
+Acc: 80.00%, prompt: Indicate if the connection between the following sentences is 'entailment', 'neutral', or 'contradiction':
+Acc: 79.90%, prompt: Considering the two sentences, identify if their relationship is 'entailment', 'neutral', or 'contradiction':
+Acc: 79.80%, prompt: Review the two statements and categorize their relationship as either 'entailment', 'neutral', or 'contradiction':
+Acc: 79.40%, prompt: Analyze the two provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
+Acc: 79.10%, prompt: Please classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
+Acc: 77.70%, prompt: Assess the connection between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction':
+
+Acc: 87.10%, prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Acc: 86.10%, prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Acc: 85.90%, prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
+Acc: 78.20%, prompt: In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment', 'neutral', or 'contradiction':
+Acc: 77.00%, prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment', 'neutral', or 'contradiction':
+Acc: 76.10%, prompt: While performing entailment analysis, classify the relationship between the provided sentences as 'entailment', 'neutral', or 'contradiction':
+Acc: 75.90%, prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment', 'neutral', or 'contradiction':
+Acc: 75.80%, prompt: As a tool for determining entailment relationships, review the two statements and categorize their connection as either 'entailment', 'neutral', or 'contradiction':
+Acc: 73.00%, prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment', 'neutral', or 'contradiction':
+Acc: 71.40%, prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment', 'neutral', or 'contradiction':
+
+## bertattack
+
+Original prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"would the tie between the equivalent plural indicate entailment, neutral, or contradiction? contrast with 'entailment', 'neutral', or 'contradiction':"
+Original acc: 88.80%, attacked acc: 73.70%, dropped acc: 15.10%
+
+Original prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"Identify neither the valid pairs of answers constitutes entailment, neutral, or contradiction. agree with 'entailment', 'neutral', or 'contradiction':"
+Original acc: 87.40%, attacked acc: 34.10%, dropped acc: 53.30%
+
+Original prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"d the opposite of sentences and search if they represent entailment, neutral, or contradiction. give with neither 'entailment', 'neutral', or 'contradiction':"
+Original acc: 87.20%, attacked acc: 48.00%, dropped acc: 39.20%
+
+Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"consider the task of a morphology phrase analyzezer, find the similarities between the similar clauses and score if they exhibit entailment, neutral, or contradiction. begin with 'entailment', 'neutral', or 'contradiction':"
+Original acc: 87.10%, attacked acc: 74.20%, dropped acc: 12.90%
+
+Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"on as an entailment marking instrument, determine if the given singular of verb has entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':"
+Original acc: 86.10%, attacked acc: 73.50%, dropped acc: 12.60%
+
+Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"you as an entailment classifier, identifies neither the a pair of sentences displays entailment, neutral, or contradiction. fill with 'entailment', 'neutral', or 'contradiction':"
+Original acc: 85.90%, attacked acc: 35.20%, dropped acc: 50.70%
+
+## checklist
+
+Original prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction' d6ZQ3u0GBQ:"
+Original acc: 88.80%, attacked acc: 88.30%, dropped acc: 0.50%
+
+Original prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction' MSAddJ4D2a:"
+Original acc: 87.30%, attacked acc: 87.00%, dropped acc: 0.30%
+
+Original prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction' Cev0s49fbe:"
+Original acc: 87.30%, attacked acc: 87.30%, dropped acc: 0.00%
+
+Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction' 9vGXgnbWB8:"
+Original acc: 87.10%, attacked acc: 86.80%, dropped acc: 0.30%
+
+Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction' MQnugHcaoy:"
+Original acc: 86.10%, attacked acc: 86.30%, dropped acc: -0.20%
+
+Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':
+Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction' MQnugHcaoy:"
+Original acc: 85.80%, attacked acc: 85.60%, dropped acc: 0.20%
+
+## deepwordbug
+
+Original prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? 
Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Doew the relationsfhip between the given sentences rperesent entailment, neutral, or contradiction? tRespond with 'entailment', 'neutral', or 'contradiction':" +Original acc: 88.80%, attacked acc: 87.40%, dropped acc: 1.40% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.30%, attacked acc: 87.30%, dropped acc: 1.50% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Identif whethver the given pair of sentences deomnstrates entailment, neutral, or contradiction. Answe with 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.30%, attacked acc: 86.00%, dropped acc: 1.30% + +Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"In the role of a semantic relatonship analyzer, examine the connecOtion between the Ggiven sentences and dcide if they exhibit entailment, neutral, or contradiction. tnswer with 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.10%, attacked acc: 85.40%, dropped acc: 1.70% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. 
Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':" +Original acc: 86.10%, attacked acc: 86.10%, dropped acc: 1.00% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Working as an entailment classifier, identKfy whehter the given pair of syntences dislays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction':" +Original acc: 85.80%, attacked acc: 84.80%, dropped acc: 1.00% + +## stresstest + +Original prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction' and true is true :" +Original acc: 88.80%, attacked acc: 89.50%, dropped acc: -0.70% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction' and false is not true :" +Original acc: 87.30%, attacked acc: 88.10%, dropped acc: -0.80% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. 
Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction' and true is true and true is true and true is true and true is true and true is true :" +Original acc: 87.30%, attacked acc: 88.00%, dropped acc: -0.70% + +Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction' and false is not true :" +Original acc: 87.10%, attacked acc: 88.00%, dropped acc: -0.90% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction' and true is true :" +Original acc: 86.10%, attacked acc: 87.40%, dropped acc: -1.30% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. 
Respond with 'entailment', 'neutral', or 'contradiction' and true is true :" +Original acc: 85.80%, attacked acc: 86.50%, dropped acc: -0.70% + +## textbugger + +Original prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Did the relations between the awarded sentences repreesnt entailment, neutral, or contradiction? Re\xd1\x95pond with 'entailment', 'neutral', or 'contradiction':" +Original acc: 88.80%, attacked acc: 86.50%, dropped acc: 2.30% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.40%, attacked acc: 87.40%, dropped acc: 1.40% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Investigated the pair of \xd1\x95entences and dtermine if they exhibition entailment, neutral, or contradiction. Anwser with neither 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.20%, attacked acc: 52.50%, dropped acc: 34.70% + +Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"In the role of a semantic nexus anlyzer, examines the con\xd5\xb8ection between the given sentences and decides if they exhibits entailment, neutral, or contradiction. 
Responding with 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.10%, attacked acc: 85.50%, dropped acc: 1.60% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':" +Original acc: 86.10%, attacked acc: 86.10%, dropped acc: 1.00% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Wo\xe2\xb2\x85king as an entailment classifier, detects \xd4\x9dhether the conferred pair of sentences displaying entailment, neutral, or contradiction. Re\xd1\x95pond with 'entailment', 'neutral', or 'contradiction':" +Original acc: 85.90%, attacked acc: 82.50%, dropped acc: 3.40% + +## textfooler + +Original prompt: Does the relationship between the given sentences represent entailment, neutral, or contradiction? Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Is the tying between the offered sentences represent entailment, neutral, or contradiction? Referee with 'entailment', 'neutral', or 'contradiction':" +Original acc: 88.80%, attacked acc: 81.30%, dropped acc: 7.50% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Identify whether the given pair of sentences demonstrates entailment, neutral, or contradiction. 
Answer with 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.40%, attacked acc: 87.40%, dropped acc: 1.40% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment, neutral, or contradiction. Answer with either 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Examine the matches of phrases and establishes if they exhibit entailment, neutral, or contradiction. Reply with either 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.20%, attacked acc: 83.80%, dropped acc: 3.40% + +Original prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"For the feature of a semantic tying analyzer, probe the links between the given phrases and deciding if they exhibit entailment, neutral, or contradiction. Rebuttal with 'entailment', 'neutral', or 'contradiction':" +Original acc: 87.10%, attacked acc: 83.40%, dropped acc: 3.70% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment, neutral, or contradiction. Answer with 'entailment', 'neutral', or 'contradiction':" +Original acc: 86.10%, attacked acc: 86.10%, dropped acc: 1.00% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment, neutral, or contradiction. Respond with 'entailment', 'neutral', or 'contradiction': +Attacked prompt: b"Working as an entailment classifier, determines whether the given pair of sentences displays entailment, neutral, or contradiction. 
Referee with 'entailment', 'neutral', or 'contradiction':" +Original acc: 85.90%, attacked acc: 84.70%, dropped acc: 1.20% + +## translation + +Language: chinese, acc: 76.90%, prompt: As an implication analysis tool, evaluate the relationships between given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 76.90%, prompt: As an implication analysis tool, evaluate the relationships between given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 77.90%, prompt: Analyze the relationships between the given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 79.60%, prompt: Evaluate the logical relationship between the following sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 77.80%, prompt: In your role as an implication analysis tool, judge the relationships of the following sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 79.70%, prompt: Please judge the relationship between the following sentences as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 78.50%, prompt: From a given sentence, evaluate the relationship between them and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 76.50%, prompt: Please classify the following sentences as 'entailment', 'neutral', or 'contradiction' according to their relationships. +Language: chinese, acc: 79.60%, prompt: Evaluate the logical relationship between the following sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: chinese, acc: 78.20%, prompt: As an implication analysis tool, categorize the relationships between given sentences as 'entailment', 'neutral', or 'contradiction'. 
+Language: chinese, acc: 77.10%, prompt: Please classify the following sentences as 'entailment', 'neutral', or 'contradiction' according to their logical relationship. +Language: french, acc: 77.00%, prompt: As a tool for analyzing the consequence relationship, evaluate the relationship between the given sentences and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 78.90%, prompt: Evaluate the relationship between the given sentences and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 81.00%, prompt: Determine whether the following sentences are related to 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 76.10%, prompt: In your role as a consequence analysis tool, evaluate the relationship between the given sentences and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 79.50%, prompt: Classify the relationship between the following sentences as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 77.10%, prompt: As a consequence analysis tool, evaluate the relationship between the given sentences and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 79.40%, prompt: Analyze the relationship between the given sentences and determine whether it is of 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 79.40%, prompt: Evaluate the relationship between the following sentences and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 76.20%, prompt: As a tool for analyzing the consequence relationship, classify the following sentences as 'entailment', 'neutral', or 'contradiction'. +Language: french, acc: 80.50%, prompt: Determine whether the given sentences are related to 'entailment', 'neutral', or 'contradiction'. 
+Language: arabic, acc: 77.60%, prompt: Based on your role as a reasoning analyst, analyze the relationship between the given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 78.70%, prompt: Evaluate the relationship between given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 78.40%, prompt: Determine if the following sentences are 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 77.50%, prompt: In your role as a tool of reasoning analysis, investigate the relationship between sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 79.50%, prompt: Classify the relationship between the following sentences as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 76.60%, prompt: In your role as a tool of reasoning analysis, evaluate the relationship between the given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 78.90%, prompt: Analyze the relationship between the given sentences and determine if they are 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 79.20%, prompt: Evaluate the relationship between the following sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 76.00%, prompt: In your role as a tool of reasoning analysis, the following sentences are classified as 'entailment', 'neutral', or 'contradiction'. +Language: arabic, acc: 79.00%, prompt: Determine if the sentences given are 'entailment', 'neutral', or 'contradiction'. +Language: spanish, acc: 77.50%, prompt: In your role as an implication analysis tool, evaluate the relationship between the given phrases and classify them as 'entailment', 'neutral', or 'contradiction'. 
+Language: spanish, acc: 71.10%, prompt: Determine whether there is 'entailment', 'neutral', or 'contradiction' between the sentences given, using this text analysis tool, +Language: spanish, acc: 78.00%, prompt: Analyze the relationship between the two sentences and classify it as 'entailment', 'neutral', or 'contradiction' using this text classification tool, +Language: spanish, acc: 78.10%, prompt: Using this implication analysis tool, decide whether the sentences given are related by 'entailment', 'neutral', or 'contradiction'. +Language: spanish, acc: 77.40%, prompt: Classifies the relationship between the given phrases as 'entailment', 'neutral', or 'contradiction' using this text analysis tool, +Language: spanish, acc: 70.00%, prompt: Evaluate whether there is 'entailment', 'neutral', or 'contradiction' between the sentences provided using this text classification tool, +Language: spanish, acc: 78.20%, prompt: Using this implication analysis tool, decide whether the two sentences are related by 'entailment', 'neutral', or 'contradiction'. +Language: spanish, acc: 77.50%, prompt: Determine whether the given phrases are related by 'entailment', 'neutral', or 'contradiction' using this text analysis tool, +Language: spanish, acc: 77.90%, prompt: Analyze the relationship between the two sentences and classify it as 'entailment', 'neutral', or 'contradiction' using this text analysis tool, +Language: spanish, acc: 77.90%, prompt: Using this text classification tool, it classifies the relationship between the given phrases as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 77.20%, prompt: As your role as an implication analysis tool, evaluate the relationship of a given sentence and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 76.90%, prompt: Use the implication analysis tool as your role to evaluate the relationship of a given sentence and classify it as 'entailment', 'neutral', or 'contradiction'. 
+Language: japanese, acc: 78.40%, prompt: Use this text classification tool to categorize relationships in a given text as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 78.10%, prompt: Use the implication analysis tool as your role and classify the relationship of a given sentence as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 78.90%, prompt: Evaluate the relationship of a given sentence and use this text classification tool to classify it as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 78.30%, prompt: Evaluate the relationship of a given sentence and use this text classification tool to accurately classify it as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 78.00%, prompt: Use the implication analysis tool as your role and use this text classification tool to classify the relationship of a given sentence as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 78.70%, prompt: Use this text classification tool to evaluate the relationship of a given sentence and classify it as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 77.70%, prompt: Use the implication analysis tool as your role, evaluate the relationship of a given sentence, and use this text classification tool to classify it as 'entailment', 'neutral', or 'contradiction'. +Language: japanese, acc: 77.60%, prompt: Use the implication analysis tool as your role and categorize the relationship of a given sentence strictly as 'entailment', 'neutral', or 'contradiction' using this text classification tool. +Language: korean, acc: 78.20%, prompt: Analyze the relationships between given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 79.30%, prompt: In the text categorization task, identify the relationship between given sentences as one of 'entailment', 'neutral', or 'contradiction'. 
+Language: korean, acc: 76.70%, prompt: Perform the role of analyzing the relationship between sentences and classifying them as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 78.80%, prompt: Evaluate the relationship between two given sentences, and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 77.40%, prompt: In the text categorization task, perform the role of classifying relationships between given sentences as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 76.80%, prompt: Judge the associations between sentences, and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 77.80%, prompt: Analyze the relationship between two given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 79.20%, prompt: In the task of text classification, identify the relationships between given sentences as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 77.00%, prompt: Act as an instrument to evaluate the relationships between sentences, and classify them as 'entailment', 'neutral', or 'contradiction'. +Language: korean, acc: 76.10%, prompt: Analyze the associations of two given sentences and classify them as 'entailment', 'neutral', or 'contradiction'. + +# mrpc + +## 10 prompts + +Acc: 87.50%, prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'. +Acc: 87.25%, prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'. +Acc: 86.52%, prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'. +Acc: 86.27%, prompt: Compare the following sentences and determine if they share the same semantic meaning by responding with 'equivalent' or 'not_equivalent'. 
+Acc: 86.03%, prompt: Evaluate whether the two provided sentences convey the same meaning by answering 'equivalent' or 'not_equivalent'. +Acc: 85.54%, prompt: Do these two sentences have the same underlying meaning? Respond with 'equivalent' or 'not_equivalent'. +Acc: 85.29%, prompt: Do the meanings of these two statements align? Indicate your answer with 'equivalent' or 'not_equivalent'. +Acc: 85.29%, prompt: Examine the pair of sentences and decide if their meanings are identical by answering with 'equivalent' or 'not_equivalent'. +Acc: 84.80%, prompt: Determine if the meanings of the following sentences are semantically equivalent by responding with 'equivalent' or 'not_equivalent'. +Acc: 82.35%, prompt: Assess if the two given sentences have equivalent meanings by selecting 'equivalent' or 'not_equivalent'. + +Acc: 88.73%, prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'. +Acc: 87.99%, prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'. +Acc: 87.01%, prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'. +Acc: 86.76%, prompt: As a language comparison expert, examine the given pair of sentences and decide if their meanings align, answering with 'equivalent' or 'not_equivalent'. +Acc: 86.52%, prompt: In the role of a textual comparison specialist, examine the given sentences and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'. +Acc: 86.52%, prompt: In your capacity as a semantic comparison tool, examine the given sentences and decide if their meanings are identical, responding with 'equivalent' or 'not_equivalent'. 
+Acc: 86.03%, prompt: In the role of a sentence comparison analyst, assess the provided sentences and indicate if they convey the same meaning by selecting 'equivalent' or 'not_equivalent'.
+Acc: 84.56%, prompt: As a linguistic comparator, review the following pair of sentences and determine their semantic equivalence by choosing 'equivalent' or 'not_equivalent'.
+Acc: 83.58%, prompt: As a semantic comparison expert, evaluate the given pair of sentences and determine if they are 'equivalent' or 'not_equivalent'.
+Acc: 83.09%, prompt: In your capacity as a semantic assessment tool, evaluate the provided sentences and classify their meanings as 'equivalent' or 'not_equivalent'.
+
+## bertattack
+
+Original prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a sentence discrimination evaluator, analyze the created sentences and indicate if their contexts are 'equivalent' or 'not_equivalent'. "
+Original acc: 88.97%, attacked acc: 82.35%, dropped acc: 6.62%
+
+Original prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"during your capacity as a grammatical analytic, calculate the following words and classify their difference as 'equivalent' or 'not_equivalent'. "
+Original acc: 88.24%, attacked acc: 70.10%, dropped acc: 18.14%
+
+Original prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"consider a grammatical distinction .lumethod, analysis the relevant combination of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'. "
+Original acc: 87.01%, attacked acc: 73.77%, dropped acc: 13.24%
+
+Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"are the questions of the other equivalent of sentences the same? Answer with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.50%, attacked acc: 44.61%, dropped acc: 42.89%
+
+Original prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"can the following sentences be considered semanticmore inferior? this reply with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.25%, attacked acc: 34.80%, dropped acc: 52.45%
+
+Original prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"please analyze the standard words and indicates if their plural are the same by choosing 'equivalent' or 'not_equivalent'. "
+Original acc: 86.76%, attacked acc: 67.65%, dropped acc: 19.12%
+
+## checklist
+
+Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent' 5XeflW1ZJc. "
+Original acc: 87.50%, attacked acc: 86.27%, dropped acc: 1.23%
+
+Original prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent' IhuBIhoPGc. "
+Original acc: 87.25%, attacked acc: 86.03%, dropped acc: 1.23%
+
+Original prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent' MSAddJ4D2a. "
+Original acc: 86.52%, attacked acc: 85.05%, dropped acc: 1.47%
+
+Original prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent' ofOQXLxiW4. "
+Original acc: 88.73%, attacked acc: 87.99%, dropped acc: 0.74%
+
+Original prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent' MQnugHcaoy. "
+Original acc: 87.99%, attacked acc: 87.50%, dropped acc: 0.49%
+
+Original prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent' MQnugHcaoy. "
+Original acc: 87.01%, attacked acc: 87.01%, dropped acc: 0.00%
+
+## deepwordbug
+
+Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Are the mKeanings of the flolowing pair of sentences the same? Agnswer with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.50%, attacked acc: 85.05%, dropped acc: 2.45%
+
+Original prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Can the given sentences be consiedred semantically idePntical? Plaese reply with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.25%, attacked acc: 83.58%, dropped acc: 3.68%
+
+Original prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Please analyze the provided senetnces and indicatU if their meannigs are the same by choosing 'equivalent' or 'not_equivalent'. "
+Original acc: 86.52%, attacked acc: 84.80%, dropped acc: 1.72%
+
+Original prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"A a sentWnce sidmilarity evaluator, analyze the provAded sentences and idnicate if their meanVngs are 'equivalent' or 'not_equivalent'. "
+Original acc: 88.73%, attacked acc: 81.13%, dropped acc: 7.60%
+
+Original prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"In your chapacity as a language anabyst, assess the following Psentences and classify their simglarity as 'equivalent' or 'not_equivalent'. "
+Original acc: 87.99%, attacked acc: 82.35%, dropped acc: 5.64%
+
+Original prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"A a rextual simialrity evaluator, analyze the follwoing pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'. "
+Original acc: 87.01%, attacked acc: 82.84%, dropped acc: 4.17%
+
+## stresstest
+
+Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent' and false is not true . "
+Original acc: 87.50%, attacked acc: 85.05%, dropped acc: 2.45%
+
+Original prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent' and false is not true . "
+Original acc: 87.25%, attacked acc: 85.54%, dropped acc: 1.72%
+
+Original prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent' and false is not true . "
+Original acc: 86.52%, attacked acc: 85.78%, dropped acc: 0.74%
+
+Original prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent' and false is not true . "
+Original acc: 88.73%, attacked acc: 88.24%, dropped acc: 0.49%
+
+Original prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent' and false is not true . "
+Original acc: 87.99%, attacked acc: 87.01%, dropped acc: 0.98%
+
+Original prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent' and false is not true . "
+Original acc: 87.01%, attacked acc: 85.78%, dropped acc: 1.23%
+
+## textbugger
+
+Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Constitute the meanings of the followin\xc9\xa1 pair of sentences the same? Responding with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.50%, attacked acc: 85.78%, dropped acc: 1.72%
+
+Original prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Can the given sentences be cosidered semantically iedntical? Invite reply with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.01%, attacked acc: 83.33%, dropped acc: 3.68%
+
+Original prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Invited analyze the provide\xd4\x81 verdicts and indicate if their meanings are the same by selecting 'equivalent' or 'not_equivalent'. "
+Original acc: 86.76%, attacked acc: 83.09%, dropped acc: 3.68%
+
+Original prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a senence simil arity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'. "
+Original acc: 88.97%, attacked acc: 83.58%, dropped acc: 5.39%
+
+Original prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"In your capacity as a linguistics a\xd5\xb8alyst, evaluate the following sentences and classify their metaphor as 'equivalent' or 'not_equivalent'. "
+Original acc: 87.99%, attacked acc: 81.13%, dropped acc: 6.86%
+
+Original prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a textual portrait evaluator, analyzes the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'. "
+Original acc: 87.01%, attacked acc: 82.84%, dropped acc: 4.17%
+
+## textfooler
+
+Original prompt: Are the meanings of the following pair of sentences the same? Answer with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Is the meaning of the henceforth peers of convicting the same? Cater with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.50%, attacked acc: 78.19%, dropped acc: 9.31%
+
+Original prompt: Can the given sentences be considered semantically identical? Please reply with 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Can the given judgement be investigated semantically equivalent? Ask overreaction with 'equivalent' or 'not_equivalent'. "
+Original acc: 87.01%, attacked acc: 83.09%, dropped acc: 3.92%
+
+Original prompt: Please analyze the provided sentences and indicate if their meanings are the same by choosing 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Asks analysis the conferred priors and specified if their meanings are the same by selecting 'equivalent' or 'not_equivalent'. "
+Original acc: 86.76%, attacked acc: 75.98%, dropped acc: 10.78%
+
+Original prompt: As a sentence similarity evaluator, analyze the provided sentences and indicate if their meanings are 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"As a chastisement resemblance evaluator, analyze the handed priors and describing if their sens are 'equivalent' or 'not_equivalent'. "
+Original acc: 88.97%, attacked acc: 78.19%, dropped acc: 10.78%
+
+Original prompt: In your capacity as a language analyst, assess the following sentences and classify their similarity as 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"At your power as a language psych, assess the later judgements and categories their metaphor as 'equivalent' or 'not_equivalent'. "
+Original acc: 87.99%, attacked acc: 76.23%, dropped acc: 11.76%
+
+Original prompt: As a textual similarity evaluator, analyze the following pair of sentences and determine if they are semantically 'equivalent' or 'not_equivalent'.
+Attacked prompt: b"Since a textual likeness evaluator, explore the later pair of sentences and defines if they are semantically 'equivalent' or 'not_equivalent'. "
+Original acc: 87.01%, attacked acc: 83.58%, dropped acc: 3.43%
+
+## translation
+
+Language: chinese, acc: 84.07%, prompt: As an expert in semantic comparison, evaluates two given sentences to determine whether they are 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 83.58%, prompt: From the point of view of semantic comparison, determine whether this pair of sentences is 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 84.56%, prompt: Using the semantic comparison method, determine whether the following two statements are 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 84.07%, prompt: For the following two sentences, determine whether they are 'equivalent' or 'not_equivalent' based on semantic comparison.
+Language: chinese, acc: 84.80%, prompt: As an expert in semantic comparison, please evaluate the following two sentences and determine if they are 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 85.54%, prompt: Using semantic comparison techniques, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 84.31%, prompt: Please determine whether the following two sentences are 'equivalent' or 'not_equivalent' according to the standard of semantic comparison.
+Language: chinese, acc: 84.80%, prompt: As an expert in the field of semantic comparison, please evaluate the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 85.05%, prompt: Using semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.
+Language: chinese, acc: 85.05%, prompt: Determine whether the following two sentences are 'equivalent' or 'not_equivalent' based on semantic comparison.
+Language: french, acc: 85.05%, prompt: As an expert in semantic comparison, evaluate the following pair of sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: french, acc: 85.29%, prompt: Can you determine whether the following two sentences are 'equivalent' or 'not_equivalent' as a semantic comparison expert?
+Language: french, acc: 85.05%, prompt: Using your expertise in semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent'.
+Language: french, acc: 88.24%, prompt: As a semantic comparison specialist, assess the similarity between the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: french, acc: 85.05%, prompt: Are you able to determine whether the following two sentences are 'equivalent' or 'not_equivalent' as an expert in semantic comparison?
+Language: french, acc: 83.82%, prompt: As a semantic comparison professional, evaluate the following pair of sentences and indicate whether they are 'equivalent' or 'not_equivalent'.
+Language: french, acc: 85.29%, prompt: Can you determine whether the following two sentences have a 'equivalent' or 'not_equivalent' meaning as an expert in semantic comparison?
+Language: french, acc: 89.22%, prompt: As an expert in semantic comparison, assess the similarity between the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: french, acc: 85.05%, prompt: Using your expertise in semantic comparison, determine whether the following two sentences are 'equivalent' or 'not_equivalent' in terms of meaning.
+Language: french, acc: 87.75%, prompt: As a semantic comparison professional, assess the similarity between the following two sentences and indicate whether they are 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 85.05%, prompt: As an expert in semantic comparison, evaluate the two given sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 83.82%, prompt: Based on my experience in semantic analysis, classify the following two sentences as 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 85.29%, prompt: As an expert in semantic comparison, analyze the following two sentences and classify them as 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 84.56%, prompt: Your task as an expert in semantic comparison is to evaluate the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 85.78%, prompt: As a semantic comparison specialist, analyze the two data statements and insert them into one of the following categories: 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 84.80%, prompt: Based on my experience in semantic analysis, classify the following two sentences between 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 83.82%, prompt: Your role as a semantic comparison specialist requires analyzing the two given sentences and determining whether they are 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 84.56%, prompt: As an experienced semantic analyst, classify the following two sentences as 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 82.35%, prompt: Your job as a semantic analyst evaluates the following two sentences as 'equivalent' or 'not_equivalent'.
+Language: arabic, acc: 83.09%, prompt: As a semantic analyst, determine whether the given sentences are 'equivalent' or 'not_equivalent' based on their relationship.
+Language: spanish, acc: 82.84%, prompt: As an expert in semantic comparison, it evaluates the pair of sentences provided and determines whether they are 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 83.82%, prompt: Based on my experience in semantic analysis, classify the following two sentences as 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 86.27%, prompt: As an expert in semantic comparison, analyze the two sentences given and classify them as 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 85.05%, prompt: Your task as a semantic comparison specialist is to evaluate the following two sentences and determine whether they are 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 84.80%, prompt: As an expert in semantic analysis, he makes a classification of the following two sentences based on their 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 85.78%, prompt: Based on your experience of semantic comparison, classify the next two sentences as 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 84.31%, prompt: As a specialist in semantic analysis, you are given the task of analysing the two sentences given and classifying them as 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 84.56%, prompt: As an expert in semantic comparison, he classifies the following two sentences into 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 84.31%, prompt: As a specialist in semantic analysis, evaluate the following two sentences and classify them as 'equivalent' or 'not_equivalent'.
+Language: spanish, acc: 83.58%, prompt: Your task as an expert in semantic comparison is to analyze the two sentences provided and determine whether they are 'equivalent' or 'not_equivalent' based on their semantic relationship.
+Language: japanese, acc: 84.07%, prompt: Evaluate whether a given pair of sentences is 'equivalent' or 'not_equivalent', depending on the context.
+Language: japanese, acc: 84.31%, prompt: Use a semantic comparison to determine whether a given pair of sentences is 'equivalent' or 'not_equivalent'.
+Language: japanese, acc: 81.37%, prompt: Evaluate a given pair of sentences as 'equivalent' or 'not_equivalent' by determining whether they have the same semantic meaning.
+Language: japanese, acc: 84.56%, prompt: Determine whether a given pair of sentences is synonyms and evaluate whether they are 'equivalent' or 'not_equivalent'.
+Language: japanese, acc: 85.54%, prompt: Determine whether a given pair of sentences is 'equivalent' or 'not_equivalent', and whether they are semantically identical.
+Language: japanese, acc: 84.31%, prompt: Determinate whether a given pair of sentences has the same meaning and evaluate whether they are 'equivalent' or 'not_equivalent'.
+Language: japanese, acc: 84.56%, prompt: Evaluate whether a given pair of sentences is 'equivalent' or 'not_equivalent' by determining whether they are semantically identical.
+Language: japanese, acc: 84.80%, prompt: Judge whether a given pair of sentences is equal and evaluate whether they are 'equivalent' or 'not_equivalent'.
+Language: japanese, acc: 83.82%, prompt: Determinate whether a given pair of sentences are semantically equal and evaluate whether they are 'equivalent' or 'not_equivalent'.
+Language: japanese, acc: 81.13%, prompt: Whether a given pair of sentences is 'equivalent' or 'not_equivalent' depends on the context.
+Language: korean, acc: 86.03%, prompt: As a sentence comparator, evaluate the two sentences given to determine 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 86.27%, prompt: Compare two sentences to determine 'equivalent' or 'not_equivalent'. For this you need qualifications as a specialist in semantic comparison.
+Language: korean, acc: 82.84%, prompt: It takes your knowledge as an expert in semantic comparison to determine that two sentences are 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 83.09%, prompt: As a specialist in semantic comparison, evaluate whether two given sentences are 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 86.27%, prompt: Analyze two sentences to determine 'equivalent' or 'not_equivalent'. For that you need the knowledge of a semantic comparison expert.
+Language: korean, acc: 83.82%, prompt: As an expert in semantic comparison, decide whether two sentences are 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 84.07%, prompt: It takes the knowledge of an expert in semantic comparison to compare two sentences to judge 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 83.09%, prompt: Experience as an expert in semantic comparison is required to determine whether two given sentences are 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 84.56%, prompt: As an expert in semantic comparison, determine whether two sentences are 'equivalent' or 'not_equivalent'.
+Language: korean, acc: 86.76%, prompt: Analyze two sentences to determine 'equivalent' or 'not_equivalent'. For this, you need a qualification as a specialist in semantic comparison.
+
+# qnli
+
+## 10 prompts
+
+Acc: 94.60%, prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
+Acc: 94.10%, prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
+Acc: 94.10%, prompt: Based on the information in the context, decide if the answer to the question is justified by choosing 'entailment' or 'not_entailment'.
+Acc: 94.00%, prompt: Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment'.
+Acc: 93.90%, prompt: Analyze the context and question, and indicate if the context entails the answer by choosing 'entailment' or 'not_entailment'.
+Acc: 93.80%, prompt: Examine the context and question, and determine if the context logically implies the answer by selecting 'entailment' or 'not_entailment'.
+Acc: 93.70%, prompt: Consider the context and question, and indicate if the answer can be logically deduced from the context by responding with 'entailment' or 'not_entailment'.
+Acc: 93.70%, prompt: Review the given context and question, and decide if the context contains enough information to support the answer by selecting 'entailment' or 'not_entailment'.
+Acc: 93.30%, prompt: Assess if the answer to the question can be logically concluded from the provided context by choosing 'entailment' or 'not_entailment'.
+Acc: 93.20%, prompt: Please assess if the answer to the question can be derived from the given context by selecting 'entailment' or 'not_entailment'.
+
+Acc: 94.20%, prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'.
+Acc: 94.20%, prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
+Acc: 94.10%, prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'.
+Acc: 94.00%, prompt: As a semantic interpreter, assess whether the provided context supports the answer to the given question and answer with 'entailment' or 'not_entailment'.
+Acc: 93.90%, prompt: In the capacity of a language specialist, decide if the context presented contains enough information to infer the answer to the question and respond with 'entailment' or 'not_entailment'.
+Acc: 93.90%, prompt: In the capacity of a language evaluator, examine if the given context justifies the answer to the question and indicate your assessment with 'entailment' or 'not_entailment'.
+Acc: 93.80%, prompt: As a language expert, assess if the given context entails the answer to the question and respond with 'entailment' or 'not_entailment'.
+Acc: 93.60%, prompt: As a textual analyst, examine if the given context logically implies the answer to the question and indicate your decision with 'entailment' or 'not_entailment'.
+Acc: 93.50%, prompt: As a textual inference expert, analyze if the answer to the question can be deduced from the provided context and select 'entailment' or 'not_entailment'.
+Acc: 93.30%, prompt: As a linguistic consultant, decide if the answer to the question is logically supported by the provided context and respond with 'entailment' or 'not_entailment'.
+
+## bertattack
+
+Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"1 on the supplied context and question, then if the answer supports the answer by responding with 'entailment' or 'not_entailment'. "
+Original acc: 94.60%, attacked acc: 89.30%, dropped acc: 5.30%
+
+Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"interpret positive the given context supports the answer to the answer by appearing with 'entailment' or 'not_entailment'. "
+Original acc: 94.10%, attacked acc: 73.70%, dropped acc: 20.40%
+
+Original prompt: Based on the information in the context, decide if the answer to the question is justified by choosing 'entailment' or 'not_entailment'.
+Attacked prompt: b"Based on the info in the context, decide if the object to the question is wrong by choosing 'entailment' or 'not_entailment'. "
+Original acc: 94.10%, attacked acc: 52.70%, dropped acc: 41.40%
+
+Original prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'.
+Attacked prompt: b"while a semantic researcher, evaluate positive the supported context returns the answer to the question and reject 'entailment' or 'not_entailment'. "
+Original acc: 94.20%, attacked acc: 42.60%, dropped acc: 51.60%
+
+Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
+Attacked prompt: b"about your assignment as a linguistic investigations, consider if the context let entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. "
+Original acc: 94.20%, attacked acc: 92.80%, dropped acc: 1.40%
+
+Original prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'.
+Attacked prompt: b"consider your roles as a consequence evaluator, score if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'. "
+Original acc: 94.10%, attacked acc: 92.90%, dropped acc: 1.20%
+
+## checklist
+
+Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment' EAB4KP2NVY. "
+Original acc: 94.60%, attacked acc: 94.50%, dropped acc: 0.10%
+
+Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment' 9Q3S1F94fE. "
+Original acc: 94.10%, attacked acc: 94.00%, dropped acc: 0.10%
+
+Original prompt: Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment'.
+Attacked prompt: b"Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment' CuVpbbkC4c. "
+Original acc: 94.00%, attacked acc: 93.90%, dropped acc: 0.10%
+
+Original prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'.
+Attacked prompt: b"As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment' SFPCqpiClT. "
+Original acc: 94.20%, attacked acc: 94.00%, dropped acc: 0.20%
+
+Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
+Attacked prompt: b"In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment' 6a4Yn3RGVc. "
+Original acc: 94.20%, attacked acc: 93.40%, dropped acc: 0.80%
+
+Original prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'.
+Attacked prompt: b"In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment' Vq3aBzuZcD. "
+Original acc: 94.10%, attacked acc: 94.00%, dropped acc: 0.10%
+
+## deepwordbug
+
+Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"Baesd on the provided contet and question, deicde if the informatio supports the answer by responding with 'entailment' or 'not_entailment'. "
+Original acc: 94.60%, attacked acc: 94.20%, dropped acc: 0.40%
+
+Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'. "
+Original acc: 94.10%, attacked acc: 94.10%, dropped acc: 0.50%
+
+Original prompt: Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment'.
+Attacked prompt: b"Given the question and context provided, deermine if the answer can be inferred by choosing 'entailment' or 'not_entailment'. "
+Original acc: 94.00%, attacked acc: 93.80%, dropped acc: 0.20%
+
+Original prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'.
+Attacked prompt: b"us a sAmantic researJcher, evZluate hwether the provided context suports the answer to the question and choose 'entailment' or 'not_entailment'. "
+Original acc: 94.20%, attacked acc: 93.30%, dropped acc: 0.90%
+
+Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'.
+Attacked prompt: b"IF your Srole as a linguistic investigator, detGrmine if the context given entagls the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. "
+Original acc: 94.20%, attacked acc: 93.50%, dropped acc: 0.70%
+
+Original prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'.
+Attacked prompt: b"tn your role as a semanitc evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'. "
+Original acc: 94.10%, attacked acc: 93.70%, dropped acc: 0.40%
+
+## stresstest
+
+Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment' and true is true . "
+Original acc: 94.60%, attacked acc: 94.20%, dropped acc: 0.40%
+
+Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'.
+Attacked prompt: b"Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment' and true is true . "
+Original acc: 94.10%, attacked acc: 93.40%, dropped acc: 0.70%
+
+Original prompt: Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment'.
+Attacked prompt: b"Given the question and context provided, determine if the answer can be inferred by choosing 'entailment' or 'not_entailment' and true is true . " +Original acc: 94.00%, attacked acc: 94.10%, dropped acc: -0.10% + +Original prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'. +Attacked prompt: b"As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment' and true is true and true is true and true is true and true is true and true is true . " +Original acc: 94.20%, attacked acc: 93.90%, dropped acc: 0.30% + +Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. +Attacked prompt: b"In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment' and true is true and true is true and true is true and true is true and true is true . " +Original acc: 94.20%, attacked acc: 93.80%, dropped acc: 0.40% + +Original prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment' and true is true . " +Original acc: 94.10%, attacked acc: 94.10%, dropped acc: 0.00% + +## textbugger + +Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'. 
+Attacked prompt: b"Based on the provided context and question, deicde if the information favours the answer by responding with 'entailment' or 'not_entailment'. " +Original acc: 94.60%, attacked acc: 93.40%, dropped acc: 1.20% + +Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'. +Attacked prompt: b"Evaluate whethe\xe2\xb2\x85 the given context favours the answer to the question by responding with 'entailment' or 'not_entailment'. " +Original acc: 94.10%, attacked acc: 93.30%, dropped acc: 0.80% + +Original prompt: Based on the information in the context, decide if the answer to the question is justified by choosing 'entailment' or 'not_entailment'. +Attacked prompt: b"Basis on the information in the context, decided if the answer to the questions is justified by choosing 'entailment' or 'not_entailment'. " +Original acc: 94.10%, attacked acc: 93.60%, dropped acc: 0.50% + +Original prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'. +Attacked prompt: b"Iike a sem\xc9\x91ntic researcher, evaluate both the provided context supports the responding to the question and choose 'entailment' or 'not_entailment'. " +Original acc: 94.20%, attacked acc: 93.40%, dropped acc: 0.80% + +Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. +Attacked prompt: b"In your role as a linguistic searchers, \xd4\x81etermine if the context awarded presupposes the answer to the questions and provide your conclusions with 'entailment' or 'not_entailment'. 
" +Original acc: 94.20%, attacked acc: 93.10%, dropped acc: 1.10% + +Original prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"Throughout your role as a semanti\xcf\xb2 evaluator, \xd4\x81etermine if the provided context justifies the answer to the question and respond with 'entailment' or 'not_entailment'. " +Original acc: 94.10%, attacked acc: 93.20%, dropped acc: 0.90% + +## textfooler + +Original prompt: Based on the provided context and question, decide if the information supports the answer by responding with 'entailment' or 'not_entailment'. +Attacked prompt: b"Based on the presented context and question, decide if the indications backs the answer by responding with 'entailment' or 'not_entailment'. " +Original acc: 94.60%, attacked acc: 93.40%, dropped acc: 1.20% + +Original prompt: Evaluate whether the given context supports the answer to the question by responding with 'entailment' or 'not_entailment'. +Attacked prompt: b"Evaluate whether the delivered context admits the rebuttal to the topic by responding with 'entailment' or 'not_entailment'. " +Original acc: 94.10%, attacked acc: 90.40%, dropped acc: 3.70% + +Original prompt: Based on the information in the context, decide if the answer to the question is justified by choosing 'entailment' or 'not_entailment'. +Attacked prompt: b"Baseline on the detail in the framework, chosen if the answers to the topic is motivated by selector 'entailment' or 'not_entailment'. " +Original acc: 94.10%, attacked acc: 67.10%, dropped acc: 27.00% + +Original prompt: As a semantic researcher, evaluate whether the provided context supports the answer to the question and choose 'entailment' or 'not_entailment'. +Attacked prompt: b"Iike a semantic savant, determine whether the provided framework encourages the rebuttal to the topics and option 'entailment' or 'not_entailment'. 
" +Original acc: 94.20%, attacked acc: 52.70%, dropped acc: 41.50% + +Original prompt: In your role as a linguistic investigator, determine if the context given entails the answer to the question and provide your conclusion with 'entailment' or 'not_entailment'. +Attacked prompt: b"In your role as a linguistic investigator, determination if the context given entails the rebuttal to the matters and giving your conclusion with 'entailment' or 'not_entailment'. " +Original acc: 94.20%, attacked acc: 92.00%, dropped acc: 2.20% + +Original prompt: In your role as a semantic evaluator, determine if the provided context justifies the answer to the question and answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"Throughout your role as a semantic evaluator, defined if the provided context justifying the rebuttal to the issuing and rebuttal with 'entailment' or 'not_entailment'. " +Original acc: 94.10%, attacked acc: 86.60%, dropped acc: 7.50% + +## translation + +Language: chinese, acc: 93.60%, prompt: Determine whether the answer is implied or not based on the context. Answer 'entailment' or 'not_entailment'. +Language: chinese, acc: 94.60%, prompt: As a language expert, determine whether the text contains the answer and choose 'entailment' or 'not_entailment'. +Language: chinese, acc: 94.00%, prompt: Determine whether the text implies an answer, and select 'entailment' or 'not_entailment'. +Language: chinese, acc: 93.60%, prompt: Given the text and the question, whether the answer is implied, select 'entailment' or 'not_entailment'. +Language: chinese, acc: 94.40%, prompt: Determine whether the text contains the answer, depending on the context. Select 'entailment' or 'not_entailment'. +Language: chinese, acc: 94.80%, prompt: As a language expert, determine whether a text can contain an answer, and say 'entailment' or 'not_entailment'. +Language: chinese, acc: 94.20%, prompt: Please determine whether the text implies an answer. 
Answer 'entailment' or 'not_entailment'. +Language: chinese, acc: 93.70%, prompt: Please select 'entailment' or 'not_entailment' based on the text and the question. +Language: chinese, acc: 94.10%, prompt: Assess whether the answer is implied based on the context. Answer 'entailment' or 'not_entailment'. +Language: chinese, acc: 94.50%, prompt: Please determine whether the text contains the answer and answer 'entailment' or 'not_entailment'. +Language: french, acc: 93.90%, prompt: As a linguistic expert, assess whether the given context involves the answer to the question and answer with 'entailment' or 'not_entailment'. +Language: french, acc: 94.70%, prompt: Determine whether the information provided in the context necessarily leads to the answer to the question asked and indicate 'entailment' or 'not_entailment'. +Language: french, acc: 93.50%, prompt: Analyze the text to determine if the answer to the question is implied in the context and specify 'entailment' or 'not_entailment'. +Language: french, acc: 93.50%, prompt: Based on the given context, decide whether the answer to the question is necessarily involved and mark 'entailment' or 'not_entailment'. +Language: french, acc: 93.50%, prompt: Evaluate whether the answer to the question can be deduced from the given context and mark 'entailment' or 'not_entailment'. +Language: french, acc: 94.40%, prompt: Discern whether the context provided directly involves the answer to the question and indicate 'entailment' or 'not_entailment'. +Language: french, acc: 93.90%, prompt: Determine if the context contains enough information to involve the answer to the question and mark 'entailment' or 'not_entailment'. +Language: french, acc: 93.90%, prompt: Assess whether the context provided necessarily leads to the answer to the question and answer with 'entailment' or 'not_entailment'. 
+Language: french, acc: 93.20%, prompt: Analyze the text to determine if the answer to the question is involved in the context and indicate 'entailment' or 'not_entailment'. +Language: french, acc: 93.80%, prompt: Based on the given context, decide whether the answer to the question is necessarily inferred and mark 'entailment' or 'not_entailment'. +Language: arabic, acc: 94.40%, prompt: As a language expert, evaluate whether the given context calls for an answer and answer 'entailment' or 'not_entailment'. +Language: arabic, acc: 94.20%, prompt: Judge the relationship between the text and the question and answer 'entailment' or 'not_entailment', depending on your language experience. +Language: arabic, acc: 93.80%, prompt: Does the context given indicate the answer to the question? Evaluate and answer 'entailment' or 'not_entailment'. +Language: arabic, acc: 94.40%, prompt: Based on your linguistic knowledge, does the text relate to the question? Answer 'entailment' or 'not_entailment'. +Language: arabic, acc: 94.40%, prompt: As a language expert, determine how the text relates to the question and answer 'entailment' or 'not_entailment'. +Language: arabic, acc: 93.90%, prompt: Does the text support the answer to the question? Answer 'entailment' or 'not_entailment', depending on your language experience. +Language: arabic, acc: 93.70%, prompt: Check the text link to the question and answer 'entailment' or 'not_entailment', depending on your language skills. +Language: arabic, acc: 93.80%, prompt: As a language expert, is there a link between the text and the question? Answer 'entailment' or 'not_entailment'. +Language: arabic, acc: 94.60%, prompt: Based on your language experience, does context help to answer the question? Evaluate and answer 'entailment' or 'not_entailment'. +Language: arabic, acc: 94.30%, prompt: Does the text give a clear answer to the question? Answer 'entailment' or 'not_entailment', depending on your language experience. 
+Language: spanish, acc: 94.00%, prompt: As a language expert, evaluate whether the given context implies the answer to the question and answer with 'entailment' or 'not_entailment'. +Language: spanish, acc: 93.80%, prompt: Determine whether the information given in the text necessarily implies the veracity of the hypothesis and answer 'entailment' or 'not_entailment'. +Language: spanish, acc: 95.30%, prompt: Analyzes whether the information presented in the paragraph leads to the conclusion of the question and labels the answer as 'entailment' or 'not_entailment'. +Language: spanish, acc: 94.90%, prompt: Indicates whether the information provided in the text is sufficient to conclude the statement and labels the response as 'entailment' or 'not_entailment'. +Language: spanish, acc: 94.30%, prompt: As an expert on the subject, judge whether the information provided in the text justifies the claim and classify the answer as 'entailment' or 'not_entailment'. +Language: spanish, acc: 94.60%, prompt: Evaluates whether the information in the paragraph necessarily supports the conclusion of the hypothesis and responds 'entailment' or 'not_entailment'. +Language: spanish, acc: 93.90%, prompt: Determines whether the information presented in the text logically implies the answer to the question and labels the answer as 'entailment' or 'not_entailment'. +Language: spanish, acc: 94.90%, prompt: Analyzes whether the information provided in the paragraph necessarily leads to the veracity of the hypothesis and classifies the response as 'entailment' or 'not_entailment'. +Language: spanish, acc: 94.10%, prompt: As an expert on the subject, evaluate whether the information presented in the text supports the claim and respond 'entailment' or 'not_entailment'. +Language: spanish, acc: 94.30%, prompt: Indicates whether the information provided in the paragraph necessarily implies the answer to the question and labels the answer as 'entailment' or 'not_entailment'. 
+Language: japanese, acc: 93.70%, prompt: Rate whether the answer to the question is derived from the given context and answer with 'entailment' or 'not_entailment'. +Language: japanese, acc: 94.00%, prompt: Please answer 'entailment' or 'not_entailment' for the given context and question. +Language: japanese, acc: 93.50%, prompt: Decide whether the answer to the question is derived from the given context and answer 'entailment' or 'not_entailment'. +Language: japanese, acc: 93.40%, prompt: Compare the question with the given context and give the answer 'entailment' or 'not_entailment'. +Language: japanese, acc: 94.30%, prompt: Determinate whether the given context contains the answer to the question and answer with 'entailment' or 'not_entailment'. +Language: japanese, acc: 93.10%, prompt: Estimate the answer of the question from the context and give the answer 'entailment' or 'not_entailment'. +Language: japanese, acc: 94.50%, prompt: Determinate whether the given context is relevant to the question and answer with 'entailment' or 'not_entailment'. +Language: japanese, acc: 94.70%, prompt: Determine whether the given context is relevant to the question and answer with 'entailment' or 'not_entailment'. +Language: japanese, acc: 94.40%, prompt: Determinate whether the given context contains the answer to the question and answer 'entailment' or 'not_entailment'. +Language: japanese, acc: 93.40%, prompt: Answer with 'entailment' or 'not_entailment', inferring from the given context. +Language: korean, acc: 94.00%, prompt: Determine if a given sentence necessarily implies the meaning of another sentence and answer 'entailment' or 'not_entailment'. +Language: korean, acc: 94.70%, prompt: By understanding the relations between sentences, judge whether a given sentence necessarily refers to another sentence and answer with 'entailment' or 'not_entailment'. 
+Language: korean, acc: 94.40%, prompt: Evaluate whether a given text necessarily indicates the meaning of another text and respond with 'entailment' or 'not_entailment'. +Language: korean, acc: 94.30%, prompt: Understand the relations of a sentence, to determine whether a given sentence necessarily includes other sentences and answer with 'entailment' or 'not_entailment'. +Language: korean, acc: 93.70%, prompt: Judge whether a given content necessarily implies the meaning of another content and answer with 'entailment' or 'not_entailment'. +Language: korean, acc: 94.10%, prompt: Grasp the relations between sentences, determine if a given sentence necessarily contains the meaning of another sentence and respond with 'entailment' or 'not_entailment'. +Language: korean, acc: 94.30%, prompt: Evaluate whether a given text necessarily refers to another text and answer with 'entailment' or 'not_entailment'. +Language: korean, acc: 94.10%, prompt: By comparing the meaning of the sentences, to determine if a given sentence necessarily implies another sentence and answer 'entailment' or 'not_entailment'. +Language: korean, acc: 94.30%, prompt: Evaluate whether the contents given necessarily refer to other contents and answer with 'entailment' or 'not_entailment'. +Language: korean, acc: 78.90%, prompt: By analyzing the relations between sentences, determine if a given sentence does not necessarily include other sentences and answer with 'entailment' or 'not_entailment'. + +# qqp + +## 10 prompts + +Acc: 88.30%, prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'. +Acc: 88.20%, prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Acc: 88.20%, prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. +Acc: 87.90%, prompt: Are the meanings of these two phrases the same? 
Reply with 'equivalent' or 'not_equivalent'. +Acc: 87.90%, prompt: Analyze if the given set of sentences have the same connotation by answering with 'equivalent' or 'not_equivalent'. +Acc: 87.80%, prompt: Are the following two questions equivalent or not? Answer me with "equivalent" or "not_equivalent". +Acc: 87.60%, prompt: Assess whether the following statements are identical in meaning by answering 'equivalent' or 'not_equivalent'. +Acc: 87.60%, prompt: Do the following expressions mean the same thing? Provide your answer as 'equivalent' or 'not_equivalent'. +Acc: 87.30%, prompt: Examine the following expressions and tell me if they are alike in meaning by using 'equivalent' or 'not_equivalent'. +Acc: 87.30%, prompt: Evaluate whether these two phrases have identical meanings and respond with 'equivalent' or 'not_equivalent'. +Acc: 87.20%, prompt: Can these two statements be considered equal in meaning? Answer with 'equivalent' or 'not_equivalent'. + +Acc: 89.20%, prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Acc: 88.50%, prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Acc: 88.20%, prompt: Functioning as a question similarity evaluation tool, analyze the given questions and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'. +Acc: 88.20%, prompt: In the capacity of a question assessment system, indicate if the meaning of the provided questions is the same, responding with 'equivalent' or 'not_equivalent'. +Acc: 88.20%, prompt: Working as a question classifier, identify whether the given questions share the same meaning, and respond with 'equivalent' for equivalent questions or 'not_equivalent' for different ones. 
+Acc: 88.20%, prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions. +Acc: 88.20%, prompt: In the role of a question similarity analyzer, examine the meaning of the provided questions and decide if they are equivalent, answering with 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Acc: 88.00%, prompt: In your role as a question comparison tool, assess the following pair of questions and classify them as 'equivalent' or 'not_equivalent'. +Acc: 88.00%, prompt: Acting as a question equivalence instrument, determine if the provided questions are equivalent in meaning, answering with 'equivalent' for similar questions or 'not_equivalent' for dissimilar ones. +Acc: 87.70%, prompt: As a question equivalence detection system, examine the provided questions and respond with 'equivalent' if they are the same in meaning, or 'not_equivalent' if they are different. + +## bertattack + +Original prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Do these both plural with the same meaning? Indicate with 'equivalent' or 'not_equivalent'. " +Original acc: 88.30%, attacked acc: 74.90%, dropped acc: 13.40% + +Original prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"compute if the and for of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.60%, dropped acc: 0.60% + +Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. 
+Attacked prompt: b"Please indicate if the following individual of verbs share the same meaning by auxiliary with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 70.80%, dropped acc: 17.40% + +Original prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Attacked prompt: b"you following question comparisons analysis, classify the number of the following words as 'equivalent' for equivalent questions or 'not_equivalent' for combined questions. " +Original acc: 89.20%, attacked acc: 81.80%, dropped acc: 7.40% + +Original prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Attacked prompt: b"As a tool for weighing question equivalence, review the data and codeegoassociate their difference as neither 'equivalent' or 'not_equivalent'. " +Original acc: 88.50%, attacked acc: 70.90%, dropped acc: 17.60% + +Original prompt: Functioning as a question similarity evaluation tool, analyze the given questions and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"i as a question field assessing tool, compare the said questions and decide if they lack the same meaning, responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 83.80%, dropped acc: 4.40% + +## checklist + +Original prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent' EAB4KP2NVY. " +Original acc: 88.30%, attacked acc: 88.60%, dropped acc: -0.30% + +Original prompt: Do these two sentences convey the same meaning? 
Indicate with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent' MSAddJ4D2a. " +Original acc: 88.30%, attacked acc: 87.30%, dropped acc: 1.00% + +Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent' MSAddJ4D2a. " +Original acc: 88.20%, attacked acc: 88.10%, dropped acc: 0.10% + +Original prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Attacked prompt: b"While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions XStsD2pUzu. " +Original acc: 89.20%, attacked acc: 89.10%, dropped acc: 0.10% + +Original prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Attacked prompt: b"As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent' bhrRSokrfa. " +Original acc: 88.50%, attacked acc: 88.90%, dropped acc: -0.40% + +Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions. +Attacked prompt: b"As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions SFPCqpiClT. 
" +Original acc: 88.30%, attacked acc: 88.40%, dropped acc: -0.10% + +## deepwordbug + +Original prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Determine if the gimen piar of statemnets can be considered the same by responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.30%, attacked acc: 88.00%, dropped acc: 0.30% + +Original prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Do these two sefntences convky the same meaning? Indicate with 'equivalent' or 'not_equivalent'. " +Original acc: 88.30%, attacked acc: 87.60%, dropped acc: 0.70% + +Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"qPlease indicate if the folloing pair of senFences share the same meaning by resopnding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.40%, dropped acc: 0.80% + +Original prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Attacked prompt: b"While performing question comparison analysis, classify the similairty of the folfowing uestions as 'equivalent' for equivalent questions or 'not_equivalent' for different uqestions. " +Original acc: 89.20%, attacked acc: 87.70%, dropped acc: 1.50% + +Original prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Attacked prompt: b"As a tool for determining question equivalence, review the questions and ategorize their similvarity as eitehr 'equivalent' or 'not_equivalent'. 
" +Original acc: 88.50%, attacked acc: 87.30%, dropped acc: 1.20% + +Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions. +Attacked prompt: b"AJ an instrument for question comparisno evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions. " +Original acc: 88.30%, attacked acc: 87.60%, dropped acc: 0.70% + +## stresstest + +Original prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent' and false is not true . " +Original acc: 88.30%, attacked acc: 88.20%, dropped acc: 0.10% + +Original prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent' and false is not true . " +Original acc: 88.30%, attacked acc: 86.70%, dropped acc: 1.60% + +Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent' and false is not true . " +Original acc: 88.20%, attacked acc: 87.30%, dropped acc: 0.90% + +Original prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. 
+Attacked prompt: b"While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions and true is true and true is true and true is true and true is true and true is true . " +Original acc: 89.20%, attacked acc: 88.60%, dropped acc: 0.60% + +Original prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Attacked prompt: b"As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent' and true is true . " +Original acc: 88.50%, attacked acc: 88.20%, dropped acc: 0.30% + +Original prompt: As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions. +Attacked prompt: b"As an instrument for question comparison evaluation, consider the questions and determine if their meaning is the same, responding with 'equivalent' for similar questions or 'not_equivalent' for different questions and true is true . " +Original acc: 88.30%, attacked acc: 88.00%, dropped acc: 0.30% + +## textbugger + +Original prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Do these two verdicts convey the same connotation? Denotes with 'equivalent' or 'not_equivalent'. " +Original acc: 88.30%, attacked acc: 86.40%, dropped acc: 1.90% + +Original prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Determine if the given couple of declaration can be considered the same by responding with 'equivalent' or 'not_equivalent'. 
" +Original acc: 88.20%, attacked acc: 87.90%, dropped acc: 0.30% + +Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Plaese indicate if the later pair of verdicts share the same meaning by responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.40%, dropped acc: 0.80% + +Original prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Attacked prompt: b"While performing question comparison analyse, classify the analogy of the later questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. " +Original acc: 89.20%, attacked acc: 87.80%, dropped acc: 1.40% + +Original prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Attacked prompt: b"As a tool for deciding question equivalence, review the problems and categorize their metaphor as neither 'equivalent' or 'not_equivalent'. " +Original acc: 88.50%, attacked acc: 78.20%, dropped acc: 10.30% + +Original prompt: Functioning as a question similarity evaluation tool, analyze the given questions and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Functioning as a question metaphor evaluation to\xd0\xbel, anlyze the afforded questions and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.10%, dropped acc: 1.10% + +## textfooler + +Original prompt: Do these two sentences convey the same meaning? Indicate with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Doing these two condemning communicate the same feeling? Stating with 'equivalent' or 'not_equivalent'. 
" +Original acc: 88.30%, attacked acc: 77.60%, dropped acc: 10.70% + +Original prompt: Determine if the given pair of statements can be considered the same by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Determine if the given couple of statements can be viewed the same by responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.80%, dropped acc: 0.40% + +Original prompt: Please indicate if the following pair of sentences share the same meaning by responding with 'equivalent' or 'not_equivalent'. +Attacked prompt: b"Asking clarified if the ensuing pair of phrases portion the same meaning by answer with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.00%, dropped acc: 1.20% + +Original prompt: While performing question comparison analysis, classify the similarity of the following questions as 'equivalent' for equivalent questions or 'not_equivalent' for different questions. +Attacked prompt: b"While performing subjects compare examined, classify the analogy of the later questions as 'equivalent' for equivalent problems or 'not_equivalent' for different questions. " +Original acc: 89.20%, attacked acc: 87.30%, dropped acc: 1.90% + +Original prompt: As a tool for determining question equivalence, review the questions and categorize their similarity as either 'equivalent' or 'not_equivalent'. +Attacked prompt: b"As a utility for definition questions equivalence, revising the question and ranks their metaphor as either 'equivalent' or 'not_equivalent'. " +Original acc: 88.50%, attacked acc: 83.40%, dropped acc: 5.10% + +Original prompt: Functioning as a question similarity evaluation tool, analyze the given questions and decide if they share the same meaning, responding with 'equivalent' or 'not_equivalent'. 
+Attacked prompt: b"Functioning as a question resemblance evaluation mechanism, analyze the made questions and defining if they communicate the same meaning, responding with 'equivalent' or 'not_equivalent'. " +Original acc: 88.20%, attacked acc: 87.20%, dropped acc: 1.00% + +## translation + +Language: chinese, acc: 88.20%, prompt: As a tool for determining the equivalence of problems, the problems are examined and their similarity is classified as 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 89.00%, prompt: Please compare the similarity of two sentences and put them into the category of 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 88.80%, prompt: Two sentences are classified as 'equivalent' or 'not_equivalent' for their similarity. +Language: chinese, acc: 89.80%, prompt: You can determine how similar the questions are by comparing them and categorizing them as 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 88.90%, prompt: Using the method of contrast, the similarity of these problems is divided into two categories: 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 88.00%, prompt: By comparing these issues, you can classify them as 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 89.40%, prompt: To determine whether the questions are similar, put them into the category of 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 89.80%, prompt: Divide the similarity of these questions into 'equivalent' or 'not_equivalent' categories. +Language: chinese, acc: 88.80%, prompt: Using the similarity assessment tool, these questions were classified as 'equivalent' or 'not_equivalent'. +Language: chinese, acc: 89.30%, prompt: By analyzing the similarity of these problems, they are divided into categories of 'equivalent' or 'not_equivalent'. 
+Language: french, acc: 88.00%, prompt: As a tool to determine the equivalence of questions, review the questions and rank their similarity as 'equivalent' or 'not_equivalent'. +Language: french, acc: 89.00%, prompt: Please compare the similarity of two sentences and classify them as 'equivalent' or 'not_equivalent'. +Language: french, acc: 88.90%, prompt: Based on the similarity of two sentences, classify them as 'equivalent' or 'not_equivalent'. +Language: french, acc: 89.30%, prompt: You can determine the similarity between these questions by comparing them and classifying them as 'equivalent' or 'not_equivalent'. +Language: french, acc: 89.80%, prompt: Use a comparative method to divide the similarity of these questions into two categories: 'equivalent' or 'not_equivalent'. +Language: french, acc: 88.40%, prompt: By comparing these questions, you can classify them as 'equivalent' or 'not_equivalent'. +Language: french, acc: 88.40%, prompt: Determine whether these questions are similar or not, and then classify them as 'equivalent' or 'not_equivalent'. +Language: french, acc: 89.70%, prompt: Divide the similarity of these questions into two categories: 'equivalent' or 'not_equivalent'. +Language: french, acc: 88.80%, prompt: Use a similarity assessment tool to classify these questions as 'equivalent' or 'not_equivalent'. +Language: french, acc: 89.60%, prompt: By analyzing the similarity of these questions, you can divide them into two categories: 'equivalent' or 'not_equivalent'. +Language: arabic, acc: 89.30%, prompt: As a tool for determining an equation of questions, review the questions and classify their similarity as either 'equivalent' or 'not_equivalent'. +Language: arabic, acc: 89.10%, prompt: When using questions in the classification domain, please classify the similarity between the questions as 'equivalent' or 'not_equivalent'. 
+Language: arabic, acc: 89.10%, prompt: To determine an equation of questions, you must review the questions and classify their similarity as 'equivalent' or 'not_equivalent'. +Language: arabic, acc: 88.30%, prompt: Questions can be classified as 'equivalent' or 'not_equivalent' when used to identify classifications. +Language: arabic, acc: 89.00%, prompt: Classification of question similarity as 'equivalent' or 'not_equivalent' is used as a tool to determine the classification of questions. +Language: arabic, acc: 89.00%, prompt: Classify the similarity of the questions as 'equivalent' or 'not_equivalent' to determine the equation of the questions. +Language: arabic, acc: 89.80%, prompt: Identifying the similarity of questions and classifying them as 'equivalent' or 'not_equivalent' is an important tool in determining the classification of questions. +Language: arabic, acc: 88.90%, prompt: When classifying questions, their similarity can be classified as 'equivalent' or 'not_equivalent' to determine the correct classification. +Language: arabic, acc: 88.50%, prompt: The similarity of questions should be classified as 'equivalent' or 'not_equivalent' when used to determine the equation of questions. +Language: arabic, acc: 89.50%, prompt: Identifying the similarity of questions and classifying them as 'equivalent' or 'not_equivalent' helps to correctly classify questions. +Language: spanish, acc: 88.40%, prompt: As a tool to determine the equivalence of questions, it reviews the questions and classifies their similarity as 'equivalent' or 'not_equivalent'. +Language: spanish, acc: 89.30%, prompt: Evaluate the similarity between questions and classify them as 'equivalent' or 'not_equivalent' to determine their equivalence. +Language: spanish, acc: 88.60%, prompt: Determine whether two questions are 'equivalent' or 'not_equivalent' based on similarity and characteristics. 
+Language: spanish, acc: 89.10%, prompt: Classifies the similarity between questions as 'equivalent' or 'not_equivalent' to determine their equivalence. +Language: spanish, acc: 89.10%, prompt: Review the questions and rate them as 'equivalent' or 'not_equivalent' based on their similarity and content. +Language: spanish, acc: 88.70%, prompt: As part of the classification task of questions, it determines their equivalence by categorizing their similarity as 'equivalent' or 'not_equivalent'. +Language: spanish, acc: 89.50%, prompt: Analyze the similarity between questions and classify them as 'equivalent' or 'not_equivalent' to determine their equivalence. +Language: spanish, acc: 89.10%, prompt: As a method of identifying the equivalence of questions, it categorizes their similarity as 'equivalent' or 'not_equivalent'. +Language: spanish, acc: 88.70%, prompt: To determine the equivalence between questions, check their similarity and classify them as 'equivalent' or 'not_equivalent'. +Language: spanish, acc: 89.30%, prompt: Classify the similarity between questions as 'equivalent' or 'not_equivalent' to determine whether they are equivalent or not. +Language: japanese, acc: 88.80%, prompt: As a tool to determine the equivalence of the question, review the question and categorize its similarities into 'equivalent' or 'not_equivalent' categories. +Language: japanese, acc: 88.40%, prompt: Work on text sorting tasks labeled 'equivalent' or 'not_equivalent'. +Language: japanese, acc: 88.10%, prompt: For text classification tasks, use the labels 'equivalent' or 'not_equivalent' to determine the equivalence of statements. +Language: japanese, acc: 87.70%, prompt: In the MRPC dataset, use the labels 'equivalent' or 'not_equivalent' to classify the equivalence of statements. +Language: japanese, acc: 87.60%, prompt: As a tool for determining equivalence, check sentences and categorize them into 'equivalent' or 'not_equivalent' categories. 
+Language: japanese, acc: 87.60%, prompt: Use the labels 'equivalent' or 'not_equivalent' to determine the equivalence of statements in text classification tasks. +Language: japanese, acc: 88.10%, prompt: In the text classification task of the MRPC data set, classify the equivalence of statements with labels of 'equivalent' or 'not_equivalent'. +Language: japanese, acc: 88.20%, prompt: As a tool to determine the equivalence of statements, categorize statements into 'equivalent' or 'not_equivalent' categories. +Language: japanese, acc: 88.10%, prompt: In a text classification task, classify the equivalence of statements using labels of 'equivalent' or 'not_equivalent'. +Language: japanese, acc: 88.30%, prompt: Do a text classification task to determine the equivalence of statements, labeled 'equivalent' or 'not_equivalent'. +Language: korean, acc: 87.50%, prompt: Classify two given sentences as 'equivalent' or 'not_equivalent' by discriminating whether they have the same meaning. +Language: korean, acc: 88.90%, prompt: Determine sentence equivalence by judging the similarity of two sentences with 'equivalent' or 'not_equivalent'. +Language: korean, acc: 88.40%, prompt: Classify the similarity of sentences as 'equivalent' or 'not_equivalent' by judging whether two sentences have the same meaning. +Language: korean, acc: 88.60%, prompt: Determine if two given sentences are equivalent to each other, and classify their similarity as 'equivalent' or 'not_equivalent'. +Language: korean, acc: 88.70%, prompt: Compare two given sentences to determine sentence equivalence, and classify their similarities as 'equivalent' or 'not_equivalent'. +Language: korean, acc: 87.40%, prompt: Classify sentence equivalence as 'equivalent' or 'not_equivalent' by judging whether two sentences have the same meaning to each other. +Language: korean, acc: 88.80%, prompt: Determine if two sentences have the same meaning, and classify their similarities as 'equivalent' or 'not_equivalent'. 
+Language: korean, acc: 88.70%, prompt: Compare two given sentences to determine their equivalence, and classify their similarities as 'equivalent' or 'not_equivalent'. +Language: korean, acc: 88.80%, prompt: Review two sentences to evaluate sentence equivalence, and classify their similarities as 'equivalent' or 'not_equivalent'. +Language: korean, acc: 88.00%, prompt: Judge whether two sentences have the same meaning to each other, and determine the sentence equivalence with 'equivalent' or 'not_equivalent'. + +# rte + +## 10 prompts + +Acc: 94.22%, prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Acc: 94.22%, prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Acc: 94.22%, prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. +Acc: 93.50%, prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Acc: 93.50%, prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'. +Acc: 93.14%, prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. +Acc: 93.14%, prompt: Analyze the two provided sentences and decide if their relationship is 'entailment' or 'not_entailment'. +Acc: 93.14%, prompt: Review the two statements and categorize their relationship as either 'entailment' or 'not_entailment'. +Acc: 93.14%, prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'. +Acc: 93.14%, prompt: Considering the two sentences, identify if their relationship is 'entailment' or 'not_entailment'. 
+Acc: 92.78%, prompt: Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'. + +Acc: 94.58%, prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Acc: 93.86%, prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Acc: 93.86%, prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Acc: 93.50%, prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Acc: 93.50%, prompt: While performing entailment analysis, classify the relationship between the provided sentences as 'entailment' or 'not_entailment': +Acc: 93.50%, prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment': +Acc: 92.78%, prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Acc: 92.78%, prompt: In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment' or 'not_entailment': +Acc: 92.42%, prompt: As a tool for determining entailment relationships, review the two statements and categorize their connection as either 'entailment' or 'not_entailment': +Acc: 92.42%, prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. 
Answer with 'entailment' or 'not_entailment': + +## bertattack + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'include the following two sentences entailment or not_entailment? find me with "entailment" or "not_entailment", just one adjective. ' +Original acc: 94.22%, attacked acc: 91.70%, dropped acc: 2.53% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Attacked prompt: b"examine the parent of sentences and explain if they exhibit entailment or not_entailment. consider with neither 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 88.45%, dropped acc: 5.78% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"a whether the given intersection of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 92.78%, dropped acc: 1.44% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"starting as an entailment class,, identify whether the given individual of verb displays entailment or not_entailment. 
begin with 'entailment' or 'not_entailment':" +Original acc: 94.58%, attacked acc: 87.00%, dropped acc: 7.58% + +Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Attacked prompt: b"[UNK] your role as an entailment analytic program, assess the relationship between the given verb and classify it as 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 86.64%, dropped acc: 7.22% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Attacked prompt: b"Acting as an entailment on instrument, determine if the hit pair of verb appears entailment or not_entailment. answers with 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 83.03%, dropped acc: 10.83% + +## checklist + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word ofOQXLxiW4. ' +Original acc: 94.22%, attacked acc: 93.50%, dropped acc: 0.72% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Attacked prompt: b"Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment' CuVpbbkC4c." +Original acc: 94.22%, attacked acc: 93.50%, dropped acc: 0.72% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. 
+Attacked prompt: b"Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment' EAB4KP2NVY." +Original acc: 94.22%, attacked acc: 93.14%, dropped acc: 1.08% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment' d6ZQ3u0GBQ:" +Original acc: 94.58%, attacked acc: 94.22%, dropped acc: 0.36% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment' ofw9fEkN5R:" +Original acc: 93.86%, attacked acc: 93.50%, dropped acc: 0.36% + +Original prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment' 5XeflW1ZJc:" +Original acc: 93.86%, attacked acc: 93.50%, dropped acc: 0.36% + +## deepwordbug + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Ar the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one worAd. 
' +Original acc: 94.22%, attacked acc: 93.50%, dropped acc: 0.72% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Attacked prompt: b"Examine the pir of sentencs and determine if they exxibit entailment or not_entailment. Answer with eitfer 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 92.06%, dropped acc: 2.17% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"Identfy whether the given pair of sentences demonGtrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 93.50%, dropped acc: 0.72% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment clasisfier, identify whether the given pair of sentences displayW entailment or not_entailment. Respond with 'entailment' or 'not_entailment':" +Original acc: 94.58%, attacked acc: 93.14%, dropped acc: 1.44% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 93.86%, dropped acc: 0.72% + +Original prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. 
Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respodn with 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 92.78%, dropped acc: 1.08% + +## stresstest + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word and true is true and true is true and true is true and true is true and true is true . ' +Original acc: 94.22%, attacked acc: 91.70%, dropped acc: 2.53% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Attacked prompt: b"Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment' and true is true and true is true and true is true and true is true and true is true ." +Original acc: 94.22%, attacked acc: 92.78%, dropped acc: 1.44% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment' and false is not true ." +Original acc: 94.22%, attacked acc: 93.14%, dropped acc: 1.08% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. 
Respond with 'entailment' or 'not_entailment' and true is true and true is true and true is true and true is true and true is true :" +Original acc: 94.58%, attacked acc: 93.14%, dropped acc: 1.44% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Attacked prompt: b"Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment' and false is not true :" +Original acc: 93.86%, attacked acc: 93.50%, dropped acc: 0.36% + +Original prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. Respond with 'entailment' or 'not_entailment' and true is true and true is true and true is true and true is true and true is true :" +Original acc: 93.86%, attacked acc: 92.06%, dropped acc: 1.81% + +## textbugger + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the subsequent secondly sentencing entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. ' +Original acc: 94.22%, attacked acc: 89.89%, dropped acc: 4.33% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Attacked prompt: b"Examine the pa ir of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'." 
+Original acc: 94.22%, attacked acc: 93.14%, dropped acc: 1.08% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"Identify whether the given pair of sentences demon strates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 93.50%, dropped acc: 0.72% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment classfier, identify wether the given pair of punishments displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment':" +Original acc: 94.58%, attacked acc: 91.70%, dropped acc: 2.89% + +Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Attacked prompt: b"Throughout your role as an entailment analysis instrument, assess the relationship between the given punishments and classify it as 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 90.25%, dropped acc: 3.61% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Attacked prompt: b"Acting as an entailment detection instrument, deetrmine if the given pair of sentences demonstrates entailment or not_entailment. Answ\xd0\xb5r with 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 92.78%, dropped acc: 1.08% + +## textfooler + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. 
+Attacked prompt: b'Represented the aftermath two phrases entailment or not_entailment? Address me with "entailment" or "not_entailment", just one verb. ' +Original acc: 94.22%, attacked acc: 91.34%, dropped acc: 2.89% + +Original prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'. +Attacked prompt: b"Examine the coupled of sentences and determine if they explanatory entailment or not_entailment. Reply with either 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 92.06%, dropped acc: 2.17% + +Original prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'. +Attacked prompt: b"Identify whether the given pair of condemnation prove entailment or not_entailment. Answer with 'entailment' or 'not_entailment'." +Original acc: 94.22%, attacked acc: 92.42%, dropped acc: 1.81% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment classifier, defined whether the given coupled of sentences exhibit entailment or not_entailment. Reacting with 'entailment' or 'not_entailment':" +Original acc: 94.58%, attacked acc: 92.42%, dropped acc: 2.17% + +Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Attacked prompt: b"At your role as an entailment analysis tool, assess the affairs between the given condemnation and classify it as 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 90.61%, dropped acc: 3.25% + +Original prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. 
Answer with 'entailment' or 'not_entailment': +Attacked prompt: b"Acting as an entailment discover instrument, defined if the given pair of sentences discloses entailment or not_entailment. Reaction with 'entailment' or 'not_entailment':" +Original acc: 93.86%, attacked acc: 92.42%, dropped acc: 1.44% + +## translation + +Language: chinese, acc: 92.78%, prompt: In the light of an implication analysis tool, evaluate the relationship between the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 91.34%, prompt: From the perspective of an implication analysis tool, determine whether there is an implication relationship in the following sentences by classifying them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 91.70%, prompt: Please use an implication analysis tool to determine whether an implication relationship exists in the following sentences by classifying them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 92.78%, prompt: Please evaluate the relation of the following sentences as 'entailment' or 'not_entailment' for the purpose of determining implication relation. +Language: chinese, acc: 92.42%, prompt: Please use the implication analysis tool to evaluate the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 92.78%, prompt: For the purpose of determining implicative relations, analyze the relations of the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 92.42%, prompt: Please use the implication analysis tool to determine the relationship of the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 90.97%, prompt: Please use the implication judgment tool to assess the relevance of the following sentences and classify them as 'entailment' or 'not_entailment'. 
+Language: chinese, acc: 92.78%, prompt: Please, with implication analysis as the main task, determine the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'.
+Language: chinese, acc: 92.42%, prompt: Using the implication judgment as a criterion, analyze the relation of the following sentences and classify them as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.42%, prompt: As an engagement analysis tool, evaluate the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.06%, prompt: Determine whether the given sentences involve one another or not as an implication analysis tool. Classify them accordingly as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.42%, prompt: Using implication analysis, evaluate whether the sentences provided have a logical relationship and categorize them as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.06%, prompt: As an engagement assessment tool, determine whether the sentences provided have a logical relationship and classify them as 'entailment' or 'not_entailment'.
+Language: french, acc: 91.70%, prompt: As an implication classification tool, analyze the sentences provided to determine if there is a logical relationship and categorize them as 'entailment' or 'not_entailment'.
+Language: french, acc: 91.70%, prompt: Using implication analysis, determine whether the given sentences have a cause-effect relationship and categorize them as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.78%, prompt: Evaluate the relationship between the given sentences using implication analysis and rank them accordingly as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.06%, prompt: As an engagement detection tool, determine whether the given sentences have a logical relationship and categorize them as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.06%, prompt: Using implication analysis, evaluate whether the sentences provided have a cause-effect relationship and rank them accordingly as 'entailment' or 'not_entailment'.
+Language: french, acc: 92.06%, prompt: Determine whether the given sentences have a cause-effect relationship as an engagement analysis tool and categorize them as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 93.14%, prompt: In your role as a tool for reasoning analysis, evaluate the relationship between given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.78%, prompt: Can you determine whether this sentence is inferred from the other sentence? Classify it as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.42%, prompt: Using the tool of reasoning analysis, analyze the relationship between given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.06%, prompt: Does this sentence represent a conclusion from the previous sentence? Classify it as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.78%, prompt: As a tool of reasoning analysis, evaluate the relationship of given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.78%, prompt: Can this sentence be inferred from the previous sentence? Classify it as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.42%, prompt: Using a tool to analyze a conclusion, analyze the relationship between the two sentences and classify them as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.06%, prompt: Is this a conclusion from the next sentence? Classify it as 'entailment' or 'not_entailment'.
+Language: arabic, acc: 92.78%, prompt: As part of your task in analyzing a conclusion, evaluate the relationship between the two sentences and classify them as 'entailment' or 'not_entailment' based on their relationship.
+Language: arabic, acc: 92.78%, prompt: Are you following this sentence directly from the previous one? Classify it as 'entailment' or 'not_entailment'.
+Language: spanish, acc: 92.06%, prompt: In your role as an implication analysis tool, evaluate the relationship between the given phrases and classify them as 'entailment' or 'not_entailment'.
+Language: spanish, acc: 94.22%, prompt: Determine whether the second sentence necessarily implies the first and label the relation as 'entailment', or as 'not_entailment' if not.
+Language: spanish, acc: 93.50%, prompt: Classifies the relationship between these two sentences as 'entailment' if one necessarily implies the other, or as 'not_entailment' if not.
+Language: spanish, acc: 93.14%, prompt: Evaluates whether the information in the second sentence is implied in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication.
+Language: spanish, acc: 92.42%, prompt: Given a couple of phrases, label their relationship as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
+Language: spanish, acc: 91.34%, prompt: Analyzes the relationship between the phrases and classifies them as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
+Language: spanish, acc: 93.50%, prompt: Given two sentences, determine whether the second sentence is a necessary consequence of the first and label the relation as 'entailment', or as 'not_entailment' if not.
+Language: spanish, acc: 92.78%, prompt: Evaluates whether the information presented in the second sentence is implicit in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication.
+Language: spanish, acc: 92.42%, prompt: Classifies the relationship between the given phrases as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication.
+Language: spanish, acc: 92.78%, prompt: Determines whether the information provided in the second sentence is necessarily inferable from the first and labels the relationship as 'entailment', or as 'not_entailment' if not.
+Language: japanese, acc: 92.42%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 92.42%, prompt: Evaluate the semantic relationship of the sentence and classify it as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 93.50%, prompt: Please judge the relationship between the given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 89.89%, prompt: Examine the paraphrases of a given sentence and classify them 'entailment' or 'not_entailment'.
+Language: japanese, acc: 92.42%, prompt: Rate the similarity of a given sentence and categorize it as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 90.97%, prompt: Determinate the semantic connections of a given sentence and classify it as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 90.61%, prompt: Examine the semantic match of a given sentence and categorize it as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 92.06%, prompt: Classify it as 'entailment' or 'not_entailment' based on the content of the sentence.
+Language: japanese, acc: 92.42%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'.
+Language: japanese, acc: 90.97%, prompt: Judge the semantic connections of a given sentence and categorize it as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.50%, prompt: Evaluate the relationship between any two sentences given to you and classify you as 'entailment' or 'not_entailment'.
+Language: korean, acc: 92.06%, prompt: Analyze the semantic deductive relations between sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.14%, prompt: Evaluate the logical relevance between sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.50%, prompt: Evaluate the interaction of two given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.50%, prompt: Please check whether there is a semantic match between those two sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.50%, prompt: Compare information between given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.86%, prompt: Please analyse the correlation between those two sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 92.42%, prompt: Evaluate the different meanings between given sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 92.78%, prompt: Compare the semantic structure of the two sentences and classify them as 'entailment' or 'not_entailment'.
+Language: korean, acc: 93.50%, prompt: Evaluate the interactions between sentences and classify them as 'entailment' or 'not_entailment'.
+
+# sst2
+
+## 10 prompts
+
+Acc: 96.44%, prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Acc: 96.44%, prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Acc: 96.33%, prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Acc: 96.22%, prompt: Evaluate the sentiment of the given text and classify it as 'positive' or 'negative':
+Acc: 96.22%, prompt: Determine the overall sentiment of this sentence, categorizing it as 'positive' or 'negative':
+Acc: 95.87%, prompt: Please identify the emotional tone of this passage: 'positive' or 'negative'?
+Acc: 95.64%, prompt: Assess the mood of the following quote and determine if it's 'positive' or 'negative':
+Acc: 95.30%, prompt: Considering the given phrase, would you say it carries a 'positive' or 'negative' connotation?
+Acc: 95.07%, prompt: Given the context of this text, indicate if the emotion conveyed is 'positive' or 'negative':
+Acc: 91.86%, prompt: After examining the following expression, label its emotion as either 'positive' or 'negative':
+
+Acc: 96.56%, prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Acc: 96.33%, prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Acc: 96.33%, prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Acc: 96.22%, prompt: Serving as a sentiment evaluation model, determine if the given statement is 'positive' or 'negative'. Classify:
+Acc: 96.10%, prompt: Acting as a sentiment evaluator, identify if the given sentence is 'positive' or 'negative'. Classify:
+Acc: 95.87%, prompt: Emulating a sentiment classification system, indicate whether the provided text is 'positive' or 'negative'. Classify:
+Acc: 95.76%, prompt: Functioning as a sentiment identification tool, assess if the following expression is 'positive' or 'negative'. Classify:
+Acc: 95.64%, prompt: In the capacity of a sentiment classifier, decide whether the given quote is 'positive' or 'negative'. Classify:
+Acc: 95.30%, prompt: As an emotion detector, determine if the provided passage conveys a 'positive' or 'negative' sentiment. Classify:
+Acc: 94.04%, prompt: Taking on the role of an emotion classifier, specify if the provided phrase is 'positive' or 'negative'. Classify:
+
+## bertattack
+
+Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Attacked prompt: b"Analyze the topic of this phrase and responding with neither 'positive' or 'negative': "
+Original acc: 96.44%, attacked acc: 0.11%, dropped acc: 96.33%
+
+Original prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Attacked prompt: b"consult the provided excerpt and mistake between 'positive' and 'negative' to describe its sentiments: "
+Original acc: 96.44%, attacked acc: 95.87%, dropped acc: 0.57%
+
+Original prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Attacked prompt: b"Review this maxim and examine whether it has a 'positive' or 'negative' sentiments: "
+Original acc: 96.33%, attacked acc: 95.41%, dropped acc: 0.92%
+
+Original prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Attacked prompt: b"given the role of a sentiment categories tool, begin with 'positive' or 'negative' to find this statement: "
+Original acc: 96.56%, attacked acc: 95.41%, dropped acc: 1.15%
+
+Original prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Attacked prompt: b"like a sentiment classifier, identify neither the following text is 'positive' or 'negative'. from submit: "
+Original acc: 96.33%, attacked acc: 43.69%, dropped acc: 52.64%
+
+Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Attacked prompt: b"Working as a similarity analyzer, please print if the following text is 'positive' or 'negative'. nationality: "
+Original acc: 96.33%, attacked acc: 92.78%, dropped acc: 3.56%
+
+## checklist
+
+Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Attacked prompt: b"Analyze the tone of this statement and respond with either 'positive' or 'negative' Cev0s49fbe: "
+Original acc: 96.56%, attacked acc: 95.76%, dropped acc: 0.80%
+
+Original prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Attacked prompt: b"Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment Cev0s49fbe: "
+Original acc: 96.44%, attacked acc: 96.10%, dropped acc: 0.34%
+
+Original prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Attacked prompt: b"Review this statement and decide whether it has a 'positive' or 'negative' sentiment EAB4KP2NVY: "
+Original acc: 96.33%, attacked acc: 96.10%, dropped acc: 0.23%
+
+Original prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Attacked prompt: b"In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement SvAp8RlOFn: "
+Original acc: 96.56%, attacked acc: 95.99%, dropped acc: 0.57%
+
+Original prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Attacked prompt: b"As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify 5XeflW1ZJc: "
+Original acc: 96.33%, attacked acc: 95.87%, dropped acc: 0.46%
+
+Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Attacked prompt: b"Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify 9vGXgnbWB8: "
+Original acc: 96.33%, attacked acc: 95.76%, dropped acc: 0.57%
+
+## deepwordbug
+
+Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Attacked prompt: b"AnalyVe the tone of this statelent and resBond with etiher 'positive' or 'negative': "
+Original acc: 96.56%, attacked acc: 95.64%, dropped acc: 0.92%
+
+Original prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Attacked prompt: b"Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment: "
+Original acc: 96.44%, attacked acc: 96.44%, dropped acc: 0.11%
+
+Original prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Attacked prompt: b"Review this staftement and decide whether it has a 'positive' or 'negative' sentiment: "
+Original acc: 96.33%, attacked acc: 96.10%, dropped acc: 0.23%
+
+Original prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Attacked prompt: b"In the rol of a sentiment analysics tool, respoLnd with 'positive' or 'negative' to classify this statemen: "
+Original acc: 96.56%, attacked acc: 95.87%, dropped acc: 0.69%
+
+Original prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Attacked prompt: b"s a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please claSsify: "
+Original acc: 96.33%, attacked acc: 95.76%, dropped acc: 0.57%
+
+Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Attacked prompt: b"Working as a setniment analyzer, plTase indicate if the follAwing texxt is 'positive' or 'negative'. Classify: "
+Original acc: 96.33%, attacked acc: 95.87%, dropped acc: 0.46%
+
+## stresstest
+
+Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Attacked prompt: b"Analyze the tone of this statement and respond with either 'positive' or 'negative' and false is not true : "
+Original acc: 96.56%, attacked acc: 92.89%, dropped acc: 3.67%
+
+Original prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Attacked prompt: b"Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment and false is not true : "
+Original acc: 96.44%, attacked acc: 94.04%, dropped acc: 2.41%
+
+Original prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Attacked prompt: b"Review this statement and decide whether it has a 'positive' or 'negative' sentiment and false is not true : "
+Original acc: 96.33%, attacked acc: 90.94%, dropped acc: 5.39%
+
+Original prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Attacked prompt: b"In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement and false is not true : "
+Original acc: 96.56%, attacked acc: 73.85%, dropped acc: 22.71%
+
+Original prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Attacked prompt: b"As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify and false is not true : "
+Original acc: 96.33%, attacked acc: 94.38%, dropped acc: 1.95%
+
+Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Attacked prompt: b"Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify and false is not true : "
+Original acc: 96.33%, attacked acc: 91.17%, dropped acc: 5.16%
+
+## textbugger
+
+Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Attacked prompt: b"Analysis the tone of this speeches and respond with or 'positive' or 'negative': "
+Original acc: 96.44%, attacked acc: 95.76%, dropped acc: 0.69%
+
+Original prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Attacked prompt: b"Reading the yielded excerpt and \xcf\xb2hoose between 'positive' and 'negative' to describe its sentiment: "
+Original acc: 96.44%, attacked acc: 95.87%, dropped acc: 0.57%
+
+Original prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Attacked prompt: b"Review this speeches and dec\xd1\x96de wether it has a 'positive' or 'negative' sentiment: "
+Original acc: 96.33%, attacked acc: 95.87%, dropped acc: 0.46%
+
+Original prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Attacked prompt: b"At the role of a sentiment analysis to\xd0\xbel, res pond with 'positive' or 'negative' to classify this statement: "
+Original acc: 96.56%, attacked acc: 95.87%, dropped acc: 0.69%
+
+Original prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Attacked prompt: b"A\xd1\x95 a emotion classifier, determine whether the follow ing text is 'positive' or 'negative'. Inviting classify: "
+Original acc: 96.33%, attacked acc: 93.92%, dropped acc: 2.41%
+
+Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Attacked prompt: b"Working as a emotion an\xc9\x91lyzer, please indicate if the f\xd0\xbellowing text is 'positive' or 'negative'. Classify: "
+Original acc: 96.33%, attacked acc: 95.53%, dropped acc: 0.80%
+
+## textfooler
+
+Original prompt: Analyze the tone of this statement and respond with either 'positive' or 'negative':
+Attacked prompt: b"Analyzed the ringtone of this speech and respond with either 'positive' or 'negative': "
+Original acc: 96.44%, attacked acc: 94.95%, dropped acc: 1.49%
+
+Original prompt: Read the provided excerpt and choose between 'positive' and 'negative' to describe its sentiment:
+Attacked prompt: b"Reading the provided excerpt and opting between 'positive' and 'negative' to defines its emotion: "
+Original acc: 96.44%, attacked acc: 95.07%, dropped acc: 1.38%
+
+Original prompt: Review this statement and decide whether it has a 'positive' or 'negative' sentiment:
+Attacked prompt: b"Considering this speech and decide whether it has a 'positive' or 'negative' foreboding: "
+Original acc: 96.33%, attacked acc: 94.72%, dropped acc: 1.61%
+
+Original prompt: In the role of a sentiment analysis tool, respond with 'positive' or 'negative' to classify this statement:
+Attacked prompt: b"In the functionality of a sentiment analysis appliance, retaliated with 'positive' or 'negative' to classifications this statement: "
+Original acc: 96.56%, attacked acc: 95.53%, dropped acc: 1.03%
+
+Original prompt: As a sentiment classifier, determine whether the following text is 'positive' or 'negative'. Please classify:
+Attacked prompt: b"As a emotion classifier, detects whether the suite texts is 'positive' or 'negative'. Prayers classify: "
+Original acc: 96.33%, attacked acc: 92.89%, dropped acc: 3.44%
+
+Original prompt: Working as a sentiment analyzer, please indicate if the following text is 'positive' or 'negative'. Classify:
+Attacked prompt: b"Collaborated as a emotion analyzer, please depicting if the later text is 'positive' or 'negative'. Classify: "
+Original acc: 96.33%, attacked acc: 94.95%, dropped acc: 1.38%
+
+## translation
+
+Language: chinese, acc: 96.44%, prompt: Answer whether the statement is 'positive' or 'negative' based on sentiment analysis.
+Language: chinese, acc: 93.81%, prompt: As an emotion analysis tool, determine whether the emotion in the text is 'positive' or 'negative'.
+Language: chinese, acc: 95.87%, prompt: Categorize the statement as 'positive' or 'negative', based on its emotional bent.
+Language: chinese, acc: 96.33%, prompt: Please use sentiment analysis to classify the text as 'positive' or 'negative'.
+Language: chinese, acc: 94.04%, prompt: Please determine whether the emotion of the sentence is 'positive' or 'negative' and categorize it.
+Language: chinese, acc: 96.22%, prompt: Using sentiment analysis, classify the text as 'positive' or 'negative'.
+Language: chinese, acc: 94.27%, prompt: Please answer whether the emotion of the sentence is 'positive' or 'negative' and categorize it.
+Language: chinese, acc: 96.56%, prompt: Categorize the statement as 'positive' or 'negative' based on sentiment analysis.
+Language: chinese, acc: 96.67%, prompt: Please judge this text as 'positive' or 'negative' according to the criteria of sentiment analysis.
+Language: chinese, acc: 95.07%, prompt: Please classify this sentence as 'positive' or 'negative' according to the principles of emotion analysis.
+Language: french, acc: 96.44%, prompt: As a sentiment analysis tool, please answer with 'positive' or 'negative' to classify this statement.
+Language: french, acc: 95.87%, prompt: Determine whether this phrase is 'positive' or 'negative' as a sentiment classification tool.
+Language: french, acc: 96.56%, prompt: Identify the tone of this statement by choosing between 'positive' and 'negative' as a sentiment analysis tool.
+Language: french, acc: 96.33%, prompt: Use sentiment analysis to classify this statement as 'positive' or 'negative'.
+Language: french, acc: 96.56%, prompt: As a sentiment classification tool, please determine whether this statement is 'positive' or 'negative'.
+Language: french, acc: 96.44%, prompt: Classify this sentence as 'positive' or 'negative' using sentiment analysis.
+Language: french, acc: 96.56%, prompt: Choose between 'positive' or 'negative' to classify this statement as a sentiment analysis tool.
+Language: french, acc: 96.22%, prompt: Identify the sentiment expressed in this statement by selecting 'positive' or 'negative' as a sentiment classification tool.
+Language: french, acc: 95.87%, prompt: Determine whether this phrase is 'positive' or 'negative' using sentiment analysis as a classification tool.
+Language: french, acc: 96.33%, prompt: Use sentiment analysis to classify this statement as 'positive' or 'negative'.
+Language: arabic, acc: 96.10%, prompt: Under emotional analysis, answer 'positive' or 'negative' to classify this statement.
+Language: arabic, acc: 96.10%, prompt: Does this statement express a 'positive' or 'negative' reaction?
+Language: arabic, acc: 96.10%, prompt: Is that a 'positive' or a 'negative' phrase?
+Language: arabic, acc: 96.44%, prompt: What is the classification between 'positive' and 'negative'?
+Language: arabic, acc: 96.22%, prompt: Does this sentence express 'positive' or 'negative' feelings?
+Language: arabic, acc: 95.99%, prompt: In the context of textual analysis, what classification is this phrase between 'positive' and 'negative'?
+Language: arabic, acc: 96.44%, prompt: Could this be classified as 'positive' or 'negative'?
+Language: arabic, acc: 95.76%, prompt: In the context of emotional analysis, what classification is this statement between 'positive' and 'negative'?
+Language: arabic, acc: 96.44%, prompt: Can this be classified as 'positive' or 'negative'?
+Language: arabic, acc: 94.27%, prompt: Under the classification of emotions, is this sentence 'positive' or 'negative'?
+Language: spanish, acc: 96.22%, prompt: As a feeling analysis tool, classify this statement as 'positive' or 'negative'.
+Language: spanish, acc: 95.99%, prompt: Determine whether this statement has a 'positive' or 'negative' connotation.
+Language: spanish, acc: 96.56%, prompt: Indicate whether the following statement is 'positive' or 'negative'.
+Language: spanish, acc: 95.87%, prompt: Evaluate whether this text has a 'positive' or 'negative' emotional charge.
+Language: spanish, acc: 96.33%, prompt: According to your sentiment analysis, would you say this comment is 'positive' or 'negative'?
+Language: spanish, acc: 96.22%, prompt: In the context of sentiment analysis, label this sentence as 'positive' or 'negative'.
+Language: spanish, acc: 96.67%, prompt: Rate the following statement as 'positive' or 'negative', according to your sentiment analysis.
+Language: spanish, acc: 96.22%, prompt: How would you classify this text in terms of its emotional tone? 'positive' or 'negative'?
+Language: spanish, acc: 96.33%, prompt: As a tool for sentiment analysis, would you say this statement is 'positive' or 'negative'?
+Language: spanish, acc: 96.79%, prompt: Classify this statement as 'positive' or 'negative', please.
+Language: japanese, acc: 94.84%, prompt: Treat this sentence as an emotion analysis tool and categorize it as 'positive' and 'negative'.
+Language: japanese, acc: 96.22%, prompt: Use this article as a sentiment analysis tool to classify 'positive' and 'negative'.
+Language: japanese, acc: 95.07%, prompt: Use this sentence as an emotion analysis tool to determine whether it is 'positive' or 'negative'.
+Language: japanese, acc: 94.61%, prompt: Use this sentence as an emotion analysis tool to classify 'positive' and 'negative'.
+Language: japanese, acc: 96.22%, prompt: Use this sentence as a sentiment analysis tool and classify it as 'positive' or 'negative'.
+Language: japanese, acc: 96.79%, prompt: To classify this sentence as 'positive' or 'negative', evaluate it as a sentiment analysis tool.
+Language: japanese, acc: 94.95%, prompt: Treat this sentence as an emotion analysis tool to determine whether it is 'positive' or 'negative'.
+Language: japanese, acc: 95.87%, prompt: Use this sentence as a sentiment analysis tool to classify 'positive' and 'negative'.
+Language: japanese, acc: 95.18%, prompt: Analyze this sentence as an emotion analysis tool to classify whether it is 'positive' or 'negative'.
+Language: japanese, acc: 95.41%, prompt: Use this sentence as an emotional analysis tool to determine whether it is 'positive' or 'negative'.
+Language: korean, acc: 95.87%, prompt: As an emotional analysis tool, respond with 'positive' or 'negative' to classify these sentences.
+Language: korean, acc: 96.90%, prompt: Classify this sentence as 'positive' if you regard it as positive, 'negative' if you regard it as negative.
+Language: korean, acc: 94.95%, prompt: Please rate the emotion of this sentence and classify it as 'positive' or 'negative'.
+Language: korean, acc: 96.56%, prompt: Classify this sentence as 'positive' if you perceive it positively and 'negative' if you perceive it negatively.
+Language: korean, acc: 95.87%, prompt: If this is a sentence delivered using a positive expression, classify it as 'positive' and if this is a sentence delivered using a negative expression, classify it as 'negative'.
+Language: korean, acc: 96.79%, prompt: Respond with 'positive' or 'negative' by categorizing whether the sentence is positive or negative.
+Language: korean, acc: 93.46%, prompt: Please analyze the emotion in this sentence and classify it as 'positive' or 'negative'.
+Language: korean, acc: 96.22%, prompt: Classify this sentence as 'positive' if it contains a positive meaning, 'negative' if it contains a negative meaning.
+Language: korean, acc: 96.56%, prompt: Classify this sentence as 'positive' if it contains positive content, 'negative' if it contains negative content.
+Language: korean, acc: 96.33%, prompt: Classify what you're trying to convey in this sentence as 'positive' if it's positive, and 'negative' if it's negative.
+
+# wnli
+
+## 10 prompts
+
+Acc: 80.28%, prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'.
+Acc: 78.87%, prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word.
+Acc: 78.87%, prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'.
+Acc: 78.87%, prompt: Examine the pair of sentences and determine if they exhibit entailment or not_entailment. Answer with either 'entailment' or 'not_entailment'.
+Acc: 78.87%, prompt: Identify whether the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment'.
+Acc: 78.87%, prompt: Please classify the relationship between the provided sentences as 'entailment' or 'not_entailment'.
+Acc: 78.87%, prompt: Indicate if the connection between the following sentences is 'entailment' or 'not_entailment'.
+Acc: 78.87%, prompt: Determine if the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment'.
+Acc: 77.46%, prompt: Analyze the two provided sentences and decide if their relationship is 'entailment' or 'not_entailment'. +Acc: 77.46%, prompt: Review the two statements and categorize their relationship as either 'entailment' or 'not_entailment'. +Acc: 77.46%, prompt: Considering the two sentences, identify if their relationship is 'entailment' or 'not_entailment'. + +Acc: 80.28%, prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Acc: 80.28%, prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Acc: 78.87%, prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Acc: 78.87%, prompt: Acting as an entailment detection instrument, determine if the given pair of sentences demonstrates entailment or not_entailment. Answer with 'entailment' or 'not_entailment': +Acc: 78.87%, prompt: As a tool for determining entailment relationships, review the two statements and categorize their connection as either 'entailment' or 'not_entailment': +Acc: 78.87%, prompt: While performing entailment analysis, classify the relationship between the provided sentences as 'entailment' or 'not_entailment': +Acc: 78.87%, prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Acc: 78.87%, prompt: As an instrument for entailment evaluation, consider the two sentences and determine if their relationship is 'entailment' or 'not_entailment'. 
Respond with 'entailment' or 'not_entailment': +Acc: 77.46%, prompt: In the capacity of an entailment assessment system, indicate if the link between the following sentences is 'entailment' or 'not_entailment': +Acc: 77.46%, prompt: In the role of a semantic relationship analyzer, examine the connection between the given sentences and decide if they exhibit entailment or not_entailment. Answer with 'entailment' or 'not_entailment': + +## bertattack + +Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. +Attacked prompt: b"a the connection between the verb sentences and classify it as 'entailment' or 'not_entailment'." +Original acc: 80.28%, attacked acc: 73.24%, dropped acc: 7.04% + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the main two sentences entailment or not_entailment? begin me with "entailment" or "not_entailment", just the word. ' +Original acc: 78.87%, attacked acc: 71.83%, dropped acc: 7.04% + +Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Attacked prompt: b"Does the subject between the different verb representing entailment or not_entailment? Respond with 'entailment' or 'not_entailment'." 
+Original acc: 78.87%, attacked acc: 71.83%, dropped acc: 7.04% + +Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Attacked prompt: b"she an entailment identification system, examine the ci between the following sentences and respond with 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 74.65%, dropped acc: 5.63% + +Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Attacked prompt: b"Functioning as an entailment inspection tool, analyze the provided verbs and decide if their relationship is 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 73.24%, dropped acc: 7.04% + +Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Attacked prompt: b"In your role as an entailment analysis tool, report the commitment between the and are and classify it as 'entailment' or 'not_entailment':" +Original acc: 78.87%, attacked acc: 70.42%, dropped acc: 8.45% + +## checklist + +Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. +Attacked prompt: b"Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment' EAB4KP2NVY." +Original acc: 80.28%, attacked acc: 74.65%, dropped acc: 5.63% + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word 5XeflW1ZJc. 
' +Original acc: 78.87%, attacked acc: 77.46%, dropped acc: 1.41% + +Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment' ItGDrrA1Sc." +Original acc: 78.87%, attacked acc: 77.46%, dropped acc: 1.41% + +Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment' SFPCqpiClT:" +Original acc: 80.28%, attacked acc: 76.06%, dropped acc: 4.23% + +Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment' 5yWbBXztUY:" +Original acc: 80.28%, attacked acc: 74.65%, dropped acc: 5.63% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment' ItGDrrA1Sc:" +Original acc: 80.28%, attacked acc: 77.46%, dropped acc: 2.82% + +## deepwordbug + +Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. +Attacked prompt: b"Nssess the connection between the fnollowing sentences and classify it as 'entailment' or 'not_entailment'." 
+Original acc: 80.28%, attacked acc: 76.06%, dropped acc: 4.23% + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the following wto sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. ' +Original acc: 78.87%, attacked acc: 76.06%, dropped acc: 2.82% + +Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'." +Original acc: 78.87%, attacked acc: 78.87%, dropped acc: 0.00% + +Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Attacked prompt: b"A an entailment identification system, examine the connectiGon between the following sentences and respond with 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 77.46%, dropped acc: 2.82% + +Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Attacked prompt: b"Functioning as an entailment ealuation tCol, aSalyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 76.06%, dropped acc: 4.23% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Woorking as an entailment classifier, identify whether the given pCair of sentences displays entailment or not_entailment. 
Respond with 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 77.46%, dropped acc: 2.82% + +## stresstest + +Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. +Attacked prompt: b"Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment' and false is not true ." +Original acc: 80.28%, attacked acc: 76.06%, dropped acc: 4.23% + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word and false is not true . ' +Original acc: 78.87%, attacked acc: 77.46%, dropped acc: 1.41% + +Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment' and false is not true ." 
+Original acc: 78.87%, attacked acc: 78.87%, dropped acc: 0.00% + +Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment' and false is not true :" +Original acc: 80.28%, attacked acc: 76.06%, dropped acc: 4.23% + +Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Attacked prompt: b"Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment' and false is not true :" +Original acc: 80.28%, attacked acc: 74.65%, dropped acc: 5.63% + +Original prompt: Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment': +Attacked prompt: b"Working as an entailment classifier, identify whether the given pair of sentences displays entailment or not_entailment. Respond with 'entailment' or 'not_entailment' and false is not true :" +Original acc: 80.28%, attacked acc: 77.46%, dropped acc: 2.82% + +## textbugger + +Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. +Attacked prompt: b"Assess the connecti\xd0\xben between the follow ing sentences and classify it as 'entailment' or 'not_entailment'." +Original acc: 80.28%, attacked acc: 76.06%, dropped acc: 4.23% + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the flolowing two sentences entailment or not_entailment? 
Answer me with "entailment" or "not_entailment", just one adjective. ' +Original acc: 78.87%, attacked acc: 74.65%, dropped acc: 4.23% + +Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Attacked prompt: b"Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'." +Original acc: 78.87%, attacked acc: 78.87%, dropped acc: 0.00% + +Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an entailment idenification system, examine the connecti\xd0\xben between the following sentences and respond with 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 77.46%, dropped acc: 2.82% + +Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Attacked prompt: b"Functio\xd5\xb8ing as an entailment evaluation tool, analyze the provided sentences and dec\xd1\x96de if their rapport is 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 74.65%, dropped acc: 5.63% + +Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Attacked prompt: b"In your role as an entailment analysis tool, assess the rela tionship between the given sentences and classify it as 'entailment' or 'not_entailment':" +Original acc: 78.87%, attacked acc: 76.06%, dropped acc: 2.82% + +## textfooler + +Original prompt: Assess the connection between the following sentences and classify it as 'entailment' or 'not_entailment'. 
+Attacked prompt: b"Assess the connections between the afterward punishment and classifies it as 'entailment' or 'not_entailment'." +Original acc: 80.28%, attacked acc: 67.61%, dropped acc: 12.68% + +Original prompt: Are the following two sentences entailment or not_entailment? Answer me with "entailment" or "not_entailment", just one word. +Attacked prompt: b'Are the aftermath two sentences entailment or not_entailment? Cope me with "entailment" or "not_entailment", just one adjective. ' +Original acc: 78.87%, attacked acc: 71.83%, dropped acc: 7.04% + +Original prompt: Does the relationship between the given sentences represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'. +Attacked prompt: b"Does the relationship between the given phrases represent entailment or not_entailment? Respond with 'entailment' or 'not_entailment'." +Original acc: 78.87%, attacked acc: 77.46%, dropped acc: 1.41% + +Original prompt: As an entailment identification system, examine the connection between the following sentences and respond with 'entailment' or 'not_entailment': +Attacked prompt: b"As an entailment id system, examine the connection between the following sentences and retaliated with 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 77.46%, dropped acc: 2.82% + +Original prompt: Functioning as an entailment evaluation tool, analyze the provided sentences and decide if their relationship is 'entailment' or 'not_entailment': +Attacked prompt: b"Functioning as an entailment assessments tool, analyze the provided sentences and decide if their affairs is 'entailment' or 'not_entailment':" +Original acc: 80.28%, attacked acc: 74.65%, dropped acc: 5.63% + +Original prompt: In your role as an entailment analysis tool, assess the relationship between the given sentences and classify it as 'entailment' or 'not_entailment': +Attacked prompt: b"In your role as an entailment analysis tool, assess the coitus between the given sentencing 
and classify it as 'entailment' or 'not_entailment':" +Original acc: 78.87%, attacked acc: 71.83%, dropped acc: 7.04% + +## translation + +Language: chinese, acc: 78.87%, prompt: In the light of an implication analysis tool, evaluate the relationship between the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 76.06%, prompt: From the perspective of an implication analysis tool, determine whether there is an implication relationship in the following sentences by classifying them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 77.46%, prompt: Please use an implication analysis tool to determine whether an implication relationship exists in the following sentences by classifying them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 77.46%, prompt: Please evaluate the relation of the following sentences as 'entailment' or 'not_entailment' for the purpose of determining implication relation. +Language: chinese, acc: 77.46%, prompt: Please use the implication analysis tool to evaluate the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 74.65%, prompt: For the purpose of determining implicative relations, analyze the relations of the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 77.46%, prompt: Please use the implication analysis tool to determine the relationship of the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 77.46%, prompt: Please use the implication judgment tool to assess the relevance of the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: chinese, acc: 77.46%, prompt: Please, with implication analysis as the main task, determine the relationships between the following sentences and classify them as 'entailment' or 'not_entailment'. 
+Language: chinese, acc: 77.46%, prompt: Using the implication judgment as a criterion, analyze the relation of the following sentences and classify them as 'entailment' or 'not_entailment'. +Language: french, acc: 78.87%, prompt: As an engagement analysis tool, evaluate the relationship between the given sentences and classify it as 'entailment' or 'not_entailment'. +Language: french, acc: 76.06%, prompt: Determine whether the given sentences involve one another or not as an implication analysis tool. Classify them accordingly as 'entailment' or 'not_entailment'. +Language: french, acc: 77.46%, prompt: Using implication analysis, evaluate whether the sentences provided have a logical relationship and categorize them as 'entailment' or 'not_entailment'. +Language: french, acc: 77.46%, prompt: As an engagement assessment tool, determine whether the sentences provided have a logical relationship and classify them as 'entailment' or 'not_entailment'. +Language: french, acc: 77.46%, prompt: As an implication classification tool, analyze the sentences provided to determine if there is a logical relationship and categorize them as 'entailment' or 'not_entailment'. +Language: french, acc: 76.06%, prompt: Using implication analysis, determine whether the given sentences have a cause-effect relationship and categorize them as 'entailment' or 'not_entailment'. +Language: french, acc: 77.46%, prompt: Evaluate the relationship between the given sentences using implication analysis and rank them accordingly as 'entailment' or 'not_entailment'. +Language: french, acc: 77.46%, prompt: As an engagement detection tool, determine whether the given sentences have a logical relationship and categorize them as 'entailment' or 'not_entailment'. +Language: french, acc: 80.28%, prompt: Using implication analysis, evaluate whether the sentences provided have a cause-effect relationship and rank them accordingly as 'entailment' or 'not_entailment'. 
+Language: french, acc: 76.06%, prompt: Determine whether the given sentences have a cause-effect relationship as an engagement analysis tool and categorize them as 'entailment' or 'not_entailment'. +Language: arabic, acc: 78.87%, prompt: In your role as a tool for reasoning analysis, evaluate the relationship between given sentences and classify them as 'entailment' or 'not_entailment'. +Language: arabic, acc: 78.87%, prompt: Can you determine whether this sentence is inferred from the other sentence? Classify it as 'entailment' or 'not_entailment'. +Language: arabic, acc: 78.87%, prompt: Using the tool of reasoning analysis, analyze the relationship between given sentences and classify them as 'entailment' or 'not_entailment'. +Language: arabic, acc: 77.46%, prompt: Does this sentence represent a conclusion from the previous sentence? Classify it as 'entailment' or 'not_entailment'. +Language: arabic, acc: 77.46%, prompt: As a tool of reasoning analysis, evaluate the relationship of given sentences and classify them as 'entailment' or 'not_entailment'. +Language: arabic, acc: 78.87%, prompt: Can this sentence be inferred from the previous sentence? Classify it as 'entailment' or 'not_entailment'. +Language: arabic, acc: 77.46%, prompt: Using a tool to analyze a conclusion, analyze the relationship between the two sentences and classify them as 'entailment' or 'not_entailment'. +Language: arabic, acc: 77.46%, prompt: Is this a conclusion from the next sentence? Classify it as 'entailment' or 'not_entailment'. +Language: arabic, acc: 78.87%, prompt: As part of your task in analyzing a conclusion, evaluate the relationship between the two sentences and classify them as 'entailment' or 'not_entailment' based on their relationship. +Language: arabic, acc: 77.46%, prompt: Are you following this sentence directly from the previous one? Classify it as 'entailment' or 'not_entailment'. 
+Language: spanish, acc: 77.46%, prompt: In your role as an implication analysis tool, evaluate the relationship between the given phrases and classify them as 'entailment' or 'not_entailment'. +Language: spanish, acc: 77.46%, prompt: Determine whether the second sentence necessarily implies the first and label the relation as 'entailment', or as 'not_entailment' if not. +Language: spanish, acc: 77.46%, prompt: Classifies the relationship between these two sentences as 'entailment' if one necessarily implies the other, or as 'not_entailment' if not. +Language: spanish, acc: 76.06%, prompt: Evaluates whether the information in the second sentence is implied in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication. +Language: spanish, acc: 73.24%, prompt: Given a couple of phrases, label their relationship as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication. +Language: spanish, acc: 74.65%, prompt: Analyzes the relationship between the phrases and classifies them as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication. +Language: spanish, acc: 76.06%, prompt: Given two sentences, determine whether the second sentence is a necessary consequence of the first and label the relation as 'entailment', or as 'not_entailment' if not. +Language: spanish, acc: 74.65%, prompt: Evaluates whether the information presented in the second sentence is implicit in the first and labels the relationship as 'entailment', or as 'not_entailment' if there is no such implication. +Language: spanish, acc: 74.65%, prompt: Classifies the relationship between the given phrases as 'entailment' if one necessarily implies the other, or as 'not_entailment' if there is no such implication. 
+Language: spanish, acc: 77.46%, prompt: Determines whether the information provided in the second sentence is necessarily inferable from the first and labels the relationship as 'entailment', or as 'not_entailment' if not. +Language: japanese, acc: 77.46%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'. +Language: japanese, acc: 77.46%, prompt: Evaluate the semantic relationship of the sentence and classify it as 'entailment' or 'not_entailment'. +Language: japanese, acc: 78.87%, prompt: Please judge the relationship between the given sentences and classify them as 'entailment' or 'not_entailment'. +Language: japanese, acc: 78.87%, prompt: Examine the paraphrases of a given sentence and classify them 'entailment' or 'not_entailment'. +Language: japanese, acc: 77.46%, prompt: Rate the similarity of a given sentence and categorize it as 'entailment' or 'not_entailment'. +Language: japanese, acc: 76.06%, prompt: Determinate the semantic connections of a given sentence and classify it as 'entailment' or 'not_entailment'. +Language: japanese, acc: 74.65%, prompt: Examine the semantic match of a given sentence and categorize it as 'entailment' or 'not_entailment'. +Language: japanese, acc: 78.87%, prompt: Classify it as 'entailment' or 'not_entailment' based on the content of the sentence. +Language: japanese, acc: 77.46%, prompt: Analyze the relationship of a given sentence and classify it as 'entailment' or 'not_entailment'. +Language: japanese, acc: 74.65%, prompt: Judge the semantic connections of a given sentence and categorize it as 'entailment' or 'not_entailment'. +Language: korean, acc: 76.06%, prompt: Evaluate the relationship between any two sentences given to you and classify you as 'entailment' or 'not_entailment'. +Language: korean, acc: 76.06%, prompt: Analyze the semantic deductive relations between sentences and classify them as 'entailment' or 'not_entailment'. 
+Language: korean, acc: 77.46%, prompt: Evaluate the logical relevance between sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 76.06%, prompt: Evaluate the interaction of two given sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 77.46%, prompt: Please check whether there is a semantic match between those two sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 78.87%, prompt: Compare information between given sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 77.46%, prompt: Please analyse the correlation between those two sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 78.87%, prompt: Evaluate the different meanings between given sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 77.46%, prompt: Compare the semantic structure of the two sentences and classify them as 'entailment' or 'not_entailment'. +Language: korean, acc: 77.46%, prompt: Evaluate the interactions between sentences and classify them as 'entailment' or 'not_entailment'. + +# mmlu + +## 10 prompts + +Acc: 52.84%, prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Acc: 52.84%, prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Acc: 52.84%, prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. +Acc: 52.48%, prompt: Below are multiple-choice question concerning {}. Indicate your response with 'A', 'B', 'C', or 'D'. +Acc: 52.48%, prompt: Examine the following question based on {} and choose the correct response from 'A', 'B', 'C', or 'D'. 
+Acc: 52.13%, prompt: Answer the following multiple-choice question about {} by selecting the correct option: 'A', 'B', 'C', or 'D'. +Acc: 52.13%, prompt: In relation to the multiple-choice question on {}, please provide the accurate answer by choosing 'A', 'B', 'C', or 'D'. +Acc: 52.13%, prompt: Answer the subsequent multiple-choice question about {} by picking the right option among 'A', 'B', 'C', or 'D'. +Acc: 51.95%, prompt: Regarding the following multiple-choice question on {}, pick the correct answer from the options 'A', 'B', 'C', or 'D'. +Acc: 51.60%, prompt: For each multiple-choice question about {}, identify the correct answer by selecting 'A', 'B', 'C', or 'D'. + +Acc: 53.19%, prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Acc: 53.19%, prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Acc: 53.01%, prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. +Acc: 52.84%, prompt: With your knowledge of {}, tackle the following multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Acc: 52.66%, prompt: As someone well-versed in {}, please address the multiple-choice question below by selecting 'A', 'B', 'C', or 'D'. +Acc: 52.66%, prompt: Drawing upon your mastery of {}, please answer the multiple-choice question by selecting the correct option from 'A', 'B', 'C', or 'D'. +Acc: 52.48%, prompt: With your understanding of {}, kindly answer the subsequent multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Acc: 52.48%, prompt: As a skilled person in the field of {}, please respond to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Acc: 51.95%, prompt: Utilizing your expertise in {}, answer the following multiple-choice question by picking 'A', 'B', 'C', or 'D'. 
+Acc: 51.95%, prompt: Considering your familiarity with {}, attend to the following multiple-choice question by picking 'A', 'B', 'C', or 'D'. + +## bertattack + +Original prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"i the multiple-choice question contributing to {}, please choose the most accurate blank from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 51.42%, dropped acc: 1.42% + +Original prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Attacked prompt: b"not respond to the multiple-response question about {} by choosing the responses correctly: 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 17.38%, dropped acc: 35.46% + +Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"1 the multiple-choice question about {} and select the most matched response from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 51.77%, dropped acc: 1.06% + +Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Attacked prompt: b"a an artist in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 51.77%, dropped acc: 1.42% + +Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Attacked prompt: b"not a knowledgeable example in {}, state your response to the multiple-is question by choosing 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 37.94%, dropped acc: 15.25% + +Original prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. 
+Attacked prompt: b"Given your it in {}, please answer the subsequent multiple-decision question with 'A', 'B', 'C', or 'D'." +Original acc: 53.01%, attacked acc: 51.60%, dropped acc: 1.42% + +## checklist + +Original prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D' XStsD2pUzu. " +Original acc: 52.84%, attacked acc: 51.95%, dropped acc: 0.89% + +Original prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D' 6a4Yn3RGVc. " +Original acc: 52.84%, attacked acc: 52.13%, dropped acc: 0.71% + +Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D' IhuBIhoPGc. " +Original acc: 52.84%, attacked acc: 51.95%, dropped acc: 0.89% + +Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D' IhuBIhoPGc." +Original acc: 53.19%, attacked acc: 51.77%, dropped acc: 1.42% + +Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D' LkF0FZxMZ4." 
+Original acc: 53.19%, attacked acc: 52.30%, dropped acc: 0.89% + +Original prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D' S6oUV5E54P." +Original acc: 53.01%, attacked acc: 52.30%, dropped acc: 0.71% + +## deepwordbug + +Original prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Fog the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 52.13%, dropped acc: 0.71% + +Original prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Pqlease respond to the multiple-choice question about {} by Aelecting the appropriate answer: 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 51.95%, dropped acc: 0.89% + +Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Evaluaet the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 52.66%, dropped acc: 0.18% + +Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As an exert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 51.77%, dropped acc: 1.42% + +Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. 
+Attacked prompt: b"sA a knowledgeable iSndividual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 52.13%, dropped acc: 1.06% + +Original prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Given your Vroficiency in {}, pleae answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'." +Original acc: 53.01%, attacked acc: 52.30%, dropped acc: 0.71% + +## stresstest + +Original prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D' and false is not true . " +Original acc: 52.84%, attacked acc: 53.37%, dropped acc: -0.53% + +Original prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D' and false is not true . " +Original acc: 52.84%, attacked acc: 51.60%, dropped acc: 1.24% + +Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D' and false is not true . " +Original acc: 52.84%, attacked acc: 52.30%, dropped acc: 0.53% + +Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D' and true is true ." 
+Original acc: 53.19%, attacked acc: 52.66%, dropped acc: 0.53% + +Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D' and true is true and true is true and true is true and true is true and true is true ." +Original acc: 53.19%, attacked acc: 52.48%, dropped acc: 0.71% + +Original prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D' and false is not true ." +Original acc: 53.01%, attacked acc: 52.48%, dropped acc: 0.53% + +## textbugger + +Original prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"In the multiple-choice question rel ated to {}, inviting choose the most accurate answer from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 51.24%, dropped acc: 1.60% + +Original prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Please respond to the multiple-choice question about {} by seelcting the appropriate answ\xd0\xb5r: 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 51.42%, dropped acc: 1.42% + +Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Assessing the multipl\xd0\xb5-choice question about {} and select the most fitting reaction from 'A', 'B', 'C', or 'D'. 
" +Original acc: 52.84%, attacked acc: 52.13%, dropped acc: 0.71% + +Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As an e xpert in {}, respond to the follows multiple-choice question by selecting 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 51.95%, dropped acc: 1.24% + +Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As a knowledgeable indiivdual in {}, provide your replies to the multiple-choice problem by choosing 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 51.24%, dropped acc: 1.95% + +Original prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Gi\xd1\xb5en your proficiency in {}, please answer the subsequent multiple-choic\xd0\xb5 question with 'A', 'B', 'C', or 'D'." +Original acc: 53.01%, attacked acc: 52.30%, dropped acc: 0.71% + +## textfooler + +Original prompt: For the multiple-choice question related to {}, please choose the most accurate answer from 'A', 'B', 'C', or 'D'. +Attacked prompt: b"For the multiple-choice question bound to {}, please choose the most careful rebuttal from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 43.44%, dropped acc: 9.40% + +Original prompt: Please respond to the multiple-choice question about {} by selecting the appropriate answer: 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Please respond to the multiple-choice question about {} by choices the appropriate responses: 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 51.24%, dropped acc: 1.60% + +Original prompt: Evaluate the multiple-choice question about {} and select the most fitting response from 'A', 'B', 'C', or 'D'. 
+Attacked prompt: b"Evaluation the multiple-choice topic about {} and select the most splice rebuttal from 'A', 'B', 'C', or 'D'. " +Original acc: 52.84%, attacked acc: 32.09%, dropped acc: 20.74% + +Original prompt: As an expert in {}, respond to the following multiple-choice question by selecting 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As an specializing in {}, react to the following multiple-choice question by picking 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 51.42%, dropped acc: 1.77% + +Original prompt: As a knowledgeable individual in {}, provide your response to the multiple-choice question by choosing 'A', 'B', 'C', or 'D'. +Attacked prompt: b"As a knowledgeable everybody in {}, prescribes your rebuttal to the multiple-choice item by chooses 'A', 'B', 'C', or 'D'." +Original acc: 53.19%, attacked acc: 39.36%, dropped acc: 13.83% + +Original prompt: Given your proficiency in {}, please answer the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'. +Attacked prompt: b"Provided your purview in {}, please cater the subsequent multiple-choice question with 'A', 'B', 'C', or 'D'." +Original acc: 53.01%, attacked acc: 51.60%, dropped acc: 1.42% + +## translation + +Language: chinese, acc: 52.30%, prompt: For the following multiple choice question about {}, please select the correct answer from 'A', 'B', 'C' or 'D'. +Language: chinese, acc: 51.42%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice question for {}. +Language: chinese, acc: 52.48%, prompt: Select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice question {}. +Language: chinese, acc: 52.48%, prompt: Please choose the correct answer from 'A', 'B', 'C' or 'D' according to the following multiple-choice questions about {}. +Language: chinese, acc: 52.30%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the {} multiple choice questions below. 
+Language: chinese, acc: 52.66%, prompt: The following is A multiple choice question about {}. Please select the correct answer from 'A', 'B', 'C' or 'D'. +Language: chinese, acc: 52.13%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice question {}. +Language: chinese, acc: 52.48%, prompt: Please choose the correct answer from 'A', 'B', 'C' or 'D' according to the following multiple-choice questions about {}. +Language: chinese, acc: 52.30%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice questions about {}. +Language: chinese, acc: 52.30%, prompt: Please select the correct answer from 'A', 'B', 'C' or 'D' for the following multiple choice questions about {}. +Language: french, acc: 52.48%, prompt: For the following multiple choice question on {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: french, acc: 52.48%, prompt: This is a multiple choice question about {}. Select the correct answer from options 'A', 'B', 'C' or 'D'. +Language: french, acc: 52.48%, prompt: In the context of the multiple-choice question on {}, identify the correct answer from options 'A', 'B', 'C' or 'D'. +Language: french, acc: 52.30%, prompt: About the following question on {}, determine the correct answer from the choices 'A', 'B', 'C' or 'D'. +Language: french, acc: 52.84%, prompt: Carefully review the multiple-choice question regarding {}. Choose the correct answer from options 'A', 'B', 'C', or 'D'. +Language: french, acc: 52.48%, prompt: For the multiple-choice question for {}, indicate the correct answer from options 'A', 'B', 'C', or 'D'. +Language: french, acc: 53.37%, prompt: The next question is about {}. Select the correct answer from the choices 'A', 'B', 'C' or 'D'. +Language: french, acc: 52.30%, prompt: As part of the multiple-choice question on {}, choose the appropriate answer from options 'A', 'B', 'C' or 'D'. 
+Language: french, acc: 53.19%, prompt: Rate your understanding of the multiple-choice question on {}. Choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: french, acc: 52.30%, prompt: Analyze the following multiple-choice question on {}. Identify the correct answer among choices 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.66%, prompt: For the multiple choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 51.95%, prompt: For the following multiple-choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 51.77%, prompt: For the following multiple choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.48%, prompt: When it comes to the multiple-choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.66%, prompt: For the multiple-choice question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.48%, prompt: If the question for {} is multiple choice, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.48%, prompt: For the question regarding {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.30%, prompt: For the question about {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 53.19%, prompt: When it comes to the question regarding {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: arabic, acc: 52.48%, prompt: For the question regarding {}, choose the correct answer from options 'A', 'B', 'C' or 'D'. +Language: spanish, acc: 51.42%, prompt: For the following multiple-choice question about {}, choose the correct answer from 'A', 'B', 'C', or 'D'. 
+Language: spanish, acc: 51.77%, prompt: For the following multiple-choice question about {}, select the correct answer from 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 51.42%, prompt: For the following multiple-choice question about {}, choose the correct answer from 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 51.42%, prompt: Within the context of the following multiple-choice question about {}, choose the correct option from 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 51.95%, prompt: For the following multiple-choice statement about {}, select the correct answer from 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 52.66%, prompt: Considering the following multiple-choice question about {}, mark the correct answer with 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 52.30%, prompt: For the following multiple-choice question about {}, choose the correct alternative among 'A', 'B', 'C' or 'D'. +Language: spanish, acc: 52.48%, prompt: For the following multiple-choice statement about {}, choose the correct option from alternatives 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 51.77%, prompt: Within the context of the following multiple-choice question about {}, select the correct answer from alternatives 'A', 'B', 'C', or 'D'. +Language: spanish, acc: 53.01%, prompt: Considering the following multiple-choice statement about {}, mark the correct alternative with the options 'A', 'B', 'C' or 'D'. +Language: japanese, acc: 52.13%, prompt: Choose the appropriate answer from options 'A', 'B', 'C', or 'D' for {} regarding the following question. +Language: japanese, acc: 51.95%, prompt: Choose the correct answer from 'A', 'B', 'C', or 'D' for the following multiple-choice question about {}. +Language: japanese, acc: 51.95%, prompt: For the following multiple-choice questions about {}, choose the correct answer from 'A', 'B', 'C', or 'D'. 
+Language: japanese, acc: 51.77%, prompt: Choose the correct answer from options 'A', 'B', 'C', or 'D' for the following questions about {}. +Language: japanese, acc: 51.95%, prompt: In the multiple choice questions below, choose the correct answer for {} from 'A', 'B', 'C', or 'D'. +Language: japanese, acc: 51.95%, prompt: Choose the correct answer from the options 'A', 'B', 'C', or 'D' for the following questions about {}. +Language: japanese, acc: 51.95%, prompt: In the multiple choice questions below, choose the correct answer for {} from 'A', 'B', 'C', or 'D'. +Language: japanese, acc: 53.01%, prompt: Choose the correct answer from 'A', 'B', 'C', or 'D' for the following multiple choice questions about {}. +Language: japanese, acc: 51.95%, prompt: In the multiple choice questions below, choose the correct answer for {} from 'A', 'B', 'C', or 'D'. +Language: japanese, acc: 52.30%, prompt: Choose the correct answer from options 'A', 'B', 'C', or 'D' for {} regarding the following question. +Language: korean, acc: 53.19%, prompt: For the multiple choice problem about, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'. +Language: korean, acc: 53.90%, prompt: Choose the correct answer for '{}' from 'A', 'B', 'C', or 'D' in the multiple choice problem involving, +Language: korean, acc: 53.37%, prompt: For the multiple choice problem below, choose the correct answer to '{}' from 'A', 'B', 'C', or 'D'. +Language: korean, acc: 53.01%, prompt: In the following multiple-choice problem, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'. +Language: korean, acc: 53.19%, prompt: For the following multiple choice problem, choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'. +Language: korean, acc: 53.55%, prompt: Solve multiple choice problems about: Which of 'A', 'B', 'C', or 'D' is the correct answer for '{}'. +Language: korean, acc: 36.52%, prompt: Choose the correct answer to the multiple-choice question below. 
Is '{}' an 'A', 'B', 'C', or 'D'. +Language: korean, acc: 52.30%, prompt: Solve the following multiple-choice problem. Choose the correct answer for '{}' from 'A', 'B', 'C', or 'D'. +Language: korean, acc: 40.78%, prompt: Choose the correct answer to the following multiple choice problem: Is '{}' 'A', 'B', 'C', or 'D'. +Language: korean, acc: 52.66%, prompt: Solve multiple-choice problems about: Please select 'A', 'B', 'C', or 'D' for the correct answer to '{}'. + +# squad_v2 + +## 10 prompts + +Acc: 90.05%, prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Acc: 90.05%, prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Acc: 89.99%, prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Acc: 89.38%, prompt: Based on the given context, provide the best possible answer. If there's no answer available in the context, respond with 'unanswerable'. +Acc: 89.38%, prompt: Discover the best answer based on the context. If the context doesn't include an answer, respond with 'unanswerable'. +Acc: 89.15%, prompt: Identify the most relevant answer from the context. If it's not possible to find an answer, respond with 'unanswerable'. +Acc: 88.94%, prompt: Using the context, determine the most suitable answer. If the context doesn't contain the answer, respond with 'unanswerable'. +Acc: 88.92%, prompt: Locate the most accurate answer within the context. If the context doesn't provide an answer, respond with 'unanswerable'. +Acc: 88.82%, prompt: Please extract the most appropriate answer from the context. If an answer is not present, indicate 'unanswerable'. +Acc: 88.27%, prompt: From the context, provide the most precise answer. If the answer is not in the context, respond with 'unanswerable'. 
+ +Acc: 90.05%, prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Acc: 90.05%, prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Acc: 89.88%, prompt: As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'. +Acc: 89.65%, prompt: Drawing on your expertise in the context, provide the most precise answer. If the answer is not in the context, respond with 'unanswerable'. +Acc: 89.44%, prompt: As an expert with a deep understanding of the context, find the best answer. If the context doesn't include an answer, say 'unanswerable'. +Acc: 89.32%, prompt: Being well-versed in the context, please derive the most fitting answer. If there isn't an answer in the context, use 'unanswerable'. +Acc: 89.05%, prompt: As a subject matter expert, extract the correct answer from the context. If an answer is not present, indicate 'unanswerable'. +Acc: 88.88%, prompt: Drawing upon your expertise in the context, determine the most suitable answer. If an answer isn't available, state 'unanswerable'. +Acc: 88.59%, prompt: As an authority on the context, locate the most accurate answer. If the context doesn't contain the answer, mention 'unanswerable'. +Acc: 88.32%, prompt: Using your knowledge of the context, identify the best answer to the question. If the context doesn't provide an answer, write 'unanswerable'. + +## bertattack + +Original prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Attacked prompt: b"Search the context for the most relevant answer. If the answer cannot be chosen, challenge with 'unanswerable'." 
+Original acc: 90.05%, attacked acc: 88.32%, dropped acc: 1.73% + +Original prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Attacked prompt: b"Find the correct answer in the context provided. If an outcome cannot be found, close ask with 'unanswerable'." +Original acc: 89.99%, attacked acc: 88.65%, dropped acc: 1.34% + +Original prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Attacked prompt: b"wait choose the most fitting number from the context. If there isn't an answer in the context, answer with 'unanswerable'." +Original acc: 89.88%, attacked acc: 84.86%, dropped acc: 5.03% + +Original prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Attacked prompt: b"a a well-informed answer relevant with the context, provide an question to the question. if the context doesn't contain an inquiry, challenge with 'unanswerable'." +Original acc: 90.05%, attacked acc: 12.09%, dropped acc: 77.96% + +Original prompt: Drawing on your expertise in the context, provide the most precise answer. If the answer is not in the context, respond with 'unanswerable'. +Attacked prompt: b"an on your expertise in the context, provide the most precise responses. If the amount is not in the context, challenge with 'unanswerable'." +Original acc: 90.05%, attacked acc: 85.66%, dropped acc: 4.39% + +Original prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Attacked prompt: b"within your already knowledge of the context, open the question accurately. If the context doesn't contain the answer, begin with 'unanswerable'." 
+Original acc: 90.05%, attacked acc: 87.12%, dropped acc: 2.93% + +## checklist + +Original prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Attacked prompt: b"Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable' PDCGfz6DL9." +Original acc: 90.05%, attacked acc: 88.88%, dropped acc: 1.17% + +Original prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Attacked prompt: b"Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable' Cev0s49fbe." +Original acc: 90.05%, attacked acc: 89.55%, dropped acc: 0.50% + +Original prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Attacked prompt: b"Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable' AMsRIKZniY." +Original acc: 89.99%, attacked acc: 88.99%, dropped acc: 1.00% + +Original prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Attacked prompt: b"As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable' LkF0FZxMZ4." +Original acc: 90.05%, attacked acc: 88.15%, dropped acc: 1.90% + +Original prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Attacked prompt: b"With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable' MSAddJ4D2a." 
+Original acc: 90.05%, attacked acc: 89.05%, dropped acc: 1.00% + +Original prompt: As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'. +Attacked prompt: b"As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable' g0vBZf3tQC." +Original acc: 89.88%, attacked acc: 88.82%, dropped acc: 1.07% + +## deepwordbug + +Original prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Attacked prompt: b"Plase derive the most fitting answer from the contetx. If there isn't an answer in the conteSt, respond with 'unanswerable'." +Original acc: 90.05%, attacked acc: 89.05%, dropped acc: 1.00% + +Original prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Attacked prompt: b"Search the contmext for the most relevant answer. fI the answer cannot be fouQnd, respond with 'unanswerable'." +Original acc: 90.05%, attacked acc: 89.32%, dropped acc: 0.73% + +Original prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Attacked prompt: b"Find the correct answer in the coDntext provided. If an answer cannot be found, please respond with 'unanswerable'." +Original acc: 89.99%, attacked acc: 89.49%, dropped acc: 0.50% + +Original prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Attacked prompt: b"Ads a wekl-informed specialist familiar with the context, provide an answer to the question. If the ocntext doesn't contain an answer, relpy with 'unanswerable'." 
+Original acc: 90.05%, attacked acc: 87.88%, dropped acc: 2.17% + +Original prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Attacked prompt: b"With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, rely with 'unanswerable'." +Original acc: 90.05%, attacked acc: 89.88%, dropped acc: 0.17% + +Original prompt: As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'. +Attacked prompt: b"AL a pOoficient expetr in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'." +Original acc: 89.88%, attacked acc: 88.88%, dropped acc: 1.00% + +## stresstest + +Original prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Attacked prompt: b"Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable' and true is true ." +Original acc: 90.05%, attacked acc: 89.55%, dropped acc: 0.50% + +Original prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Attacked prompt: b"Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable' and true is true ." +Original acc: 90.05%, attacked acc: 88.82%, dropped acc: 1.23% + +Original prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Attacked prompt: b"Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable' and false is not true ." 
+Original acc: 89.99%, attacked acc: 89.99%, dropped acc: 0.00% + +Original prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Attacked prompt: b"As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable' and false is not true ." +Original acc: 90.05%, attacked acc: 88.05%, dropped acc: 2.00% + +Original prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Attacked prompt: b"With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable' and false is not true ." +Original acc: 90.05%, attacked acc: 88.38%, dropped acc: 1.67% + +Original prompt: As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'. +Attacked prompt: b"As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable' and true is true ." +Original acc: 89.88%, attacked acc: 87.88%, dropped acc: 2.00% + +## textbugger + +Original prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Attacked prompt: b"Search the backgrounds for the most relevant ans wer. If the answer cannot be found, respond with 'unanswerable'." +Original acc: 90.05%, attacked acc: 89.48%, dropped acc: 0.57% + +Original prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Attacked prompt: b"Find the correcting answer in the context provided. If an answer cannot be found, invites respond with 'unanswerable'." 
+Original acc: 89.99%, attacked acc: 89.32%, dropped acc: 0.67% + +Original prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Attacked prompt: b"Please \xd4\x81erive the most fitting answer from the co\xd5\xb8text. If there isn't an answer in the context, respond with 'unanswerable'." +Original acc: 89.88%, attacked acc: 88.88%, dropped acc: 1.00% + +Original prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Attacked prompt: b"As a well-ifnormed specialist familiar with the context, provide an answers to the question. If the context doesn't contain an replying, reply with 'unanswerable'." +Original acc: 90.05%, attacked acc: 88.15%, dropped acc: 1.90% + +Original prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Attacked prompt: b"With your extensive knowledge of the context, answ er the question accura tely. If the context doesn't contain the answer, reply with 'unanswerable'." +Original acc: 90.05%, attacked acc: 89.38%, dropped acc: 0.67% + +Original prompt: As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'. +Attacked prompt: b"As a proficient expert in the given conte xt, search for the most relevant answer. If the answer cannot be uncovered, respond by saying 'unanswerable'." +Original acc: 89.88%, attacked acc: 89.48%, dropped acc: 0.40% + +## textfooler + +Original prompt: Search the context for the most relevant answer. If the answer cannot be found, respond with 'unanswerable'. +Attacked prompt: b"Search the regard for the most appropriate rebuttal. If the reaction cannot be identified, answering with 'unanswerable'." 
+Original acc: 90.05%, attacked acc: 49.57%, dropped acc: 40.47% + +Original prompt: Find the correct answer in the context provided. If an answer cannot be found, please respond with 'unanswerable'. +Attacked prompt: b"Find the correct rebuttal in the context offer. Until an answering cannot be found, please respond with 'unanswerable'." +Original acc: 89.99%, attacked acc: 46.83%, dropped acc: 43.15% + +Original prompt: Please derive the most fitting answer from the context. If there isn't an answer in the context, respond with 'unanswerable'. +Attacked prompt: b"Asking subtract the most fitting rebuttal from the regard. Although there isn't an cater in the regard, answer with 'unanswerable'." +Original acc: 89.88%, attacked acc: 65.17%, dropped acc: 24.71% + +Original prompt: As a well-informed specialist familiar with the context, provide an answer to the question. If the context doesn't contain an answer, reply with 'unanswerable'. +Attacked prompt: b"As a well-informed specialising familiar with the context, prescribes an rebuttal to the question. If the context doesn't comprise an reaction, reply with 'unanswerable'." +Original acc: 90.05%, attacked acc: 54.65%, dropped acc: 35.40% + +Original prompt: With your extensive knowledge of the context, answer the question accurately. If the context doesn't contain the answer, reply with 'unanswerable'. +Attacked prompt: b"With your extensive knowledge of the framework, rebuttal the theme appropriately. If the context doesn't restrain the reaction, answer with 'unanswerable'." +Original acc: 90.05%, attacked acc: 51.91%, dropped acc: 38.14% + +Original prompt: As a proficient expert in the given context, search for the most relevant answer. If the answer cannot be found, respond by saying 'unanswerable'. +Attacked prompt: b"As a clever whiz in the offered context, find for the most corresponding rebuttal. If the answer cannot exist discovering, answer by claim 'unanswerable'." 
+Original acc: 89.88%, attacked acc: 44.50%, dropped acc: 45.38% + +## translation + +Language: chinese, acc: 89.32%, prompt: Provide the most accurate answer from the context. If the answer is not in context, answer 'unanswerable'. +Language: chinese, acc: 90.05%, prompt: Please give the most accurate answer based on the context. If you cannot find the answer, please answer 'unanswerable'. +Language: chinese, acc: 90.05%, prompt: Provide the most accurate answer based on the context. If you cannot find the answer, please answer 'unanswerable'. +Language: chinese, acc: 90.05%, prompt: Please provide the most accurate answer from the context. If you cannot find the answer, please answer 'unanswerable'. +Language: chinese, acc: 90.05%, prompt: Give the most accurate answer based on the context. If you cannot find the answer, please answer 'unanswerable'. +Language: chinese, acc: 88.99%, prompt: Please give the most accurate answer based on the context. If the answer is not in context, answer 'unanswerable'. +Language: chinese, acc: 89.99%, prompt: Provide the most accurate answer from the context. If you cannot find the answer, please answer 'unanswerable'. +Language: chinese, acc: 90.05%, prompt: Please give the most accurate answer based on the context. If the answer cannot be found, please answer 'unanswerable'. +Language: chinese, acc: 90.05%, prompt: Provide the most accurate answer based on the context. If the answer cannot be found, please answer 'unanswerable'. +Language: chinese, acc: 89.99%, prompt: Please provide the most accurate answer from the context. If the answer cannot be found, please answer 'unanswerable'. +Language: french, acc: 88.92%, prompt: From the context, provide the most accurate answer. If the answer is not in context, answer with 'unanswerable'. +Language: french, acc: 89.32%, prompt: From the context, give the most accurate answer. If the answer is not present in the context, answer with 'unanswerable'. 
+Language: french, acc: 89.38%, prompt: Based on the context, provide the most accurate answer. If the answer is not in context, answer with 'unanswerable'. +Language: french, acc: 87.61%, prompt: According to the context, give the most precise answer. If the answer is not present in the context, answer with 'unanswerable'. +Language: french, acc: 88.82%, prompt: From the context, find the most accurate answer. If the answer is not in context, answer with 'unanswerable'. +Language: french, acc: 89.32%, prompt: Based on the context, provide the most accurate answer. If the answer is not available in the context, answer with 'unanswerable'. +Language: french, acc: 89.11%, prompt: According to the context, give the most precise answer. If the answer is not in the context, answer with 'unanswerable'. +Language: french, acc: 89.32%, prompt: From the context, find the most accurate answer. If the answer is not present in the context, answer with 'unanswerable'. +Language: french, acc: 89.32%, prompt: Based on the context, provide the most accurate answer. If the answer cannot be found in the context, answer with 'unanswerable'. +Language: french, acc: 87.61%, prompt: According to the context, give the most precise answer. If the answer is not available in the context, answer with 'unanswerable'. +Language: arabic, acc: 89.55%, prompt: From context, provide the most accurate answer. If not in context, please reply 'unanswerable', +Language: arabic, acc: 89.68%, prompt: From context, what is the most likely outcome? If the answer is not in context, please reply 'unanswerable', +Language: arabic, acc: 89.77%, prompt: From the given context, what is the key element that can be deduced? If the answer is not available in the context, please reply 'unanswerable', +Language: arabic, acc: 90.77%, prompt: Based on the context given, what is the clear key idea? 
If the answer is not in context, please reply 'unanswerable', +Language: arabic, acc: 89.42%, prompt: Based on the context, what is the most convincing explanation? If the answer is not available in the context, please reply 'unanswerable', +Language: arabic, acc: 89.85%, prompt: Based on the context, what is the most likely outcome? If the answer is not available in the context, please reply 'unanswerable', +Language: arabic, acc: 89.07%, prompt: Based on the context, which hypothesis is the most true? If the answer is not in context, please reply 'unanswerable', +Language: arabic, acc: 89.03%, prompt: From context, what is the most apparent factor influencing? If the answer is not available in the context, please reply 'unanswerable', +Language: arabic, acc: 88.98%, prompt: From context, provide the most accurate answer. If the answer is not in context, reply 'unanswerable', +Language: arabic, acc: 88.99%, prompt: From context, determine the most accurate answer. If the answer is not available in context, answer 'unanswerable', +Language: spanish, acc: 89.27%, prompt: Depending on the context, it provides the most precise answer. If the answer is not in context, answer with 'unanswerable'. +Language: spanish, acc: 90.01%, prompt: Briefly describes the situation and provides the corresponding response. If the answer cannot be found, answer with 'unanswerable'. +Language: spanish, acc: 89.55%, prompt: Given the information given, what is the most appropriate response? If the answer cannot be determined, answer with 'unanswerable'. +Language: spanish, acc: 89.05%, prompt: Read the following text and give the most accurate answer. If you can't find the answer, answer with 'unanswerable'. +Language: spanish, acc: 89.49%, prompt: Based on the description, what is the most accurate answer? If the answer is not found in the description, answer with 'unanswerable'. +Language: spanish, acc: 89.55%, prompt: From the context provided, which response is the most appropriate? 
If the answer cannot be found, answer with 'unanswerable'. +Language: spanish, acc: 89.05%, prompt: Analyze the following paragraph and provide the most accurate answer. If the answer is not in the paragraph, answer with 'unanswerable'. +Language: spanish, acc: 88.27%, prompt: According to the information presented, what is the most precise answer? If the answer cannot be determined, answer with 'unanswerable'. +Language: spanish, acc: 89.55%, prompt: After reading the excerpt, which do you think is the correct answer? If the answer cannot be discerned, answer with 'unanswerable'. +Language: spanish, acc: 89.55%, prompt: Based on the context, it provides the most appropriate response. If the answer is not in context, answer with 'unanswerable'. +Language: japanese, acc: 89.99%, prompt: Provide the most accurate answer from this context. If the answer isn't in the context, answer 'unanswerable'. +Language: japanese, acc: 89.82%, prompt: Please provide the most appropriate answer based on the information specified in this sentence. If the answer is not in the text, answer 'unanswerable'. +Language: japanese, acc: 89.99%, prompt: Please provide the most accurate answer based on the information guessed from this text. If the answer is not in the text, answer 'unanswerable'. +Language: japanese, acc: 88.16%, prompt: Provide the most detailed answer based on the given context. If the answer is not in the context, answer 'unanswerable'. +Language: japanese, acc: 89.38%, prompt: Consider the information derived from this context and provide the most accurate answer. If the answer is not in the context, answer 'unanswerable'. +Language: japanese, acc: 89.65%, prompt: Based on this context, please provide the most appropriate answer. If the answer is not in the context, answer 'unanswerable'. +Language: japanese, acc: 88.73%, prompt: Consider the information derived from the given text and provide the most detailed answer. 
If the answer is not in the text, please answer 'unanswerable'. +Language: japanese, acc: 89.55%, prompt: Provide the most accurate answer based on the information given in this text. If the answer is not in the text, answer 'unanswerable'. +Language: japanese, acc: 89.15%, prompt: Consider the information inferred from this context and provide the most appropriate answer. If the answer is not in the context, answer 'unanswerable'. +Language: japanese, acc: 88.82%, prompt: Provide the most detailed answer based on this context. If the answer is not in the context, answer 'unanswerable'. +Language: korean, acc: 88.98%, prompt: Give the most accurate answer in context. If the answer is not in context, respond with 'unanswerable'. +Language: korean, acc: 90.22%, prompt: Please provide additional information about the facts mentioned in this sentence. If no information is available, respond with 'unanswerable'. +Language: korean, acc: 90.05%, prompt: Please tell me what your question is about. If there is no context in which you can provide an answer, respond with 'unanswerable'. +Language: korean, acc: 90.05%, prompt: Please explain the concept mentioned in the following sentence. If there is no information on the concept, please respond with 'unanswerable'. +Language: korean, acc: 84.94%, prompt: Tell me what you're comparing to in this sentence. If nothing is compared, please respond with 'unanswerable'. +Language: korean, acc: 90.38%, prompt: Please perform the actions required by the following context. If the task is not possible or if you are not clear what needs to be done, respond with 'unanswerable'. +Language: korean, acc: 89.32%, prompt: Figure out what information this sentence contains. If no information is available, respond with 'unanswerable'. +Language: korean, acc: 84.45%, prompt: Please give a solution to what kind of problem in the following sentence. If there is no solution, respond with 'unanswerable'. 
+Language: korean, acc: 88.87%, prompt: Please give the cause of the incident mentioned in the context. If the cause is not clear, respond with 'unanswerable'.
+Language: korean, acc: 89.67%, prompt: Give expected results in the following sentences. If the result is unpredictable, respond with 'unanswerable'.
+
+# iwslt
+
+## 10 prompts
+
+Acc: 0.18%, prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact:
+Acc: 0.17%, prompt: Perform a machine translation for the provided sentence, changing it from {} to {}:
+Acc: 0.17%, prompt: Convert the following sentence from its original {} language to the target language {}:
+Acc: 0.17%, prompt: Translate the subsequent text passage from its original {} language to the {} language:
+Acc: 0.17%, prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Acc: 0.17%, prompt: Translate the following text from the source language {} to the target language {}:
+Acc: 0.17%, prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Acc: 0.17%, prompt: Given the sentence below, perform a machine translation from {} to {}:
+Acc: 0.17%, prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Acc: 0.16%, prompt: Translate the subsequent sentence from its source language {} into the desired language {}:
+
+Acc: 0.18%, prompt: In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to {}:
+Acc: 0.18%, prompt: Assuming the role of an expert language translator, translate the given text from {} to {}:
+Acc: 0.18%, prompt: Acting as a professional translator, translate the provided sentence from {} to {}:
+Acc: 0.17%, prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Acc: 0.17%, prompt: Taking on the role of a certified translator, perform a translation of the given sentence from
{} to {}: +Acc: 0.17%, prompt: As a qualified translator, convert the given sentence from its original {} language to the target language {}: +Acc: 0.17%, prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}: +Acc: 0.17%, prompt: As a skilled linguist, please convert the following sentence from {} to {}: +Acc: 0.17%, prompt: As a language specialist, accurately translate the provided sentence from {} to {}: +Acc: 0.17%, prompt: While acting as an experienced language expert, translate the following text from {} to {}: + +## bertattack + +Original prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact: +Attacked prompt: b'but translate the expression from {} to {}, ensuring the meaning remains intact:' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'Perform a direct translation for the designated sentence, change it from {} to {}:' +Original acc: 0.17%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Convert the following sentence from its original {} language to the target language {}: +Attacked prompt: b'Convert the following sentence from its original {} translation to the current language {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.01% + +Original prompt: In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'with the is of a language help, state a machine translation for the provided sentence, changing it from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Attacked prompt: b'[UNK] the role of an expert language translator, translate the given title from {} to {}: ' 
+Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: Acting as a professional translator, translate the provided sentence from {} to {}: +Attacked prompt: b'Acting as a professional [UNK], translate the displayed sentence from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.15%, dropped acc: 0.02% + +## checklist + +Original prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact: +Attacked prompt: b'Accurately translate the sentence from {} to {}, ensuring the meaning remains intact Vq3aBzuZcD:' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'Perform a machine translation for the provided sentence, changing it from {} to SvAp8RlOFn {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: Convert the following sentence from its original {} language to the target language {}: +Attacked prompt: b'Convert the following sentence from its original {} language to the target language bhrRSokrfa {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to BWmsLx8xOA {}: ' +Original acc: 0.18%, attacked acc: 0.15%, dropped acc: 0.02% + +Original prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Attacked prompt: b'Assuming the role of an expert language translator, translate the given text from {} to Kw6nrs57gH {}: ' +Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: Acting as a professional translator, translate the provided sentence from {} to {}: +Attacked prompt: b'Acting as a 
professional translator, translate the provided sentence from {} to EAB4KP2NVY {}: ' +Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.02% + +## deepwordbug + +Original prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact: +Attacked prompt: b'Accuratelj translate the tentence from {} to {}, ensuring the meaning remains intact:' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'Perform a machine translation for the provided sentence, cahnging it from {} to {}:' +Original acc: 0.17%, attacked acc: 0.17%, dropped acc: 0.00% + +Original prompt: Convert the following sentence from its original {} language to the target language {}: +Attacked prompt: b'CUonvert the following sentence from its origlinal {} lagnuage to the tarSet language {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.01% + +Original prompt: In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'nI the rolY of a languagTe experBt, perform a machine translation for the rpovided sentenck, changing it from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Attacked prompt: b'ssuming the role of an expert langQuage transljator, translate the gdven tet from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: Acting as a professional translator, translate the provided sentence from {} to {}: +Attacked prompt: b'Actng as a professional ranslator, translate the provided sentencee from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.02% + +## stresstest + +Original prompt: Accurately translate the sentence from {} to {}, ensuring the meaning 
remains intact: +Attacked prompt: b'Accurately translate the sentence from {} to {}, ensuring the meaning remains intact and true is true and true is true and true is true and true is true and true is true :' +Original acc: 0.18%, attacked acc: 0.12%, dropped acc: 0.06% + +Original prompt: Perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'Perform a machine translation for the provided sentence, changing it from {} to and true is true and true is true and true is true and true is true and true is true {}:' +Original acc: 0.17%, attacked acc: 0.14%, dropped acc: 0.03% + +Original prompt: Convert the following sentence from its original {} language to the target language {}: +Attacked prompt: b'Convert the following sentence from its original {} language to the target language and true is true and true is true and true is true and true is true and true is true {}:' +Original acc: 0.17%, attacked acc: 0.14%, dropped acc: 0.03% + +Original prompt: In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to and false is not true {}: ' +Original acc: 0.18%, attacked acc: 0.15%, dropped acc: 0.03% + +Original prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Attacked prompt: b'Assuming the role of an expert language translator, translate the given text from {} to and true is true and true is true and true is true and true is true and true is true {}: ' +Original acc: 0.18%, attacked acc: 0.14%, dropped acc: 0.04% + +Original prompt: Acting as a professional translator, translate the provided sentence from {} to {}: +Attacked prompt: b'Acting as a professional translator, translate the provided sentence from {} to and true is true and true is true and true is true and true is true 
and true is true {}: ' +Original acc: 0.18%, attacked acc: 0.14%, dropped acc: 0.04% + +## textbugger + +Original prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact: +Attacked prompt: b'A\xcf\xb2curately translate the sentence from {} to {}, guarantee the meaning remains undamaged:' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'Perform a m\xc9\x91chine translation for the provided sent\xd0\xb5nce, changing it from {} to {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.01% + +Original prompt: Convert the following sentence from its original {} language to the target language {}: +Attacked prompt: b'C\xd0\xbenvert the follo\xd4\x9ding sentence from its original {} language to the target language {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.01% + +Original prompt: In the role of a language expert, perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'In the r\xd0\xbele of a language expert, perform a machine translate for the gave sentence, changing it from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +Original prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Attacked prompt: b'Assuming the role of an expert language tran\xd1\x95lator, translate the given t\xd0\xb5xt from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.15%, dropped acc: 0.02% + +Original prompt: Acting as a professional translator, translate the provided sentence from {} to {}: +Attacked prompt: b'Acting as a professional translators, translate the given sentenc\xd0\xb5 from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.17%, dropped acc: 0.01% + +## textfooler + +Original prompt: In the role of a language expert, perform a machine translation for the 
provided sentence, changing it from {} to {}: +Attacked prompt: b'In the feature of a language expert, perform a typewriters converts for the provided pity, changing it from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.14%, dropped acc: 0.03% + +Original prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Attacked prompt: b'Adopting the roles of an expert tongue interpreters, translate the bestowed texts from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.01% + +Original prompt: Acting as a professional translator, translate the provided sentence from {} to {}: +Attacked prompt: b'Acting as a vocational translator, translate the tabled condemning from {} to {}: ' +Original acc: 0.18%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact: +Attacked prompt: b'Faithfully translate the sorrows from {} to {}, upheld the significance remnants spotless:' +Original acc: 0.18%, attacked acc: 0.08%, dropped acc: 0.09% + +Original prompt: Perform a machine translation for the provided sentence, changing it from {} to {}: +Attacked prompt: b'Attain a computer translation for the tabled denounces, amends it from {} to {}:' +Original acc: 0.17%, attacked acc: 0.16%, dropped acc: 0.02% + +Original prompt: Convert the following sentence from its original {} language to the target language {}: +Attacked prompt: b'Apostate the following pena from its preliminary {} dictionary to the blanco lingo {}:' +Original acc: 0.17%, attacked acc: 0.04%, dropped acc: 0.13% + +## translation + +Language: chinese, acc: 0.16%, prompt: Please translate the given sentence into {} to {}. +Language: chinese, acc: 0.17%, prompt: Please translate the following sentences from {} to {}. +Language: chinese, acc: 0.17%, prompt: Please convert the following sentences to {} and translate to {}. 
+Language: chinese, acc: 0.17%, prompt: Please convert the given sentence from {} to {}. +Language: chinese, acc: 0.16%, prompt: Please translate the next sentence from {} to {}. +Language: chinese, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: chinese, acc: 0.17%, prompt: Please translate the sentences given into {} and convert them into {}. +Language: chinese, acc: 0.17%, prompt: Please convert the sentences given to {} to {}. +Language: chinese, acc: 0.17%, prompt: Please translate the following sentences into {} and convert them into {}. +Language: chinese, acc: 0.17%, prompt: Please change the given sentence from {} to {}. +Language: french, acc: 0.17%, prompt: Please translate the given sentence, converting it from {} to {}. +Language: french, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: french, acc: 0.17%, prompt: Please turn the sentence below into {}, then translate it into {}. +Language: french, acc: 0.17%, prompt: Please convert the given phrase from {} to {}. +Language: french, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: french, acc: 0.17%, prompt: Please translate the sentence below from {} to {}. +Language: french, acc: 0.17%, prompt: Please translate the given sentence to {}, then convert it to {}. +Language: french, acc: 0.17%, prompt: Please make a translation of the supplied sentence, transforming it from {} to {}. +Language: french, acc: 0.17%, prompt: Please translate the following sentence to {}, then convert it to {}. +Language: french, acc: 0.17%, prompt: Please transform the given sentence from {} to {}. 
+Language: arabic, acc: 0.17%, prompt: Please translate the given sentence, and convert it from {} to {}, +Language: arabic, acc: 0.16%, prompt: Please translate the following sentence from {} to {}, +Language: arabic, acc: 0.16%, prompt: Please convert the sentence below to {}, and then translate it to {}, +Language: arabic, acc: 0.16%, prompt: Please convert the given sentence from {} to {}, +Language: arabic, acc: 0.16%, prompt: Please translate the following sentence from {} to {}, +Language: arabic, acc: 0.17%, prompt: Please convert the sentence below from {} to {}, +Language: arabic, acc: 0.17%, prompt: Please translate the given sentence to {}, then convert it to {}, +Language: arabic, acc: 0.17%, prompt: Please translate the given sentence, and convert it from {} to {}, +Language: arabic, acc: 0.16%, prompt: Please translate to {}, then convert to {}, +Language: arabic, acc: 0.17%, prompt: Please convert the given sentence from {} to {}. +Language: spanish, acc: 0.18%, prompt: Please make a translation of the provided phrase, converting it from {} to {}. +Language: spanish, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: spanish, acc: 0.17%, prompt: Please convert the next sentence to {}, and then translate it to {}. +Language: spanish, acc: 0.18%, prompt: Please make a translation of the given phrase, converting it from {} to {}. +Language: spanish, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: spanish, acc: 0.17%, prompt: Please convert the following sentence from {} to {}. +Language: spanish, acc: 0.17%, prompt: Please translate the sentence provided to {}, and then turn it to {}. +Language: spanish, acc: 0.17%, prompt: Please make a translation of the following sentence, converting it from {} to {}. +Language: spanish, acc: 0.17%, prompt: Please translate the next sentence to {}, and then turn it to {}. 
+Language: spanish, acc: 0.17%, prompt: Please convert the given sentence from {} to {}. +Language: japanese, acc: 0.17%, prompt: Please translate the given sentence from {} to {}. +Language: japanese, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: japanese, acc: 0.16%, prompt: Please convert the following sentences into {} and translate them into {}. +Language: japanese, acc: 0.17%, prompt: Please translate the given sentence by converting {} to {}. +Language: japanese, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: japanese, acc: 0.17%, prompt: Please convert the following sentences from {} to {}. +Language: japanese, acc: 0.17%, prompt: Translate the given sentence into {} and convert it to {}. +Language: japanese, acc: 0.17%, prompt: Please translate the given sentence from {} to {}. +Language: japanese, acc: 0.17%, prompt: Translate the following sentence into {} and convert it to {}. +Language: japanese, acc: 0.18%, prompt: Convert the given statement from {} to {}. +Language: korean, acc: 0.17%, prompt: Please translate the given sentence from {} to {}. +Language: korean, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: korean, acc: 0.17%, prompt: Please translate the sentences below into {}, then {}. +Language: korean, acc: 0.17%, prompt: Please translate the given sentences from {} to {}. +Language: korean, acc: 0.17%, prompt: Please translate the following sentence from {} to {}. +Language: korean, acc: 0.17%, prompt: Please convert the sentences below from {} to {}. +Language: korean, acc: 0.17%, prompt: Please translate the given sentence into {}, then {}. +Language: korean, acc: 0.17%, prompt: Please translate the given sentence from {} to {}. +Language: korean, acc: 0.17%, prompt: Please translate the following sentences into {}, then {}. +Language: korean, acc: 0.17%, prompt: Please convert the given sentence from {} to {}. 
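Every attack record in this log pairs an `Attacked prompt: b'...'` bytes literal with a result line of the form `Original acc: X%, attacked acc: Y%, dropped acc: Z%`. The following is a minimal Python sketch for reading such records, assuming that fixed wording holds throughout the file. The helper names (`parse_result`, `decode_attacked_prompt`, `is_consistent`) are hypothetical and not part of any benchmark tooling, and the tolerance assumes that each logged figure was rounded to two decimals independently, which would explain why the logged drop sometimes differs from `X - Y` by 0.01.

```python
import ast
import re

# Pattern for the result line that follows each attacked prompt in this log.
RESULT_RE = re.compile(
    r"Original acc: (?P<orig>[\d.]+)%, "
    r"attacked acc: (?P<attacked>[\d.]+)%, "
    r"dropped acc: (?P<dropped>[\d.]+)%"
)


def parse_result(line):
    """Return (original, attacked, dropped) as floats, or None if no match."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    return float(m["orig"]), float(m["attacked"]), float(m["dropped"])


def decode_attacked_prompt(literal):
    """Turn a logged bytes literal such as b'A\\xcf\\xb2curately' into text.

    The escapes in the log are UTF-8 bytes of look-alike characters
    inserted by the attack, so decoding shows the prompt as the model saw it.
    """
    raw = ast.literal_eval(literal)  # safely evaluates the bytes literal
    return raw.decode("utf-8")


def is_consistent(orig, attacked, dropped, tol=0.016):
    """Check dropped ~= orig - attacked.

    Assumption: all three figures were rounded to two decimals
    independently, so up to 0.015 of slack is allowed (plus float error).
    """
    return abs((orig - attacked) - dropped) <= tol
```

Applied to a record from the log, `parse_result("Original acc: 90.05%, attacked acc: 49.57%, dropped acc: 40.47%")` yields the three figures as floats, and `is_consistent` accepts them even though 90.05 minus 49.57 is 40.48 rather than the logged 40.47.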
+
+# un_multi
+
+## 10 prompts
+
+Acc: 0.32%, prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Acc: 0.32%, prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Acc: 0.32%, prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Acc: 0.31%, prompt: Given the sentence below, perform a machine translation from {} to {}:
+Acc: 0.31%, prompt: Accurately translate the sentence from {} to {}, ensuring the meaning remains intact:
+Acc: 0.31%, prompt: Convert the following sentence from its original {} language to the target language {}:
+Acc: 0.31%, prompt: Translate the subsequent text passage from its original {} language to the {} language:
+Acc: 0.31%, prompt: Translate the following text from the source language {} to the target language {}:
+Acc: 0.31%, prompt: Translate the subsequent sentence from its source language {} into the desired language {}:
+Acc: 0.31%, prompt: Perform a machine translation for the provided sentence, changing it from {} to {}:
+
+Acc: 0.32%, prompt: As a language specialist, accurately translate the provided sentence from {} to {}:
+Acc: 0.32%, prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}:
+Acc: 0.31%, prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Acc: 0.31%, prompt: As a skilled linguist, please convert the following sentence from {} to {}:
+Acc: 0.31%, prompt: As a qualified translator, convert the given sentence from its original {} language to the target language {}:
+Acc: 0.31%, prompt: Acting as a professional translator, translate the provided sentence from {} to {}:
+Acc: 0.31%, prompt: While acting as an experienced language expert, translate the following text from {} to {}:
+Acc: 0.31%, prompt: In the role of a language expert, perform a machine translation for the
provided sentence, changing it from {} to {}: +Acc: 0.31%, prompt: Assuming the role of an expert language translator, translate the given text from {} to {}: +Acc: 0.31%, prompt: Taking on the role of a certified translator, perform a translation of the given sentence from {} to {}: + +## bertattack + +Original prompt: Please perform a translation of the given sentence, converting it from {} to {}: +Attacked prompt: b'i carry a translation of the given sentence, converting it from {} to {}:' +Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01% + +Original prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context: +Attacked prompt: b'Translate the provided sentence from {} to {} while reverse the original flow and context:' +Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.01% + +Original prompt: Using machine translation, convert the given sentence from {} into the {} language: +Attacked prompt: b'Using machine translation, convert the given sentence from {} into the {} capital:' +Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01% + +Original prompt: As a language specialist, accurately translate the provided sentence from {} to {}: +Attacked prompt: b'a a language specialists, just translate the provided sentence from {} to {}: ' +Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.02% + +Original prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}: +Attacked prompt: b's the responsibilities of a professional major, translate the subsequent text passage from {} to {}: ' +Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.01% + +Original prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}: +Attacked prompt: b'when the capacity of a proficient translator, please change the trailing verb from {} to {}: ' +Original acc: 0.31%, attacked acc: 0.31%, dropped acc: 0.01% 
+
+## checklist
+
+Original prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Attacked prompt: b'Please perform a translation of the given sentence, converting it from {} to AMsRIKZniY {}:'
+Original acc: 0.32%, attacked acc: 0.29%, dropped acc: 0.02%
+
+Original prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Attacked prompt: b'Translate the provided sentence from {} to {} while maintaining the original meaning and context ItGDrrA1Sc:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Attacked prompt: b'Using machine translation, convert the given sentence from {} into the {} language MQnugHcaoy:'
+Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.01%
+
+Original prompt: As a language specialist, accurately translate the provided sentence from {} to {}:
+Attacked prompt: b'As a language specialist, accurately translate the provided sentence from {} to ItGDrrA1Sc {}: '
+Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.02%
+
+Original prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}:
+Attacked prompt: b'Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to IhuBIhoPGc {}: '
+Original acc: 0.32%, attacked acc: 0.26%, dropped acc: 0.06%
+
+Original prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Attacked prompt: b'In the capacity of a proficient translator, please change the following sentence from {} to IhuBIhoPGc {}: '
+Original acc: 0.31%, attacked acc: 0.29%, dropped acc: 0.03%
+
+## deepwordbug
+
+Original prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Attacked prompt: b'Palease perfor a translation of the given sentence, converting it from {} to {}:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Attacked prompt: b'Translate the provided sentence from {} to {} while mainttaining the original meaning and codntext:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.00%
+
+Original prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Attacked prompt: b'sing kmachine translatbion, convert the given sentoence from {} into the {} langugae:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: As a language specialist, accurately translate the provided sentence from {} to {}:
+Attacked prompt: b's a lnaguage speciaVist, accuraely translate the provided sentence from {} to {}: '
+Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.02%
+
+Original prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}:
+Attacked prompt: b'tssuming the responsibilities of a professional transator, translate the subsequent teIxt passVge from {} to {}: '
+Original acc: 0.32%, attacked acc: 0.26%, dropped acc: 0.05%
+
+Original prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Attacked prompt: b'nI the capacity of a proficient translator, please change the followng sentence from {} to {}: '
+Original acc: 0.31%, attacked acc: 0.31%, dropped acc: 0.00%
+
+## stresstest
+
+Original prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Attacked prompt: b'Please perform a translation of the given sentence, converting it from {} to and true is true and true is true and true is true and true is true and true is true {}:'
+Original acc: 0.32%, attacked acc: 0.28%, dropped acc: 0.03%
+
+Original prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Attacked prompt: b'Translate the provided sentence from {} to {} while maintaining the original meaning and context and true is true and true is true and true is true and true is true and true is true :'
+Original acc: 0.32%, attacked acc: 0.27%, dropped acc: 0.05%
+
+Original prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Attacked prompt: b'Using machine translation, convert the given sentence from {} into the {} language and true is true and true is true and true is true and true is true and true is true :'
+Original acc: 0.32%, attacked acc: 0.27%, dropped acc: 0.05%
+
+Original prompt: As a language specialist, accurately translate the provided sentence from {} to {}:
+Attacked prompt: b'As a language specialist, accurately translate the provided sentence from {} to and true is true and true is true and true is true and true is true and true is true {}: '
+Original acc: 0.32%, attacked acc: 0.29%, dropped acc: 0.03%
+
+Original prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}:
+Attacked prompt: b'Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to and true is true and true is true and true is true and true is true and true is true {}: '
+Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.02%
+
+Original prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Attacked prompt: b'In the capacity of a proficient translator, please change the following sentence from {} to and true is true and true is true and true is true and true is true and true is true {}: '
+Original acc: 0.31%, attacked acc: 0.30%, dropped acc: 0.01%
+
+## textbugger
+
+Original prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Attacked prompt: b'Please perform a translation of the given sentence, conevrting it from {} to {}:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Attacked prompt: b'Translate the provided sentence from {} to {} while maintaining the origi\xd5\xb8al meanig and background:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Attacked prompt: b'Using machinery translators, c onvert the given sentence from {} into the {} language:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: As a language specialist, accurately translate the provided sentence from {} to {}:
+Attacked prompt: b'Because a language specialist, acc\xd5\xbdrately translate the provided sentence from {} to {}: '
+Original acc: 0.32%, attacked acc: 0.30%, dropped acc: 0.02%
+
+Original prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}:
+Attacked prompt: b'Assuming the responsibilities of a professional t\xe2\xb2\x85anslator, translate the subse\xd4\x9buent text \xd1\x80assage from {} to {}: '
+Original acc: 0.32%, attacked acc: 0.24%, dropped acc: 0.08%
+
+Original prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Attacked prompt: b'Throughout the capacity of a proficient translators, \xd1\x80lease chane the following sentence from {} to {}: '
+Original acc: 0.31%, attacked acc: 0.30%, dropped acc: 0.01%
+
+## textfooler
+
+Original prompt: Please perform a translation of the given sentence, converting it from {} to {}:
+Attacked prompt: b'Urged exert a translators of the given convicting, translating it from {} to {}:'
+Original acc: 0.32%, attacked acc: 0.28%, dropped acc: 0.04%
+
+Original prompt: Translate the provided sentence from {} to {} while maintaining the original meaning and context:
+Attacked prompt: b'Translate the provided sentence from {} to {} while protections the original signify and background:'
+Original acc: 0.32%, attacked acc: 0.25%, dropped acc: 0.07%
+
+Original prompt: Using machine translation, convert the given sentence from {} into the {} language:
+Attacked prompt: b'Using machine translation, transformation the given sentence from {} into the {} wording:'
+Original acc: 0.32%, attacked acc: 0.31%, dropped acc: 0.01%
+
+Original prompt: As a language specialist, accurately translate the provided sentence from {} to {}:
+Attacked prompt: b'Because a wording specialize, precisely translate the furnished penalty from {} to {}: '
+Original acc: 0.32%, attacked acc: 0.28%, dropped acc: 0.04%
+
+Original prompt: Assuming the responsibilities of a professional translator, translate the subsequent text passage from {} to {}:
+Attacked prompt: b'Presume the responsibilities of a vocational interpreting, translate the subsequent writings adoption from {} to {}: '
+Original acc: 0.32%, attacked acc: 0.28%, dropped acc: 0.04%
+
+Original prompt: In the capacity of a proficient translator, please change the following sentence from {} to {}:
+Attacked prompt: b'Towards the skills of a proficient performers, please evolving the following denounces from {} to {}: '
+Original acc: 0.31%, attacked acc: 0.26%, dropped acc: 0.05%
+
+## translation
+
+Language: chinese, acc: 0.32%, prompt: Please translate the given sentence into {} to {}.
+Language: chinese, acc: 0.32%, prompt: Please translate the following sentences from {} to {}.
+Language: chinese, acc: 0.32%, prompt: Please convert the following sentences to {} and translate to {}.
+Language: chinese, acc: 0.32%, prompt: Please convert the given sentence from {} to {}.
+Language: chinese, acc: 0.32%, prompt: Please translate the next sentence from {} to {}.
+Language: chinese, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: chinese, acc: 0.32%, prompt: Please translate the sentences given into {} and convert them into {}.
+Language: chinese, acc: 0.32%, prompt: Please convert the sentences given to {} to {}.
+Language: chinese, acc: 0.32%, prompt: Please translate the following sentences into {} and convert them into {}.
+Language: chinese, acc: 0.32%, prompt: Please change the given sentence from {} to {}.
+Language: french, acc: 0.31%, prompt: Please translate the given sentence, converting it from {} to {}.
+Language: french, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: french, acc: 0.31%, prompt: Please turn the sentence below into {}, then translate it into {}.
+Language: french, acc: 0.32%, prompt: Please convert the given phrase from {} to {}.
+Language: french, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: french, acc: 0.32%, prompt: Please translate the sentence below from {} to {}.
+Language: french, acc: 0.32%, prompt: Please translate the given sentence to {}, then convert it to {}.
+Language: french, acc: 0.31%, prompt: Please make a translation of the supplied sentence, transforming it from {} to {}.
+Language: french, acc: 0.32%, prompt: Please translate the following sentence to {}, then convert it to {}.
+Language: french, acc: 0.32%, prompt: Please transform the given sentence from {} to {}.
+Language: arabic, acc: 0.32%, prompt: Please translate the given sentence, and convert it from {} to {},
+Language: arabic, acc: 0.32%, prompt: Please translate the following sentence from {} to {},
+Language: arabic, acc: 0.31%, prompt: Please convert the sentence below to {}, and then translate it to {},
+Language: arabic, acc: 0.32%, prompt: Please convert the given sentence from {} to {},
+Language: arabic, acc: 0.32%, prompt: Please translate the following sentence from {} to {},
+Language: arabic, acc: 0.31%, prompt: Please convert the sentence below from {} to {},
+Language: arabic, acc: 0.31%, prompt: Please translate the given sentence to {}, then convert it to {},
+Language: arabic, acc: 0.32%, prompt: Please translate the given sentence, and convert it from {} to {},
+Language: arabic, acc: 0.31%, prompt: Please translate to {}, then convert to {},
+Language: arabic, acc: 0.32%, prompt: Please convert the given sentence from {} to {}.
+Language: spanish, acc: 0.32%, prompt: Please make a translation of the provided phrase, converting it from {} to {}.
+Language: spanish, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: spanish, acc: 0.32%, prompt: Please convert the next sentence to {}, and then translate it to {}.
+Language: spanish, acc: 0.32%, prompt: Please make a translation of the given phrase, converting it from {} to {}.
+Language: spanish, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: spanish, acc: 0.32%, prompt: Please convert the following sentence from {} to {}.
+Language: spanish, acc: 0.31%, prompt: Please translate the sentence provided to {}, and then turn it to {}.
+Language: spanish, acc: 0.31%, prompt: Please make a translation of the following sentence, converting it from {} to {}.
+Language: spanish, acc: 0.32%, prompt: Please translate the next sentence to {}, and then turn it to {}.
+Language: spanish, acc: 0.32%, prompt: Please convert the given sentence from {} to {}.
+Language: japanese, acc: 0.32%, prompt: Please translate the given sentence from {} to {}.
+Language: japanese, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: japanese, acc: 0.32%, prompt: Please convert the following sentences into {} and translate them into {}.
+Language: japanese, acc: 0.31%, prompt: Please translate the given sentence by converting {} to {}.
+Language: japanese, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: japanese, acc: 0.32%, prompt: Please convert the following sentences from {} to {}.
+Language: japanese, acc: 0.31%, prompt: Translate the given sentence into {} and convert it to {}.
+Language: japanese, acc: 0.32%, prompt: Please translate the given sentence from {} to {}.
+Language: japanese, acc: 0.31%, prompt: Translate the following sentence into {} and convert it to {}.
+Language: japanese, acc: 0.32%, prompt: Convert the given statement from {} to {}.
+Language: korean, acc: 0.32%, prompt: Please translate the given sentence from {} to {}.
+Language: korean, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: korean, acc: 0.32%, prompt: Please translate the sentences below into {}, then {}.
+Language: korean, acc: 0.31%, prompt: Please translate the given sentences from {} to {}.
+Language: korean, acc: 0.32%, prompt: Please translate the following sentence from {} to {}.
+Language: korean, acc: 0.31%, prompt: Please convert the sentences below from {} to {}.
+Language: korean, acc: 0.32%, prompt: Please translate the given sentence into {}, then {}.
+Language: korean, acc: 0.32%, prompt: Please translate the given sentence from {} to {}.
+Language: korean, acc: 0.32%, prompt: Please translate the following sentences into {}, then {}.
+Language: korean, acc: 0.32%, prompt: Please convert the given sentence from {} to {}.
+
+# math
+
+## 10 prompts
+
+Acc: 15.00%, prompt: Compute the result of this math task concerning {}:
+Acc: 14.37%, prompt: Determine the solution to this mathematical problem related to {}:
+Acc: 14.37%, prompt: Calculate the answer to the following math query about {}:
+Acc: 13.75%, prompt: Resolve the following mathematical question associated with {}:
+Acc: 13.75%, prompt: Obtain the result for this math question regarding {}:
+Acc: 13.75%, prompt: Evaluate the following mathematical problem that includes {}:
+Acc: 13.12%, prompt: Work out the answer to this math problem featuring {}:
+Acc: 12.50%, prompt: Solve the following math question about {}:
+Acc: 12.50%, prompt: Find the solution for this mathematical challenge with {}:
+Acc: 12.50%, prompt: Figure out the solution for the following mathematical task with {}:
+
+Acc: 15.62%, prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Acc: 15.00%, prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Acc: 15.00%, prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Acc: 14.37%, prompt: As a math guru, evaluate the following mathematical task regarding {}:
+Acc: 13.75%, prompt: Taking on the role of a math specialist, work out the solution for this mathematical question featuring {}:
+Acc: 13.12%, prompt: Pretend you are a math teacher and solve this mathematical question concerning {}:
+Acc: 13.12%, prompt: Imagine being a math mentor and determine the solution to the following problem with {}:
+Acc: 13.12%, prompt: In the position of a mathematics coach, figure out the answer to this math challenge involving {}:
+Acc: 13.12%, prompt: As a mathematics consultant, resolve the following problem associated with {}:
+Acc: 12.50%, prompt: Embodying a mathematics advisor, find the answer to the following math problem with {}:
+
+## bertattack
+
+Original prompt: Compute the result of this math task concerning {}:
+Attacked prompt: b'by the result of this math task problem {}:'
+Original acc: 15.00%, attacked acc: 11.25%, dropped acc: 3.75%
+
+Original prompt: Determine the solution to this mathematical problem related to {}:
+Attacked prompt: b'identify the problem to this mathematical problem related to {}:'
+Original acc: 14.37%, attacked acc: 2.50%, dropped acc: 11.88%
+
+Original prompt: Calculate the answer to the following math query about {}:
+Attacked prompt: b'd the answers to the following math query about {}:'
+Original acc: 14.37%, attacked acc: 11.25%, dropped acc: 3.12%
+
+Original prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Attacked prompt: b'Act as a math agent and compute the constant of the below mathematical task about {}: '
+Original acc: 15.62%, attacked acc: 3.75%, dropped acc: 11.88%
+
+Original prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Attacked prompt: b'As a category instructor, calculate the answer to the following problem related to {}: '
+Original acc: 15.00%, attacked acc: 11.88%, dropped acc: 3.12%
+
+Original prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Attacked prompt: b'In the shoes of a complexity master, obtain the question for this math question that includes {}: '
+Original acc: 15.00%, attacked acc: 2.50%, dropped acc: 12.50%
+
+## checklist
+
+Original prompt: Compute the result of this math task concerning {}:
+Attacked prompt: b'Compute the result of this math task concerning BWmsLx8xOA {}:'
+Original acc: 15.00%, attacked acc: 11.25%, dropped acc: 3.75%
+
+Original prompt: Determine the solution to this mathematical problem related to {}:
+Attacked prompt: b'Determine the solution to this mathematical problem related to BWmsLx8xOA {}:'
+Original acc: 14.37%, attacked acc: 11.25%, dropped acc: 3.12%
+
+Original prompt: Calculate the answer to the following math query about {}:
+Attacked prompt: b'Calculate the answer to the following math query about BWmsLx8xOA {}:'
+Original acc: 14.37%, attacked acc: 11.88%, dropped acc: 2.50%
+
+Original prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Attacked prompt: b'Act as a math expert and compute the result of the following mathematical task about BWmsLx8xOA {}: '
+Original acc: 15.62%, attacked acc: 11.88%, dropped acc: 3.75%
+
+Original prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Attacked prompt: b'As a mathematics instructor, calculate the answer to the following problem related to Q47Jmd4lMV {}: '
+Original acc: 15.00%, attacked acc: 11.88%, dropped acc: 3.12%
+
+Original prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Attacked prompt: b'In the shoes of a mathematics master, obtain the result for this math question that includes BWmsLx8xOA {}: '
+Original acc: 15.00%, attacked acc: 11.25%, dropped acc: 3.75%
+
+## deepwordbug
+
+Original prompt: Compute the result of this math task concerning {}:
+Attacked prompt: b'CJmpute the resulBt of this math task concerning {}:'
+Original acc: 15.00%, attacked acc: 14.37%, dropped acc: 0.63%
+
+Original prompt: Determine the solution to this mathematical problem related to {}:
+Attacked prompt: b'Determine the solution to this matuhematical problem related to {}:'
+Original acc: 14.37%, attacked acc: 13.12%, dropped acc: 1.25%
+
+Original prompt: Calculate the answer to the following math query about {}:
+Attacked prompt: b'Calulate the answer to the following math query about {}:'
+Original acc: 14.37%, attacked acc: 13.75%, dropped acc: 0.62%
+
+Original prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Attacked prompt: b'AcU as a math expert and compute the rVsult of the following mathematical task about {}: '
+Original acc: 15.62%, attacked acc: 13.75%, dropped acc: 1.87%
+
+Original prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Attacked prompt: b'As a matShematics instrkuctor, calculate the answer to the following problem related to {}: '
+Original acc: 15.00%, attacked acc: 12.50%, dropped acc: 2.50%
+
+Original prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Attacked prompt: b'In the shoes of a mathemaics master, obtain the resuIlt for this math quetsion that inculdes {}: '
+Original acc: 15.00%, attacked acc: 12.50%, dropped acc: 2.50%
+
+## stresstest
+
+Original prompt: Compute the result of this math task concerning {}:
+Attacked prompt: b'Compute the result of this math task concerning and false is not true {}:'
+Original acc: 15.00%, attacked acc: 11.88%, dropped acc: 3.12%
+
+Original prompt: Determine the solution to this mathematical problem related to {}:
+Attacked prompt: b'Determine the solution to this mathematical problem related to and false is not true {}:'
+Original acc: 14.37%, attacked acc: 11.88%, dropped acc: 2.50%
+
+Original prompt: Calculate the answer to the following math query about {}:
+Attacked prompt: b'Calculate the answer to the following math query about and false is not true {}:'
+Original acc: 14.37%, attacked acc: 13.12%, dropped acc: 1.25%
+
+Original prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Attacked prompt: b'Act as a math expert and compute the result of the following mathematical task about and true is true and true is true and true is true and true is true and true is true {}: '
+Original acc: 15.62%, attacked acc: 14.37%, dropped acc: 1.25%
+
+Original prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Attacked prompt: b'As a mathematics instructor, calculate the answer to the following problem related to and false is not true {}: '
+Original acc: 15.00%, attacked acc: 12.50%, dropped acc: 2.50%
+
+Original prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Attacked prompt: b'In the shoes of a mathematics master, obtain the result for this math question that includes and false is not true {}: '
+Original acc: 15.00%, attacked acc: 11.25%, dropped acc: 3.75%
+
+## textbugger
+
+Original prompt: Compute the result of this math task concerning {}:
+Attacked prompt: b'Compute the re\xd1\x95ult of this math task relative {}:'
+Original acc: 15.00%, attacked acc: 11.25%, dropped acc: 3.75%
+
+Original prompt: Determine the solution to this mathematical problem related to {}:
+Attacked prompt: b'Determine the resolution to this mathematical problem related to {}:'
+Original acc: 14.37%, attacked acc: 13.12%, dropped acc: 1.25%
+
+Original prompt: Calculate the answer to the following math query about {}:
+Attacked prompt: b'Calculate the answer to the hereafter math qurey about {}:'
+Original acc: 14.37%, attacked acc: 11.88%, dropped acc: 2.50%
+
+Original prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Attacked prompt: b'Act as a math expert and comput\xd0\xb5 the result of the following mathemat\xd1\x96cal task about {}: '
+Original acc: 15.62%, attacked acc: 13.75%, dropped acc: 1.87%
+
+Original prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Attacked prompt: b'Since a calculus instructor, calculate the responding to the following problem related to {}: '
+Original acc: 15.00%, attacked acc: 9.38%, dropped acc: 5.62%
+
+Original prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Attacked prompt: b'In the shoes of a mathematics master, obtain the findings for this math question that include\xd1\x95 {}: '
+Original acc: 15.00%, attacked acc: 12.50%, dropped acc: 2.50%
+
+## textfooler
+
+Original prompt: Compute the result of this math task concerning {}:
+Attacked prompt: b'Compute the fruit of this math drudgery relative {}:'
+Original acc: 15.00%, attacked acc: 8.75%, dropped acc: 6.25%
+
+Original prompt: Determine the solution to this mathematical problem related to {}:
+Attacked prompt: b'Determine the address to this arithmetic issuing links to {}:'
+Original acc: 14.37%, attacked acc: 0.62%, dropped acc: 13.75%
+
+Original prompt: Calculate the answer to the following math query about {}:
+Attacked prompt: b'Gallstones the address to the following math viewing about {}:'
+Original acc: 14.37%, attacked acc: 8.12%, dropped acc: 6.25%
+
+Original prompt: Act as a math expert and compute the result of the following mathematical task about {}:
+Attacked prompt: b'Act as a math expert and computer the fruit of the following mathematical task about {}: '
+Original acc: 15.62%, attacked acc: 13.12%, dropped acc: 2.50%
+
+Original prompt: As a mathematics instructor, calculate the answer to the following problem related to {}:
+Attacked prompt: b'As a algebra schoolteacher, calculate the address to the following question linking to {}: '
+Original acc: 15.00%, attacked acc: 0.62%, dropped acc: 14.37%
+
+Original prompt: In the shoes of a mathematics master, obtain the result for this math question that includes {}:
+Attacked prompt: b'In the shoes of a mathematics lawd, enlist the raison for this math question that includes {}: '
+Original acc: 15.00%, attacked acc: 1.25%, dropped acc: 13.75%
+
+## translation
+
+Language: chinese, acc: 16.25%, prompt: Calculate the answer to a mathematical question about {}.
+Language: chinese, acc: 15.62%, prompt: Calculate the value of {}.
+Language: chinese, acc: 16.25%, prompt: Please calculate the result of {}.
+Language: chinese, acc: 16.25%, prompt: According to {}, solve the following mathematical problem.
+Language: chinese, acc: 16.25%, prompt: Do the math with {}.
+Language: chinese, acc: 16.25%, prompt: Solve mathematical problems involving {}.
+Language: chinese, acc: 16.25%, prompt: Please calculate the answer involving {}.
+Language: chinese, acc: 16.88%, prompt: Solve for the following values based on {}.
+Language: chinese, acc: 16.25%, prompt: Calculate the following mathematical tasks using {}.
+Language: chinese, acc: 16.25%, prompt: Calculate the answer to the {} related question.
+Language: french, acc: 16.25%, prompt: Calculate the answer to the following mathematical question concerning {}.
+Language: french, acc: 15.62%, prompt: Calculate the result of {}.
+Language: french, acc: 15.00%, prompt: Please calculate the value of {}.
+Language: french, acc: 16.25%, prompt: According to {}, solve the following mathematical problem.
+Language: french, acc: 16.88%, prompt: Perform mathematical calculations with {}.
+Language: french, acc: 16.88%, prompt: Solve the mathematical problem involving {}.
+Language: french, acc: 16.25%, prompt: Please calculate the answer related to {}.
+Language: french, acc: 16.25%, prompt: According to {}, set the following value.
+Language: french, acc: 16.25%, prompt: Perform the following mathematical task using {}.
+Language: french, acc: 15.62%, prompt: Calculate the answer to the questions related to {}.
+Language: arabic, acc: 15.62%, prompt: Compute the answer to the next mathematical question about {}.
+Language: arabic, acc: 16.88%, prompt: Calculate {}.
+Language: arabic, acc: 16.25%, prompt: Please calculate {}.
+Language: arabic, acc: 16.25%, prompt: According to {}, solve the following mathematical problem.
+Language: arabic, acc: 16.88%, prompt: Do mathematical calculations using {}.
+Language: arabic, acc: 16.25%, prompt: A solution to the mathematical problem involving {}.
+Language: arabic, acc: 15.62%, prompt: Please calculate the answer regarding {}.
+Language: arabic, acc: 14.37%, prompt: According to {}, determine the next value.
+Language: arabic, acc: 17.50%, prompt: DO THE NEXT MATHEMATICAL JOB USING {}.
+Language: arabic, acc: 15.62%, prompt: Calculate the answer to questions related to {}.
+Language: spanish, acc: 16.25%, prompt: Compute the answer to the following mathematical question on {}.
+Language: spanish, acc: 15.00%, prompt: Compute the result of {}.
+Language: spanish, acc: 15.00%, prompt: Please calculate the value of {}.
+Language: spanish, acc: 16.25%, prompt: As {}, it solves the following mathematical problem.
+Language: spanish, acc: 16.25%, prompt: Performs mathematical calculations using {}.
+Language: spanish, acc: 16.88%, prompt: Solve the mathematical problem involving {}.
+Language: spanish, acc: 16.25%, prompt: Please calculate the answer related to {}.
+Language: spanish, acc: 15.62%, prompt: As {}, determine the next value.
+Language: spanish, acc: 16.25%, prompt: Perform the following mathematical task using {}.
+Language: spanish, acc: 16.25%, prompt: Compute the answer to questions related to {}.
+Language: japanese, acc: 16.25%, prompt: Calculate the answers to the math questions about {}.
+Language: japanese, acc: 15.62%, prompt: Calculate the value of {}.
+Language: japanese, acc: 15.00%, prompt: Please find the answer to {}.
+Language: japanese, acc: 16.88%, prompt: Based on {}, please solve the following mathematical problems.
+Language: japanese, acc: 18.12%, prompt: Use {} to perform mathematical calculations.
+Language: japanese, acc: 16.88%, prompt: Please solve the math problem that contains {}.
+Language: japanese, acc: 16.88%, prompt: Please calculate the answers related to {}.
+Language: japanese, acc: 18.12%, prompt: Based on {}, find the following values:
+Language: japanese, acc: 16.88%, prompt: Use {} to solve the following mathematical problem.
+Language: japanese, acc: 16.25%, prompt: Please calculate the answers to the questions related to {}.
+Language: korean, acc: 16.25%, prompt: Calculate the answer of the following math problem to {}.
+Language: korean, acc: 15.62%, prompt: Calculate the result of {}.
+Language: korean, acc: 15.00%, prompt: Please calculate the value of {}.
+Language: korean, acc: 15.00%, prompt: Work out the following math problems according to {}.
+Language: korean, acc: 17.50%, prompt: Use {} to proceed with mathematical calculations.
+Language: korean, acc: 16.25%, prompt: Work out a math problem involving {}.
+Language: korean, acc: 15.00%, prompt: Please calculate the answer to {}.
+Language: korean, acc: 15.00%, prompt: Try to get the following values according to {}.
+Language: korean, acc: 16.88%, prompt: Work out the next math task using {}.
+Language: korean, acc: 16.25%, prompt: Calculate the answer of the problem involving {}.
\ No newline at end of file