PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts Paper • 2306.04528 • Published Jun 7, 2023 • 3
A Survey on Evaluation of Large Language Models Paper • 2307.03109 • Published Jul 6, 2023 • 42
Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning Paper • 2308.02533 • Published Aug 1, 2023
Large Language Models Understand and Can be Enhanced by Emotional Stimuli Paper • 2307.11760 • Published Jul 14, 2023 • 1
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks Paper • 2309.17167 • Published Sep 29, 2023 • 1
PromptBench: A Unified Library for Evaluation of Large Language Models Paper • 2312.07910 • Published Dec 13, 2023 • 15