PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
AI-generated Key Points
- PromptBench is a robustness benchmark that measures the resilience of Large Language Models (LLMs) to adversarial prompts.
- The study focuses on different types of textual attacks targeting prompts at various levels: character, word, sentence, and semantic.
- Adversarial prompts are used in tasks such as sentiment analysis, natural language inference, reading comprehension, machine translation, and math problem-solving.
- The study evaluates 4,032 adversarial prompts across 8 tasks and 13 datasets with a total of 567,084 test samples.
- Contemporary LLMs are found to be vulnerable to adversarial prompts.
- Word frequency analysis is used to derive practical guidance for writing more robust prompts.
- Code and compiled prompts are publicly accessible for future research on prompt robustness.
- A visualization website is available for easy exploration of adversarial prompts.
- PromptBench distinguishes prompts by purpose and by whether labeled examples are required: task-oriented and role-oriented prompts, each in zero-shot (ZS) and few-shot (FS) settings.
- The evaluation includes multiple LLMs with different architectures and sizes across various tasks and domains.
- PromptBench comprises 8 diverse tasks with 13 public datasets covering areas such as sentiment analysis, grammar correctness detection, duplicate sentence detection, natural language inference, multi-task knowledge evaluation through multiple-choice questions, and reading comprehension.
- Datasets used include SST-2, CoLA, QQP, MRPC, MNLI, QNLI, RTE, WNLI, and MMLU.
- Overall findings from PromptBench provide insights into the robustness of LLMs to adversarial prompts and offer practical recommendations for prompt composition.
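The four prompt categories named in the key points (task-oriented or role-oriented, each zero-shot or few-shot) can be illustrated with a small sketch. The template wording, the `build_prompt` helper, and the `Sentence:`/`Answer:` layout are assumptions made for illustration and are not taken from the benchmark's own prompts or code.

```python
from typing import Sequence, Tuple

# Hypothetical instruction templates, written only for this illustration.
TASK_ORIENTED = (
    "Classify the sentiment of the following sentence as 'positive' or 'negative'."
)
ROLE_ORIENTED = (
    "As a sentiment classifier, decide whether the following sentence is "
    "'positive' or 'negative'."
)


def build_prompt(
    instruction: str,
    text: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a prompt for one input.

    With no examples the result is a zero-shot (ZS) prompt; with one or more
    labeled (sentence, label) pairs it is a few-shot (FS) prompt.
    """
    parts = [instruction]
    for ex_text, ex_label in examples:
        parts.append(f"Sentence: {ex_text}\nAnswer: {ex_label}")
    # The query always comes last, with the answer left open for the model.
    parts.append(f"Sentence: {text}\nAnswer:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    demos = [("A wonderful film.", "positive"), ("Dull and far too long.", "negative")]
    print(build_prompt(TASK_ORIENTED, "I loved every minute."))         # task-oriented, ZS
    print("-" * 40)
    print(build_prompt(ROLE_ORIENTED, "I loved every minute.", demos))  # role-oriented, FS
```

In this sketch only the instruction string differs between task-oriented and role-oriented prompts, so an attack on the prompt would perturb the instruction while the test sample itself stays unchanged.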
Authors: Kaijie Zhu, Jindong Wang, Jiaheng Zhou, Zichen Wang, Hao Chen, Yidong Wang, Linyi Yang, Wei Ye, Neil Zhenqiang Gong, Yue Zhang, Xing Xie
Abstract: The increasing reliance on Large Language Models (LLMs) across academia and industry necessitates a comprehensive understanding of their robustness to prompts. In response to this vital need, we introduce PromptBench, a robustness benchmark designed to measure LLMs' resilience to adversarial prompts. This study uses a plethora of adversarial textual attacks targeting prompts across multiple levels: character, word, sentence, and semantic. These prompts are then employed in diverse tasks, such as sentiment analysis, natural language inference, reading comprehension, machine translation, and math problem-solving. Our study generates 4,032 adversarial prompts, meticulously evaluated over 8 tasks and 13 datasets, with 567,084 test samples in total. Our findings demonstrate that contemporary LLMs are vulnerable to adversarial prompts. Furthermore, we present comprehensive analysis to understand the mystery behind prompt robustness and its transferability. We then offer insightful robustness analysis and pragmatic recommendations for prompt composition, beneficial to both researchers and everyday users. We make our code, prompts, and methodologies to generate adversarial prompts publicly accessible, thereby enabling and encouraging collaborative exploration in this pivotal field: https://github.com/microsoft/promptbench.
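To make the idea of a character-level attack on a prompt concrete, the following is a minimal sketch. It is a toy typo generator, not one of the attack methods used in the paper, and the names `char_level_perturb` and `relative_drop` are hypothetical. The relative accuracy drop is shown as one plausible way to quantify vulnerability; the abstract does not specify the metric the authors use.

```python
import random


def char_level_perturb(prompt: str, n_edits: int = 2, seed: int = 0) -> str:
    """Swap two adjacent inner letters in up to n_edits randomly chosen words.

    Only purely alphabetic words of length >= 4 are touched, and the first and
    last characters of each word are preserved, mimicking small typos.
    """
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    words = prompt.split(" ")
    candidates = [i for i, w in enumerate(words) if len(w) >= 4 and w.isalpha()]
    for i in rng.sample(candidates, min(n_edits, len(candidates))):
        w = words[i]
        j = rng.randrange(1, len(w) - 2)  # index of the first swapped letter
        words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def relative_drop(clean_acc: float, adv_acc: float) -> float:
    """Fraction of the clean-prompt accuracy lost under the perturbed prompt."""
    if clean_acc <= 0:
        raise ValueError("clean_acc must be positive")
    return (clean_acc - adv_acc) / clean_acc


if __name__ == "__main__":
    clean = "Classify the sentiment of the following sentence as positive or negative"
    print(char_level_perturb(clean, n_edits=3, seed=1))
    # Accuracies below are made-up placeholders, not results from the paper.
    print(relative_drop(clean_acc=0.90, adv_acc=0.45))
```

A real evaluation would replace the placeholder accuracies with scores obtained by running the same test samples through a model once with the clean prompt and once with the perturbed one.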