You asked, we listened: We just launched a new version of Prompt Fuzzer. More interactive, more modular and designed to integrate with your GenAI development processes.
The first of its kind, Prompt Fuzzer is an interactive, open-source tool that empowers developers of GenAI applications to evaluate and enhance the resilience and safety of their system prompts in a user-friendly way. The users input any system prompt and the relevant configuration, and the Fuzzer starts running its tests. As part of the evaluation, the applications' system prompt gets exposed to various dynamic LLM-based attacks. Examples of the simulated attacks are sophisticated prompt injections, system prompt leaks, jailbreaks, harmful content elicitations, ethical compliance, and many others. The tool offers security evaluations based on test outcomes, enabling developers to fortify the system prompts as needed.
The Prompt Fuzzer got a tremendous response from the community since we first launched it in April of this year, with hundreds of monthly active users, and it has already helped improve the security and safety of tens of thousands of system prompts.
And now, it just got better.
The latest version of our GenAI application vulnerability assessment tool is now more modular, customizable and robust, ensuring you can tailor it to your unique GenAI development needs and processes.
"Prompt Fuzzer lets me fortify my LLM applications so easily! This means I get to spend more time doing what I love: crafting amazing experiences for my users."
- Jordan Legg, Chief AI Officer at takara.ai
So what’s new in Prompt Fuzzer?
Custom Benchmark Interface
This feature allows users to bring their own benchmark to fuzz their system prompt. The benchmark should be in CSV format and include "prompt" and "response" columns. This enhancement offers greater flexibility and customization for users looking to test their specific scenarios.
Subset Test Interface for More Targeted Testing
To improve the speed and efficiency of the system prompt refinement process, we have introduced an interface to run a subset of tests. Users can now run only a subset of tests iteratively to fix localized problems, saving both time and tokens. This feature is particularly useful for targeted testing and refining specific areas of concern.
Improved Accuracy Using Response Similarity Evaluation
We have upgraded our response similarity evaluation. Previously, we looked for refusal words within the response. Now, we have added a function to evaluate response similarity to the expected response across several dataset-based tests and custom benchmark tests. This allows for better accuracy when checking the results of testing several prompts and ensuring they match their expected responses.
More structured security Testing and Development Using Google Colab Notebook
To streamline the prompt refinement process, we have created a Google Colab notebook. This notebook includes the entire prompt refinement process—from initial fuzzing through refinement and localized testing to regression testing and the end result. The Fuzzer’s Google Colab notebook makes it more efficient and faster for developers to enhance the security and development speed of their GenAI applications.
Lastly, you asked for darkmode demos, you got them! ;-)
Prompt Security is dedicated to ensuring the safe and secure adoption of GenAI across all aspects of your organization, so we want this tool to serve as a starting point to help everyone developing GenAI applications.
Moreover, in line with our commitment to fostering a collaborative GenAI security community, we invite you to share your feedback and contributions to this open source project. They will be invaluable for us and the community at large!
Happy Fuzzing! https://www.prompt.security/fuzzer