Prompts are an important part of any software project that incorporates the power of AI models. As a result, tools to help developers create and maintain effective prompts are increasingly important.
PromptPex is a tool for exploring and testing AI model prompts. PromptPex is intended to be used by developers who have prompts as part of their code base. PromptPex treats a prompt as a function and automatically generates test inputs to the function to support unit testing.
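The "prompt as a function" idea can be sketched in plain Python. This is an illustrative mock, not PromptPex's actual API: the prompt template, `render_prompt` helper, and hand-written inputs below are hypothetical stand-ins for what PromptPex generates with an LLM.

```python
# Hypothetical sketch: a prompt template treated as a function of its
# inputs, so generated test inputs can drive ordinary unit tests.
PROMPT = "Summarize the following text in one sentence:\n{text}"

def render_prompt(text: str) -> str:
    """Treat the prompt as a function: test input in, rendered prompt out."""
    return PROMPT.format(text=text)

# In PromptPex these inputs would be generated automatically by an LLM
# to cover the prompt's behavior; here they are placeholders.
generated_inputs = [
    "A short note about the weather.",
    "",                 # edge case: empty input
    "word " * 500,      # edge case: very long input
]

def run_tests() -> int:
    """Run each generated input through the prompt-function."""
    passed = 0
    for text in generated_inputs:
        rendered = render_prompt(text)
        assert rendered.startswith("Summarize")
        passed += 1
    return passed
```

In a real setup, each rendered prompt would be sent to a model and the response checked against output rules, as described below.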
Automated Test Generation
PromptPex uses an LLM-based process to generate a set of inputs that capture the behavior of your prompt.
Output Rules Evaluation
Evaluate your prompt’s output compliance against a set of rules that are automatically extracted from your prompt, or define your own metrics.
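Rule-based evaluation can be illustrated with a minimal sketch. The rule names and checks below are made up for illustration; in PromptPex such rules are extracted from the prompt by an LLM, and compliance is judged by a model rather than by string heuristics.

```python
# Hypothetical sketch: score a model output against a set of output rules.
# Each rule maps a name to a pass/fail check.
rules = {
    "is_nonempty": lambda out: len(out.strip()) > 0,
    "is_one_sentence": lambda out: out.count(".") <= 1,
}

def evaluate_output(output: str) -> dict:
    """Return a per-rule pass/fail report for one model output."""
    return {name: check(output) for name, check in rules.items()}

report = evaluate_output("The text describes a short note.")
```

A failing rule in `report` pinpoints which extracted requirement the model's output violated.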
Groundtruth
Generate expected (groundtruth) outputs for tests using an AI model, then evaluate the outputs of a list of other models against that groundtruth.
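The groundtruth workflow can be sketched as follows. The model names and canned outputs are hypothetical, and the crude string-similarity score stands in for the model-based evaluation PromptPex actually performs.

```python
# Hypothetical sketch: one model's output serves as the groundtruth,
# and other models' outputs are scored against it.
from difflib import SequenceMatcher

def score_against_groundtruth(groundtruth: str, candidate: str) -> float:
    """Crude similarity in [0, 1]; PromptPex uses LLM-based evaluation."""
    return SequenceMatcher(None, groundtruth, candidate).ratio()

# Placeholder outputs for a question-answering test case.
groundtruth = "Paris is the capital of France."
candidates = {
    "model-a": "Paris is the capital of France.",
    "model-b": "The capital of France is Paris.",
}

scores = {
    name: score_against_groundtruth(groundtruth, out)
    for name, out in candidates.items()
}
```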
Integrated in the GitHub Models CLI
Generate test data for GitHub Models Evals.
Export to OpenAI Evals
Export generated tests and metrics using (Azure) OpenAI Evals.
Azure OpenAI Store Completions
Use generated tests to distill smaller models via Azure OpenAI Stored Completions.
Bring your own LLM library
Integrate with your own LLM library, or use our GenAIScript / Python implementations.
Flexible Generation
Use scenarios, test expansions, or custom instructions to configure the test generation.