OpenAI Evals
PromptPex supports exporting the generated tests into an OpenAI Evals run. PromptPex will generate an eval and launch an eval run for each Model Under Test (MUT) in the test generation.
Configuration
To enable this mode, you need to:
- set the OPENAI_API_KEY environment variable to your OpenAI API key
- set the createEvalRuns parameter to true in the web interface or on the command line
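For example, a minimal shell sketch of this setup might look like the following. The exact PromptPex CLI invocation and the syntax for passing createEvalRuns are assumptions made for illustration; check the CLI reference for the precise command and flags.

```sh
# Provide your OpenAI API key (required for creating eval runs).
export OPENAI_API_KEY="sk-..."

# Enable eval run creation when generating tests.
# NOTE: the command and parameter syntax below are illustrative only;
# consult the PromptPex CLI documentation for how to set createEvalRuns=true.
promptpex my-prompt.prompty --vars createEvalRuns=true
```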
The OpenAI models that can be used as the Model Under Test are listed at OpenAI Models.
Here’s a video showing OpenAI Evals in action. In the demo, we show how PromptPex can generate a test that measures how effectively two OpenAI models understand sarcasm.