promptflow.evals.evaluate module#

promptflow.evals.evaluate.evaluate(*, evaluation_name: Optional[str] = None, target: Optional[Callable] = None, data: Optional[str] = None, evaluators: Optional[Dict[str, Callable]] = None, evaluator_config: Optional[Dict[str, Dict[str, str]]] = None, azure_ai_project: Optional[Dict] = None, output_path: Optional[str] = None, **kwargs)#
Evaluates target or data with built-in or custom evaluators. If both target and data are provided, the data will be run through the target function and the results will then be evaluated.

Parameters:
  • evaluation_name (Optional[str]) – Display name of the evaluation.

  • target (Optional[Callable]) – Target function to be evaluated. target and data cannot both be None.

  • data (Optional[str]) – Path to the data to be evaluated, or passed to target if target is set. Only .jsonl files are supported. target and data cannot both be None.

  • evaluators (Optional[Dict[str, Callable]]) – Evaluators to be used for evaluation. It should be a dictionary with evaluator aliases as keys and evaluator functions as values.

  • evaluator_config (Optional[Dict[str, Dict[str, str]]]) – Configuration for evaluators. It should be a dictionary with evaluator names as keys and dictionaries of column mappings as values. Each column mapping should be a dictionary with the evaluator's input names as keys and the corresponding column names in the input data, or in the data generated by target, as values.

  • output_path (Optional[str]) – The local folder or file path to save evaluation results to, if set. If a folder path is provided, the results will be saved to a file named evaluation_results.json in that folder.

  • azure_ai_project (Optional[Dict]) – Azure AI project details. If set, evaluation results are also logged to AI Studio (see the sketch after this parameter list).
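
The shape of azure_ai_project is not spelled out above. A minimal sketch follows; the key names are assumptions based on the usual Azure AI project scope (subscription, resource group, project) and should be verified against your installed version:

azure_ai_project = {
    "subscription_id": "<your-subscription-id>",       # assumed key name
    "resource_group_name": "<your-resource-group>",    # assumed key name
    "project_name": "<your-ai-studio-project-name>",   # assumed key name
}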

Returns:

Evaluation results.

Return type:

dict

Example:

The evaluate API can be used as follows:

import os

from promptflow.core import AzureOpenAIModelConfiguration
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import RelevanceEvaluator, CoherenceEvaluator


# Model configuration for the AI-assisted evaluators, read from environment variables.
model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_KEY"),
    azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT")
)

# Instantiate the built-in evaluators with the shared model configuration.
coherence_eval = CoherenceEvaluator(model_config=model_config)
relevance_eval = RelevanceEvaluator(model_config=model_config)

path = "evaluate_test_data.jsonl"

# Run both evaluators over every row of the test data; evaluator_config maps
# each evaluator's input names to columns in the data file.
result = evaluate(
    data=path,
    evaluators={
        "coherence": coherence_eval,
        "relevance": relevance_eval,
    },
    evaluator_config={
        "coherence": {
            "answer": "${data.answer}",
            "question": "${data.question}"
        },
        "relevance": {
            "answer": "${data.answer}",
            "context": "${data.context}",
            "question": "${data.question}"
        }
    }
)
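
If a target is supplied, evaluate first runs the data through it and then evaluates the combined output. The sketch below is illustrative, not a documented recipe: apply_chat_target is a hypothetical callable, the ${target.<column>} mappings assume the target returns a dictionary with answer and context keys, and output_path is an optional addition described in the parameter list above.

# Hypothetical target: receives columns from the data file as keyword
# arguments and returns the fields the evaluators need.
def apply_chat_target(question: str) -> dict:
    return {"answer": f"Echo: {question}", "context": "No context available."}

result_with_target = evaluate(
    evaluation_name="chat-target-eval",
    target=apply_chat_target,
    data=path,
    evaluators={"relevance": relevance_eval},
    evaluator_config={
        "relevance": {
            # Columns produced by the target are referenced as ${target.<column>},
            # columns read directly from the data file as ${data.<column>}.
            "answer": "${target.answer}",
            "context": "${target.context}",
            "question": "${data.question}",
        }
    },
    # Optional: also persist the results to a local file.
    output_path="evaluation_results.json",
)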