promptflow.evals.evaluate module
- promptflow.evals.evaluate.evaluate(*, evaluation_name: Optional[str] = None, target: Optional[Callable] = None, data: Optional[str] = None, evaluators: Optional[Dict[str, Callable]] = None, evaluator_config: Optional[Dict[str, Dict[str, str]]] = None, azure_ai_project: Optional[Dict] = None, output_path: Optional[str] = None, **kwargs)
- Evaluates a target or data with built-in or custom evaluators. If both target and data are provided, the data is run through the target function and the results are then evaluated (see the target sketch after the example below).
- Parameters:
evaluation_name (Optional[str]) – Display name of the evaluation.
target (Optional[Callable]) – Target function to be evaluated. target and data cannot both be None.
data (Optional[str]) – Path to the data to be evaluated, or to be passed to target if target is set. Only .jsonl files are supported. target and data cannot both be None.
evaluators (Optional[Dict[str, Callable]]) – Evaluators to be used for evaluation. A dictionary whose keys are aliases for the evaluators and whose values are the evaluator functions.
evaluator_config (Optional[Dict[str, Dict[str, str]]]) – Configuration for evaluators. A dictionary whose keys are evaluator aliases (as used in evaluators) and whose values are column mappings. Each column mapping is a dictionary whose keys are the column names expected by the evaluator and whose values are the column names in the input data or in the data generated by target.
output_path (Optional[str]) – The local folder or file path to save evaluation results to, if set. If a folder path is provided, the results are saved to a file named evaluation_results.json in that folder.
azure_ai_project (Optional[Dict]) – Azure AI project details; if set, evaluation results are logged to AI Studio (see the sketch after this parameter list).
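A minimal sketch of combining azure_ai_project and output_path is shown below. The key names inside the azure_ai_project dictionary (subscription_id, resource_group_name, project_name) are an assumption here, not taken from this page; verify them against your promptflow version.

```python
# Sketch: log results to AI Studio and also save them locally.
# The azure_ai_project key names below are assumed; confirm them for your SDK version.
import os

from promptflow.core import AzureOpenAIModelConfiguration
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import CoherenceEvaluator

model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_KEY"),
    azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
)

azure_ai_project = {
    "subscription_id": "<subscription-id>",      # assumed key name
    "resource_group_name": "<resource-group>",   # assumed key name
    "project_name": "<ai-studio-project-name>",  # assumed key name
}

result = evaluate(
    data="evaluate_test_data.jsonl",
    evaluators={"coherence": CoherenceEvaluator(model_config=model_config)},
    azure_ai_project=azure_ai_project,                # log results to AI Studio
    output_path="./results/evaluation_results.json",  # also write results to a local file
)
```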
- Returns:
Evaluation results.
- Return type:
dict
- Example:
The evaluate API can be used as follows:

```python
import os

from promptflow.core import AzureOpenAIModelConfiguration
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import RelevanceEvaluator, CoherenceEvaluator

model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_KEY"),
    azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
)

coherence_eval = CoherenceEvaluator(model_config=model_config)
relevance_eval = RelevanceEvaluator(model_config=model_config)

path = "evaluate_test_data.jsonl"
result = evaluate(
    data=path,
    evaluators={
        "coherence": coherence_eval,
        "relevance": relevance_eval,
    },
    evaluator_config={
        "coherence": {
            "answer": "${data.answer}",
            "question": "${data.question}",
        },
        "relevance": {
            "answer": "${data.answer}",
            "context": "${data.context}",
            "question": "${data.question}",
        },
    },
)
```
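When a target is supplied, columns produced by the target can be referenced in evaluator_config alongside input-data columns. The sketch below assumes a hypothetical application entry point named answer_question, and assumes that target outputs are referenced with a "${target.<output>}" mapping mirroring the "${data.<column>}" syntax shown above.

```python
# Sketch: evaluate a live target together with a dataset.
# answer_question is a hypothetical app entry point; "${target.answer}" assumes the
# target-output mapping syntax mirrors the "${data.*}" syntax from the example above.
import os

from promptflow.core import AzureOpenAIModelConfiguration
from promptflow.evals.evaluate import evaluate
from promptflow.evals.evaluators import CoherenceEvaluator

model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_KEY"),
    azure_deployment=os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
)


def answer_question(question: str) -> dict:
    """Hypothetical app under test: takes a question from the data, returns named outputs."""
    return {"answer": f"Echo: {question}"}


result = evaluate(
    target=answer_question,
    data="evaluate_test_data.jsonl",  # each .jsonl line supplies the target's inputs, e.g. {"question": "..."}
    evaluators={"coherence": CoherenceEvaluator(model_config=model_config)},
    evaluator_config={
        "coherence": {
            "question": "${data.question}",  # column taken from the input data
            "answer": "${target.answer}",    # column produced by the target (assumed mapping syntax)
        }
    },
)
```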