Getting started with flex flow#

Learning Objectives - Upon completing this tutorial, you should be able to:

Write an LLM application in a notebook and visualize its trace.
Convert the application into a flow and batch run it against multiple lines of data.

0. Install dependent packages#

%%capture --no-stderr
%pip install -r ./requirements.txt

1. Trace your application with promptflow#

Assume we already have a Python function that calls the OpenAI API.

with open("llm.py") as fin:
    print(fin.read())

Note: before running the cell below, configure the required environment variables AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT by creating a .env file. Refer to ../.env.example as a template.
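
A quick check like the one below can confirm that both variables are visible to the notebook process before calling the model. This is a minimal sketch using only the standard library; the helper name missing_env_vars and the constant REQUIRED_VARS are hypothetical and not part of promptflow. If the values live only in a .env file, they must be loaded first (for example with load_dotenv from python-dotenv) or they will be reported as missing.

```python
import os
from typing import Iterable, List, Mapping

# The two variables this tutorial asks you to configure.
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT")


def missing_env_vars(names: Iterable[str], env: Mapping[str, str]) -> List[str]:
    """Return the names that are absent or empty in the given mapping."""
    return [name for name in names if not env.get(name)]


# Check the current process environment; an empty list means both are set.
missing = missing_env_vars(REQUIRED_VARS, os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Passing the environment in as a mapping keeps the helper easy to test with a plain dict instead of the real os.environ.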

# control the AOAI deployment (model) used in this example
deployment_name = "gpt-4o"

from llm import my_llm_tool

# configure the `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT` environment variables first
result = my_llm_tool(
    prompt="Write a simple Hello, world! program that displays the greeting message when executed. Output code only.",
    deployment_name=deployment_name,
)
result

Visualize trace by using start_trace#

Note that the my_llm_tool function is decorated with @trace, so re-running the cell below collects a trace that you can view in the trace UI.

from promptflow.tracing import start_trace

# start a trace session and print a URL where the trace can be viewed
start_trace()
# rerun the function, which will be recorded in the trace
result = my_llm_tool(
    prompt="Write a simple Hello, world! program that displays the greeting message when executed. Output code only.",
    deployment_name=deployment_name,
)
result
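
To build intuition for what a tracing decorator captures, here is a minimal standard-library sketch. It is not promptflow's implementation: simple_trace, RECORDED_SPANS, and fake_llm_tool are hypothetical names used only for illustration, and the real @trace decorator records considerably more detail than this.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# In-memory list standing in for a trace store (hypothetical, for illustration).
RECORDED_SPANS: List[Dict[str, Any]] = []


def simple_trace(func: Callable) -> Callable:
    """Record the name, inputs, output, and duration of each call to func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = func(*args, **kwargs)
        RECORDED_SPANS.append(
            {
                "name": func.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": output,
                "duration_s": time.perf_counter() - start,
            }
        )
        return output

    return wrapper


@simple_trace
def fake_llm_tool(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return f"echo: {prompt}"


fake_llm_tool(prompt="hello")
print(RECORDED_SPANS[-1]["name"], RECORDED_SPANS[-1]["output"])
```

The decorated function behaves exactly as before from the caller's point of view, which is why adding a decorator to my_llm_tool does not change how the cell above calls it.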

Now, let’s add another layer of function calls. programmer.py defines a function called write_simple_program, which calls a new function, load_prompt, and the previous my_llm_tool function.

# show the programmer.py content
with open("programmer.py") as fin:
    print(fin.read())

# call the flow entry function
from programmer import write_simple_program

result = write_simple_program("Java Hello, world!")
result

Setup model configuration with environment variables#

When running locally, create a model configuration object from environment variables.

import os
from dotenv import load_dotenv

from promptflow.core import AzureOpenAIModelConfiguration

if "AZURE_OPENAI_API_KEY" not in os.environ:
    # load environment variables from .env file
    load_dotenv()

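# both variables are read below, so fail early if either one is missing
for name in ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT"):
    if name not in os.environ:
        raise Exception(f"Please specify environment variable: {name}")

model_config = AzureOpenAIModelConfiguration(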
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    azure_deployment=deployment_name,
    api_version="2023-07-01-preview",
)

Evaluate the result#

%load_ext autoreload
%autoreload 2

import paths  # add the code_quality module to the path
from code_quality import CodeEvaluator

evaluator = CodeEvaluator(model_config=model_config)
eval_result = evaluator(result)
eval_result

2. Batch run the function as a flow with multi-line data#

Create a flow.flex.yaml file to define a flow whose entry points to the Python function we defined.

# show the flow.flex.yaml content
with open("flow.flex.yaml") as fin:
    print(fin.read())
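
Assuming the entry in the YAML file is written in the module:function form (for example, a module name followed by write_simple_program), the sketch below shows one way such a string could be resolved to a callable. The helper resolve_entry is hypothetical and is not promptflow's loader, which may work differently.

```python
import importlib
from typing import Callable


def resolve_entry(entry: str) -> Callable:
    """Resolve a 'module:function' string to the callable it names."""
    module_name, _, attr_name = entry.partition(":")
    if not module_name or not attr_name:
        raise ValueError(f"Expected 'module:function', got {entry!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Demonstrated with a standard-library function so the sketch runs anywhere.
dumps = resolve_entry("json:dumps")
print(dumps({"text": "Java Hello, world!"}))
```

Because the entry is just a reference to an ordinary function, the same function can be called directly in a notebook or handed to a batch run.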

Batch run with a data file (with multiple lines of test data)#

from promptflow.client import PFClient

pf = PFClient()

data = "./data.jsonl"  # path to the data file
# create run with the flow function and data
base_run = pf.run(
    flow=write_simple_program,
    data=data,
    column_mapping={
        "text": "${data.text}",
    },
    stream=True,
)

details = pf.get_details(base_run)
details.head(10)
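
The column mapping above ties each function argument to a field in a line of the data file. The following standard-library sketch illustrates that idea under stated assumptions: the sample rows are made up (the real data.jsonl may contain different text), and read_jsonl and apply_mapping are hypothetical helpers, not promptflow APIs. It only handles references of the form ${data.<field>} and passes any other value through as a literal.

```python
import json
import re
from typing import Any, Dict, Iterator

# Hypothetical sample rows; the real data.jsonl may contain different text.
SAMPLE_JSONL = """\
{"text": "Python Hello, world!"}
{"text": "C Hello, world!"}
"""

# Matches references such as ${data.text} and captures the field name.
_DATA_REF = re.compile(r"^\$\{data\.(\w+)\}$")


def read_jsonl(content: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per non-empty line of JSON Lines text."""
    for line in content.splitlines():
        if line.strip():
            yield json.loads(line)


def apply_mapping(row: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Build keyword arguments for one row from a column mapping."""
    kwargs = {}
    for arg_name, ref in mapping.items():
        match = _DATA_REF.match(ref)
        # A matching reference is looked up in the row; anything else is a literal.
        kwargs[arg_name] = row[match.group(1)] if match else ref
    return kwargs


column_mapping = {"text": "${data.text}"}
for row in read_jsonl(SAMPLE_JSONL):
    print(apply_mapping(row, column_mapping))
```

Each printed dict corresponds to the keyword arguments one line of data would supply to the flow function.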

3. Evaluate your flow#

Then you can use an evaluation method to evaluate your flow. Evaluation methods are also flows, which usually use an LLM to assert that the produced output matches certain expectations.

Run evaluation on the previous batch run#

The base_run is the batch run we completed in step 2 above, which ran the write_simple_program flow with “data.jsonl” as input.

# a flow can also be run by pointing to its YAML file
eval_flow = "../eval-code-quality/flow.flex.yaml"

eval_run = pf.run(
    flow=eval_flow,
    init={"model_config": model_config},
    data="./data.jsonl",  # path to the data file
    run=base_run,  # specify base_run as the run you want to evaluate
    column_mapping={
        "code": "${run.outputs.output}",
    },
    stream=True,
)

details = pf.get_details(eval_run)
details.head(10)

import json

metrics = pf.get_metrics(eval_run)
print(json.dumps(metrics, indent=4))
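
Metrics for an evaluation run summarize the per-line results. As a rough illustration of that kind of aggregation, the sketch below averages numeric scores across lines. The field names correctness and readability, the sample values, and the helper aggregate_scores are all assumptions made for this example; the actual evaluator may report different fields and compute them differently.

```python
from statistics import mean
from typing import Dict, List


def aggregate_scores(line_results: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each numeric field across lines and report the line count."""
    if not line_results:
        return {"total": 0}
    metrics: Dict[str, float] = {"total": len(line_results)}
    # Assumes every line result has the same keys as the first one.
    for key in line_results[0]:
        metrics[f"average_{key}"] = mean(r[key] for r in line_results)
    return metrics


# Hypothetical per-line scores; the real evaluator's field names may differ.
sample = [
    {"correctness": 5, "readability": 4},
    {"correctness": 3, "readability": 4},
]
print(aggregate_scores(sample))
```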

pf.visualize([base_run, eval_run])

Next steps#

By now you’ve successfully run your first prompt flow and even evaluated it. That’s great!

You can check out more examples:

Basic Chat: demonstrates how to create a chatbot that can remember previous interactions and use the conversation history to generate the next message.