Session

A Session is a conversation instance between the user and UFO. It starts when the user initiates the first request and ends when the user has no further requests. UFO supports multiple requests within the same Session. Requests are processed sequentially, each by one Round of interaction, until all of the user's requests are fulfilled. The relationship between Session and Round is shown in the following figure:

[Figure: Session and Round]

Session Lifecycle

The lifecycle of a Session is as follows:

1. Session Initialization

A Session is initialized when the user starts a conversation with UFO. The Session object is created, along with a Context object that stores the conversation state shared across all Rounds within the Session. The first Round of interaction then begins, and the HostAgent processes the user's request to determine which application should fulfill it.

2. Session Processing

Once the Session is initialized, the Round of interaction begins. Each Round completes a single user request by orchestrating the HostAgent and AppAgent.

3. Next Round

After a Round completes, the Session asks the user for the next request and starts a new Round of interaction. This process continues until the user has no more requests. The core logic of a Session is shown below:

def run(self) -> None:
    """
    Run the session.
    """

    while not self.is_finished():

        round = self.create_new_round()
        if round is None:
            break
        round.run()

    if self.application_window is not None:
        self.capture_last_snapshot()

    if self._should_evaluate and not self.is_error():
        self.evaluation()

    self.print_cost()

4. Session Termination

If the user has no more requests or decides to end the conversation, the Session is terminated and the conversation ends. If evaluation is enabled and the Session did not end in an error state, the EvaluationAgent then evaluates whether the Session's requests were completed.
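The lifecycle above can be illustrated with a minimal, self-contained sketch. It is not UFO's implementation: `ToyRound`, `ToySession`, and `ListSession` are hypothetical stand-ins, and requests come from a fixed list rather than from a user prompt. It only shows how an abstract `create_new_round` hook drives the loop until no request is left.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class ToyRound:
    """Hypothetical stand-in for a Round: handles exactly one request."""

    def __init__(self, request: str) -> None:
        self.request = request
        self.done = False

    def run(self) -> None:
        # A real Round would orchestrate the HostAgent and AppAgent here.
        self.done = True


class ToySession(ABC):
    """Hypothetical stand-in for a Session: owns rounds and drives the loop."""

    def __init__(self, max_round: int = 10) -> None:
        self._rounds: Dict[int, ToyRound] = {}
        self._finish = False
        self._max_round = max_round

    def is_finished(self) -> bool:
        return self._finish or len(self._rounds) >= self._max_round

    @abstractmethod
    def create_new_round(self) -> Optional[ToyRound]:
        """Return the next round, or None when there is nothing left to do."""

    def run(self) -> None:
        while not self.is_finished():
            new_round = self.create_new_round()
            if new_round is None:
                break
            new_round.run()


class ListSession(ToySession):
    """Takes its requests from a fixed list instead of prompting a user."""

    def __init__(self, requests: List[str]) -> None:
        super().__init__()
        self._pending = list(requests)

    def create_new_round(self) -> Optional[ToyRound]:
        if not self._pending:
            self._finish = True
            return None
        new_round = ToyRound(self._pending.pop(0))
        self._rounds[len(self._rounds)] = new_round
        return new_round


session = ListSession(["open Notepad", "type hello"])
session.run()
completed = [r.request for r in session._rounds.values() if r.done]
```

Each request is handled by exactly one round, and the loop stops as soon as `create_new_round` returns `None`.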

Reference

Bases: ABC

A basic session in UFO. A session consists of multiple rounds of interactions and conversations.

Initialize a session.

Parameters:
  • task (str) –

    The name of the current task.

  • should_evaluate (bool) –

    Whether to evaluate the session.

  • id (int) –

    The id of the session.

Source code in module/basic.py
def __init__(self, task: str, should_evaluate: bool, id: int) -> None:
    """
    Initialize a session.
    :param task: The name of the current task.
    :param should_evaluate: Whether to evaluate the session.
    :param id: The id of the session.
    """

    self._should_evaluate = should_evaluate
    self._id = id

    # Logging-related properties
    self.log_path = f"logs/{task}/"
    utils.create_folder(self.log_path)

    self._rounds: Dict[int, BaseRound] = {}

    self._context = Context()
    self._init_context()
    self._finish = False
    self._results = {}

    self._host_agent: HostAgent = AgentFactory.create_agent(
        "host",
        "HostAgent",
        configs["HOST_AGENT"]["VISUAL_MODE"],
        configs["HOSTAGENT_PROMPT"],
        configs["HOSTAGENT_EXAMPLE_PROMPT"],
        configs["API_PROMPT"],
    )

application_window property writable

Get the application of the session.

Returns: The application of the session.

context property

Get the context of the session.

Returns: The context of the session.

cost property writable

Get the cost of the session.

Returns: The cost of the session.

current_round property

Get the current round of the session.

Returns: The current round of the session.

evaluation_logger property

Get the logger for evaluation.

Returns: The logger for evaluation.

id property

Get the id of the session.

Returns: The id of the session.

results property writable

Get the evaluation results of the session.

Returns: The evaluation results of the session.

rounds property

Get the rounds of the session.

Returns: The rounds of the session.

session_type property

Get the class name of the session.

Returns: The class name of the session.

step property

Get the step of the session.

Returns: The step of the session.

total_rounds property

Get the total number of rounds in the session.

Returns: The total number of rounds in the session.

_init_context()

Initialize the context of the session.

Source code in module/basic.py
def _init_context(self) -> None:
    """
    Initialize the context of the session.
    """

    # Initialize the ID
    self.context.set(ContextNames.ID, self.id)

    # Initialize the log path and the logger
    logger = self.initialize_logger(self.log_path, "response.log")
    request_logger = self.initialize_logger(self.log_path, "request.log")
    eval_logger = self.initialize_logger(self.log_path, "evaluation.log")

    self.context.set(ContextNames.LOG_PATH, self.log_path)

    self.context.set(ContextNames.LOGGER, logger)
    self.context.set(ContextNames.REQUEST_LOGGER, request_logger)
    self.context.set(ContextNames.EVALUATION_LOGGER, eval_logger)

    # Initialize the session cost and step
    self.context.set(ContextNames.SESSION_COST, 0)
    self.context.set(ContextNames.SESSION_STEP, 0)
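The pattern used by `_init_context` is a shared key/value store addressed by enum members rather than raw strings. The sketch below illustrates that idea only; `Names` and `ToyContext` are hypothetical stand-ins, not UFO's `ContextNames` and `Context` classes, and the loggers are left out.

```python
from enum import Enum
from typing import Any, Dict


class Names(Enum):
    """Hypothetical subset of the keys set during context initialization."""

    ID = "ID"
    LOG_PATH = "LOG_PATH"
    SESSION_COST = "SESSION_COST"
    SESSION_STEP = "SESSION_STEP"


class ToyContext:
    """Hypothetical stand-in for a context shared by all rounds."""

    def __init__(self) -> None:
        self._values: Dict[Names, Any] = {}

    def set(self, key: Names, value: Any) -> None:
        self._values[key] = value

    def get(self, key: Names, default: Any = None) -> Any:
        return self._values.get(key, default)


def init_context(context: ToyContext, session_id: int, log_path: str) -> None:
    # Mirror the order in the listing: identity, log location, then counters.
    context.set(Names.ID, session_id)
    context.set(Names.LOG_PATH, log_path)
    context.set(Names.SESSION_COST, 0)
    context.set(Names.SESSION_STEP, 0)


context = ToyContext()
init_context(context, session_id=7, log_path="logs/demo/")

# Any round holding the same context object sees and updates the same values.
context.set(Names.SESSION_STEP, context.get(Names.SESSION_STEP) + 1)
```

Using enum members as keys means a misspelled key fails with an `AttributeError` at the call site instead of silently creating a new entry.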

add_round(id, round)

Add a round to the session.

Parameters:
  • id (int) –

    The id of the round.

  • round (BaseRound) –

    The round to be added.

Source code in module/basic.py
def add_round(self, id: int, round: BaseRound) -> None:
    """
    Add a round to the session.
    :param id: The id of the round.
    :param round: The round to be added.
    """
    self._rounds[id] = round

capture_last_snapshot()

Capture the last snapshot of the application, including the screenshot and the XML file if configured.

Source code in module/basic.py
def capture_last_snapshot(self) -> None:
    """
    Capture the last snapshot of the application, including the screenshot and the XML file if configured.
    """

    # Capture the final screenshot
    screenshot_save_path = self.log_path + "action_step_final.png"

    if self.application_window is not None:

        try:
            PhotographerFacade().capture_app_window_screenshot(
                self.application_window, save_path=screenshot_save_path
            )

        except Exception as e:
            utils.print_with_color(
                f"Warning: The last snapshot capture failed, due to the error: {e}",
                "yellow",
            )

        if configs.get("SAVE_UI_TREE", False):
            step_ui_tree = ui_tree.UITree(self.application_window)

            ui_tree_path = os.path.join(self.log_path, "ui_trees")

            ui_tree_file_name = "ui_tree_final.json"

            step_ui_tree.save_ui_tree_to_json(
                os.path.join(
                    ui_tree_path,
                    ui_tree_file_name,
                )
            )

        if configs.get("SAVE_FULL_SCREEN", False):

            desktop_save_path = self.log_path + "desktop_final.png"

            # Capture the desktop screenshot for all screens.
            PhotographerFacade().capture_desktop_screen_screenshot(
                all_screens=True, save_path=desktop_save_path
            )

        # Save the final XML file
        if configs["LOG_XML"]:
            log_abs_path = os.path.abspath(self.log_path)
            xml_save_path = os.path.join(log_abs_path, "xml/action_step_final.xml")

            app_agent = self._host_agent.get_active_appagent()
            if app_agent is not None:
                app_agent.Puppeteer.save_to_xml(xml_save_path)

create_following_round()

Create a following round.

Returns: The following round.

Source code in module/basic.py
def create_following_round(self) -> BaseRound:
    """
    Create a following round.
    return: The following round.
    """
    pass

create_new_round() abstractmethod

Create a new round.

Source code in module/basic.py
@abstractmethod
def create_new_round(self) -> Optional[BaseRound]:
    """
    Create a new round.
    """
    pass

evaluation()

Evaluate the session.

Source code in module/basic.py
def evaluation(self) -> None:
    """
    Evaluate the session.
    """
    utils.print_with_color("Evaluating the session...", "yellow")
    evaluator = EvaluationAgent(
        name="eva_agent",
        app_root_name=self.context.get(ContextNames.APPLICATION_ROOT_NAME),
        is_visual=configs["APP_AGENT"]["VISUAL_MODE"],
        main_prompt=configs["EVALUATION_PROMPT"],
        example_prompt="",
        api_prompt=configs["API_PROMPT"],
    )

    requests = self.request_to_evaluate()

    # Evaluate the session, first use the default setting, if failed, then disable the screenshot evaluation.
    try:
        result, cost = evaluator.evaluate(
            request=requests,
            log_path=self.log_path,
            eva_all_screenshots=configs.get("EVA_ALL_SCREENSHOTS", True),
        )
    except Exception:
        result, cost = evaluator.evaluate(
            request=requests,
            log_path=self.log_path,
            eva_all_screenshots=False,
        )

    # Add additional information to the evaluation result.
    additional_info = {"level": "session", "request": requests, "id": 0}
    result.update(additional_info)

    self.results = result

    self.cost += cost

    evaluator.print_response(result)

    self.evaluation_logger.info(json.dumps(result))
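The try/except in `evaluation` is a retry-with-degraded-settings pattern: attempt the full evaluation first and, if it raises, retry once with screenshot evaluation disabled. A minimal sketch follows, assuming a hypothetical `flaky_evaluate` function in place of `EvaluationAgent.evaluate`. Unlike the listing, it hardcodes `True` for the first attempt instead of reading `EVA_ALL_SCREENSHOTS` from the configuration.

```python
from typing import Callable, Dict, Tuple

EvalResult = Tuple[Dict[str, str], float]


def evaluate_with_fallback(
    evaluate: Callable[..., EvalResult], request: str
) -> EvalResult:
    """Try the full evaluation first; on any failure retry without screenshots."""
    try:
        return evaluate(request=request, eva_all_screenshots=True)
    except Exception:
        # The second attempt is not guarded: if it also fails, the error propagates.
        return evaluate(request=request, eva_all_screenshots=False)


def flaky_evaluate(request: str, eva_all_screenshots: bool) -> EvalResult:
    """Hypothetical evaluator that fails whenever all screenshots are requested."""
    if eva_all_screenshots:
        raise RuntimeError("too many screenshots for the model context")
    return {"complete": "yes", "request": request}, 0.01


result, cost = evaluate_with_fallback(flaky_evaluate, "open Notepad")
```

As in the listing, the fallback call sits outside any further error handling, so a second failure surfaces to the caller.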

experience_saver()

Save the current trajectory as agent experience.

Source code in module/basic.py
def experience_saver(self) -> None:
    """
    Save the current trajectory as agent experience.
    """
    utils.print_with_color(
        "Summarizing and saving the execution flow as experience...", "yellow"
    )

    summarizer = ExperienceSummarizer(
        configs["APP_AGENT"]["VISUAL_MODE"],
        configs["EXPERIENCE_PROMPT"],
        configs["APPAGENT_EXAMPLE_PROMPT"],
        configs["API_PROMPT"],
    )
    experience = summarizer.read_logs(self.log_path)
    summaries, cost = summarizer.get_summary_list(experience)

    experience_path = configs["EXPERIENCE_SAVED_PATH"]
    utils.create_folder(experience_path)
    summarizer.create_or_update_yaml(
        summaries, os.path.join(experience_path, "experience.yaml")
    )
    summarizer.create_or_update_vector_db(
        summaries, os.path.join(experience_path, "experience_db")
    )

    self.cost += cost
    utils.print_with_color("The experience has been saved.", "magenta")

initialize_logger(log_path, log_filename, mode='a', configs=configs) staticmethod

Initialize logging.

Parameters:
  • log_path (str) –

    The path of the log file.

  • log_filename (str) –

    The name of the log file.

Returns: The logger.

Source code in module/basic.py
@staticmethod
def initialize_logger(
    log_path: str, log_filename: str, mode="a", configs=configs
) -> logging.Logger:
    """
    Initialize logging.
    log_path: The path of the log file.
    log_filename: The name of the log file.
    return: The logger.
    """
    # Code for initializing logging
    logger = logging.Logger(log_filename)

    if not configs["PRINT_LOG"]:
        # Remove existing handlers if PRINT_LOG is False
        logger.handlers = []

    log_file_path = os.path.join(log_path, log_filename)
    file_handler = logging.FileHandler(log_file_path, mode=mode, encoding="utf-8")
    formatter = logging.Formatter("%(message)s")
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    logger.setLevel(configs["LOG_LEVEL"])

    return logger
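The logger setup can be tried in isolation with only the standard library. The sketch below assumes a hypothetical helper `make_file_logger` and a fixed `logging.INFO` level in place of the `configs` lookup. It writes one message into a temporary directory and reads it back.

```python
import logging
import os
import tempfile


def make_file_logger(
    log_path: str, log_filename: str, mode: str = "a"
) -> logging.Logger:
    # Instantiating logging.Logger directly (rather than logging.getLogger)
    # gives a logger that is not registered in the global hierarchy, so it has
    # no parent and its records do not propagate to the root logger.
    logger = logging.Logger(log_filename)
    handler = logging.FileHandler(
        os.path.join(log_path, log_filename), mode=mode, encoding="utf-8"
    )
    # "%(message)s" writes the bare message, one record per line.
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    logger = make_file_logger(tmp, "response.log")
    logger.info('{"step": 1}')
    # Close the handlers so the file is flushed and can be removed on cleanup.
    for handler in logger.handlers:
        handler.close()
    with open(os.path.join(tmp, "response.log"), encoding="utf-8") as f:
        logged = f.read()
```

Because the formatter emits only the message, a log file written this way holds exactly one entry per line, which suits JSON-per-line records such as the evaluation result.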

is_error()

Check if the session is in error state.

Returns: True if the session is in error state, otherwise False.

Source code in module/basic.py
def is_error(self) -> bool:
    """
    Check if the session is in error state.
    return: True if the session is in error state, otherwise False.
    """
    if self.current_round is not None:
        return self.current_round.state.name() == AgentStatus.ERROR.value
    return False

is_finished()

Check if the session is ended.

Returns: True if the session is ended, otherwise False.

Source code in module/basic.py
def is_finished(self) -> bool:
    """
    Check if the session is ended.
    return: True if the session is ended, otherwise False.
    """
    if (
        self._finish
        or self.step >= configs["MAX_STEP"]
        or self.total_rounds >= configs["MAX_ROUND"]
    ):
        return True

    if self.is_error():
        return True

    return False
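The termination rule combines four independent conditions: an explicit finish flag, a step budget, a round budget, and the error state. A minimal sketch of the same predicate as a pure function follows; `Limits` and the limit values are hypothetical and stand in for the `MAX_STEP` and `MAX_ROUND` configuration entries.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Hypothetical container for the MAX_STEP and MAX_ROUND settings."""

    max_step: int
    max_round: int


def is_finished(
    finish_flag: bool,
    step: int,
    total_rounds: int,
    in_error: bool,
    limits: Limits,
) -> bool:
    # Any single condition is enough to end the session.
    return (
        finish_flag
        or step >= limits.max_step
        or total_rounds >= limits.max_round
        or in_error
    )


limits = Limits(max_step=100, max_round=10)

still_running = is_finished(False, step=5, total_rounds=1, in_error=False, limits=limits)
out_of_steps = is_finished(False, step=100, total_rounds=1, in_error=False, limits=limits)
out_of_rounds = is_finished(False, step=5, total_rounds=10, in_error=False, limits=limits)
errored = is_finished(False, step=5, total_rounds=1, in_error=True, limits=limits)
```

Both budgets use `>=`, so a session stops as soon as a counter reaches its limit rather than after exceeding it.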

next_request() abstractmethod

Get the next request of the session.

Returns: The request of the session.

Source code in module/basic.py
@abstractmethod
def next_request(self) -> str:
    """
    Get the next request of the session.
    return: The request of the session.
    """
    pass

print_cost()

Print the total cost of the session.

Source code in module/basic.py
def print_cost(self) -> None:
    """
    Print the total cost of the session.
    """

    if isinstance(self.cost, float) and self.cost > 0:
        formatted_cost = "${:.2f}".format(self.cost)
        utils.print_with_color(
            f"Total request cost of the session: {formatted_cost}", "yellow"
        )
    else:
        utils.print_with_color(
            "Cost is not available for the model {host_model} or {app_model}.".format(
                host_model=configs["HOST_AGENT"]["API_MODEL"],
                app_model=configs["APP_AGENT"]["API_MODEL"],
            ),
            "yellow",
        )
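The branching in `print_cost` depends on both the type and the sign of the cost. The sketch below isolates that check in a hypothetical `format_cost` helper that returns the message instead of printing it. The fallback text is simplified and does not include the model names.

```python
def format_cost(cost: object) -> str:
    """Return the cost message using the same check as the listing."""
    # Only a positive float counts as a known cost. The context initializes
    # the session cost to the integer 0, which falls through to the fallback.
    if isinstance(cost, float) and cost > 0:
        return "Total request cost of the session: ${:.2f}".format(cost)
    return "Cost is not available."


known = format_cost(0.1234)
initial = format_cost(0)
whole_number = format_cost(3)
```

Note that an integer cost such as `3` is also reported as unavailable, because the `isinstance(cost, float)` check rejects it.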

request_to_evaluate() abstractmethod

Get the request to evaluate.

Returns: The request(s) to evaluate.

Source code in module/basic.py
@abstractmethod
def request_to_evaluate(self) -> str:
    """
    Get the request to evaluate.
    return: The request(s) to evaluate.
    """
    pass

run()

Run the session.

Source code in module/basic.py
def run(self) -> None:
    """
    Run the session.
    """

    while not self.is_finished():

        round = self.create_new_round()
        if round is None:
            break
        round.run()

    if self.application_window is not None:
        self.capture_last_snapshot()

    if self._should_evaluate and not self.is_error():
        self.evaluation()

    if configs.get("LOG_TO_MARKDOWN", True):

        file_path = self.log_path
        trajectory = Trajectory(file_path)
        trajectory.to_markdown(file_path + "/output.md")

    self.print_cost()