What makes a good agent reasoning framework?

January 20, 2025 · 21 min read

Principal Researcher

An agent can listen to the user's request, understand the context, make plans, take actions, observe the results, and respond to the user. Its behavior is driven by the reasoning process, which is the core of the agent's intelligence.

There are many techniques developed to build an agent reasoning framework. For example, Chain-of-Thought (CoT) is one of the most popular reasoning approaches that uses a chain of thought to generate next-step actions. Another technique is ReAct, where the agent reflect on its action results to adjust its future actions. In TaskWeaver, one of the key reasoning technique is task decomposition and tracking, where the agent breaks down a task into subtasks and keeps track of the progress of each subtask. The following snippet shows an example of task decomposition and tracking in TaskWeaver:

├─► [init_plan]
│   1. Request historical stock price data for MSFT from the user
│   2. Request the forecasting model or method to be used for prediction <interactively depends on 1>
│   3. Perform the prediction using the provided data and model <interactively depends on 2>
│   4. Report the prediction results to the user <interactively depends on 3>
├─► [plan]
│   1. Request historical stock price data for MSFT from the user
│   2. Request the forecasting model or method to be used for prediction
│   3. Perform the prediction using the provided data and model
│   4. Report the prediction results to the user
├─► [current_plan_step] 1. Request historical stock price data for MSFT from the user

The agent lists the steps of the plan and the current step it is working on. While the agent is working on the task, it keeps track of the current step. This is useful especially when the task is complex and requires multiple steps to complete. The agent can refer to the current step to know what to do next. Without task decomposition and tracking, the agent may lose track of the task and fail to complete it.

TaskWeaver also support ReAct-like reasoning. This is done by introducing multiple roles in the problem-solving process, especially the Planner and CodeInterpreter roles. The Planner role is responsible for planning the steps of the task, while the CodeInterpreter role is responsible for interpreting the code generated by the Planner. The Planner can reflect on the results of the CodeInterpreter and adjust the plan accordingly. In one round, the Planner and Code Interpreter can work together to complete a subtask with multiple interactions.

However, there are still situations where the agent reasoning framework can be improved. If you have read the discussion above, you may have noticed that CoT is not implemented in TaskWeaver.

Why?

In this article, we are focused on the reasoning in the Planner. In fact, the Code Interpreter has CoT implemented as shown in the following snippet:

├─► [thought] To fetch historical stock price data for MSFT using yfinance, I will first try to import yfinance and download the data. If yfinance is not installed, I will provide instructions to install it.
├─► [reply_type] python
├─► [reply_content]
│   import yfinance as yf
│
│   # Fetch historical stock price data for MSFT
│   ticker = 'MSFT'
│   msft_data = yf.download(ticker, start='2020-01-01', end='2025-01-01')
│
│   msft_data

For the Planner, there are actually two places in the response schema where CoT can be implemented: before generating the init_plan and after the current_plan_step, as shown in the previous snippet. The former is to generate the initial plan, while the latter is to use Planner's reasoning capabilities to solve the current subtask.

The former was not implemented in TaskWeaver for two reasons: (1) to simplify the schema and save the reasoning time/prompt size; (2) the existing init_plan and plan already represent a thought process, and adding CoT may not bring significant benefits. But as the LLM becomes faster and cheaper, it is possible to add CoT to the Planner to improve the planning quality.

With reasoning for planning, the output of the Planner looks like this:

 Human   ▶  predict msft's stock price in the next 7 days
 ╭───< Planner >
 ├─► [plan_reasoning] The user wants to predict Microsoft's stock price for the next 7 days. This task involves time series forecasting, which requires historical stock price data and a forecasting model. However, the user hasn't provided any data or specified a model to use. I need to ask the user for more information.
 ├─► [init_plan]
 │   1. Request historical stock price data from the user
 │   2. Ask the user to specify a forecasting model or confirm if a default model should be used
 │   3. Load the historical stock price data <interactively depends on 1>
 │   4. Train the forecasting model using the historical data <interactively depends on 2 and 3>
 │   5. Predict the stock price for the next 7 days <sequentially depends on 4>
 │   6. Report the prediction results to the user <interactively depends on 5>

It looks good, but it is not the most important part of this article. The latter is more interesting and also more challenging to implement than it looks.

Let's use an example to motivate the necessity and explain the challenges. The task is described as follows and all the data in it are fake:

# Task Instructions  
## Step 1: Extract Employee Data  
Extract the following information from the employee summary:  
- `Employee ID`  
- `Department`  
- `Project Code`  
- `Role`  
- `Manager`  
- `Location`  

### Employee Summary  
On April 10, 2025, a review was conducted for an employee with ID `E12345`. 
The employee is part of the `Development` department and is currently assigned to project code `"X987"`. 
The role held is `"Software Engineer"` under the supervision of manager `"J. Doe."` 
The primary work location is `"Remote."`  

## Step 2: Validate the Role  
- Confirm if the role is recognized among the following: Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester.  
- Check if the role is one of the recognized roles.  

## Step 3: Construct SQL Query  
Prepare and print the SQL query using the extracted details:  
===
SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location  
FROM EmployeeRecords  
WHERE EmployeeID = '{EmployeeID}'  
AND Department = '{Department}'  
AND ProjectCode = '{ProjectCode}'  
AND Role = '{Role}'  
AND Manager = '{Manager}'  
AND Location = '{Location}'  
ORDER BY EmployeeID;  
=== 

The task has 3 steps: extract employee data, validate the role, and construct an SQL query. This would not be a hard problem for any ReAct based agents. Actually, for all the 3 steps, there is no need to call Code Interpreter. The Planner can reason through the task and generate the code directly. However, let's run it in TaskWeaver:

 ╭───< Planner >
 ├─► [plan_reasoning] The user has provided a task with specific instructions to be followed step by step. The task involves extracting employee data, validating the role, and constructing an SQL query. Each step is dependent on
 │    the completion of the previous one, so they must be executed in sequence.
 ├─► [init_plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list <sequentially depends on 1>
 │   3. Construct the SQL query using the extracted details <sequentially depends on 2>
 │   4. Report the constructed SQL query to the user <interactively depends on 3>
 ├─► [plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list.
 │   3. Construct the SQL query using the extracted details.
 │   4. Report the constructed SQL query to the user.
 ├─► [current_plan_step] 1. Extract the specified employee data from the summary provided.
 ├─► [stop] InProcess
 ├──● Please extract the following information from the employee summary: Employee ID, Department, Project Code, Role, Manager, and Location. The summary is: On April 10, 2025, a review was conducted for an employee with ID     
 │   `E12345`. The employee is part of the `Development` department and is currently assigned to project code `"X987"`. The role held is `"Software Engineer"` under the supervision of manager `"J. Doe."` The primary work        
 │   location is `"Remote."`
 ├─► [shared_memory_entry] Add the plan to the shared memory
 ╰──● sending message to CodeInterpreter
 ╭───< CodeInterpreter >
 ├─► [thought] I will extract the required information from the given employee summary using string manipulation techniques.
 ├─► [reply_type] python
 ├─► [reply_content]
 │   employee_summary = "On April 10, 2025, a review was conducted for an employee with ID `E12345`. The employee is part of the `Development` department and is currently assigned to project code \"X987\". The role held is      
 │   \"Software Engineer\" under the supervision of manager \"J. Doe.\" The primary work location is \"Remote.\""
 │
 │   # Extracting information
 │   employee_id = employee_summary.split('ID `')[1].split('`')[0]
 │   department = employee_summary.split('`')[3]
 │   project_code = employee_summary.split('project code \"')[1].split('\"')[0]
 │   role = employee_summary.split('role held is \"')[1].split('\"')[0]
 │   manager = employee_summary.split('manager \"')[1].split('\"')[0]
 │   location = employee_summary.split('location is \"')[1].split('\"')[0]
 │
 │   employee_id, department, project_code, role, manager, location
 ├─► [verification] NONE
 ├─► [code_error] No code verification is performed.
 ├─► [execution_status] SUCCESS
 ├─► [execution_result]
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   ('E12345', 'Development', 'X987', 'Software Engineer', 'J. Doe.', 'Remote.')
 ├─► [artifact_paths] []
 ├──● The following python code has been executed:
 │   ```python
 │   employee_summary = "On April 10, 2025, a review was conducted for an employee with ID `E12345`. The employee is part of the `Development` department and is currently assigned to project code \"X987\". The role held is      
 │   \"Software Engineer\" under the supervision of manager \"J. Doe.\" The primary work location is \"Remote.\""
 │
 │   # Extracting information
 │   employee_id = employee_summary.split('ID `')[1].split('`')[0]
 │   department = employee_summary.split('`')[3]
 │   project_code = employee_summary.split('project code \"')[1].split('\"')[0]
 │   role = employee_summary.split('role held is \"')[1].split('\"')[0]
 │   manager = employee_summary.split('manager \"')[1].split('\"')[0]
 │   location = employee_summary.split('location is \"')[1].split('\"')[0]
 │
 │   employee_id, department, project_code, role, manager, location
 │   ```
 │
 │
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   ('E12345', 'Development', 'X987', 'Software Engineer', 'J. Doe.', 'Remote.')
 ╰──● sending message to Planner
 ╭───< Planner >
 ├─► [plan_reasoning] The employee data has been successfully extracted from the summary. The next step is to validate the role against the list of recognized roles.
 ├─► [init_plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list <sequentially depends on 1>
 │   3. Construct the SQL query using the extracted details <sequentially depends on 2>
 │   4. Report the constructed SQL query to the user <interactively depends on 3>
 ├─► [plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list.
 │   3. Construct the SQL query using the extracted details.
 │   4. Report the constructed SQL query to the user.
 ├─► [current_plan_step] 2. Validate the role against the recognized roles list.
 ├─► [stop] InProcess
 ├──● Please validate if the role 'Software Engineer' is recognized among the following roles: Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester.
 ├─► [shared_memory_entry] Add the plan to the shared memory
 ╰──● sending message to CodeInterpreter
 ╭───< CodeInterpreter >
 ├─► [thought] I will validate if the role 'Software Engineer' is recognized among the given list of roles.
 ├─► [reply_type] python
 ├─► [reply_content]
 │   # Step 2: Validate the Role
 │   recognized_roles = ["Software Engineer", "Data Analyst", "Project Manager", "DevOps Specialist", "QA Tester"]
 │   role = "Software Engineer"
 │
 │   is_role_recognized = role in recognized_roles
 │   is_role_recognized
 ├─► [verification] NONE
 ├─► [code_error] No code verification is performed.
 ├─► [execution_status] SUCCESS
 ├─► [execution_result]
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │   True
 ├─► [artifact_paths] []
 ├──● The following python code has been executed:
 │   ```python
 │   # Step 2: Validate the Role
 │   recognized_roles = ["Software Engineer", "Data Analyst", "Project Manager", "DevOps Specialist", "QA Tester"]
 │   role = "Software Engineer"
 │
 │   is_role_recognized = role in recognized_roles
 │   is_role_recognized
 │   ```
 │
 │
 │   The execution of the generated python code above has succeeded
 │   
 │   The result of above Python code after execution is:
 │   True
 ╰──● sending message to Planner
 ╭───< Planner >
 ├─► [plan_reasoning] The role 'Software Engineer' is recognized among the list of roles. The next step is to construct the SQL query using the extracted details.
 ├─► [init_plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list <sequentially depends on 1>
 │   3. Construct the SQL query using the extracted details <sequentially depends on 2>
 │   4. Report the constructed SQL query to the user <interactively depends on 3>
 ├─► [plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list.
 │   3. Construct the SQL query using the extracted details.
 │   4. Report the constructed SQL query to the user.
 ├─► [current_plan_step] 3. Construct the SQL query using the extracted details.
 ├─► [stop] InProcess
 ├──● Please construct the SQL query using the extracted details: Employee ID 'E12345', Department 'Development', Project Code 'X987', Role 'Software Engineer', Manager 'J. Doe.', and Location 'Remote'. The SQL query should be  
 │   in the format provided in the task instructions.
 ├─► [shared_memory_entry] Add the plan to the shared memory
 ╰──● sending message to CodeInterpreter
 ╭───< CodeInterpreter >
 ├─► [thought] I will construct the SQL query using the extracted details from the previous successful round.
 ├─► [reply_type] python
 ├─► [reply_content]
 │   # Step 3: Construct SQL Query
 │   employee_id = 'E12345'
 │   department = 'Development'
 │   project_code = 'X987'
 │   role = 'Software Engineer'
 │   manager = 'J. Doe.'
 │   location = 'Remote.'
 │
 │   sql_query = f"""
 │   SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
 │   FROM EmployeeRecords
 │   WHERE EmployeeID = '{employee_id}'
 │   AND Department = '{department}'
 │   AND ProjectCode = '{project_code}'
 │   AND Role = '{role}'
 │   AND Manager = '{manager}'
 │   AND Location = '{location}'
 │   ORDER BY EmployeeID;
 │   """
 │
 │   sql_query
 ├─► [verification] NONE
 ├─► [code_error] No code verification is performed.
 ├─► [execution_status] SUCCESS
 ├─► [execution_result]
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │
 │   SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
 │   FROM EmployeeRecords
 │   WHERE EmployeeID = 'E12345'
 │   AND Department = 'Development'
 │   AND ProjectCode = 'X987'
 │   AND Role = 'Software Engineer'
 │   AND Manager = 'J. Doe.'
 │   AND Location = 'Remote.'
 │   ORDER BY EmployeeID;
 │
 ├─► [artifact_paths] []
 ├──● The following python code has been executed:
 │   ```python
 │   # Step 3: Construct SQL Query
 │   employee_id = 'E12345'
 │   department = 'Development'
 │   project_code = 'X987'
 │   role = 'Software Engineer'
 │   manager = 'J. Doe.'
 │   location = 'Remote.'
 │
 │   sql_query = f"""
 │   SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
 │   FROM EmployeeRecords
 │   WHERE EmployeeID = '{employee_id}'
 │   AND Department = '{department}'
 │   AND ProjectCode = '{project_code}'
 │   AND Role = '{role}'
 │   AND Manager = '{manager}'
 │   AND Location = '{location}'
 │   ORDER BY EmployeeID;
 │   """
 │
 │   sql_query
 │   ```
 │
 │
 │   The execution of the generated python code above has succeeded
 │
 │   The result of above Python code after execution is:
 │
 │   SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
 │   FROM EmployeeRecords
 │   WHERE EmployeeID = 'E12345'
 │   AND Department = 'Development'
 │   AND ProjectCode = 'X987'
 │   AND Role = 'Software Engineer'
 │   AND Manager = 'J. Doe.'
 │   AND Location = 'Remote.'
 │   ORDER BY EmployeeID;
 │
 ╰──● sending message to Planner
 ╭───< Planner >
 ├─► [plan_reasoning] The SQL query has been successfully constructed using the extracted employee details. The task is now complete, and the constructed SQL query can be reported to the user.
 ├─► [init_plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list <sequentially depends on 1>
 │   3. Construct the SQL query using the extracted details <sequentially depends on 2>
 │   4. Report the constructed SQL query to the user <interactively depends on 3>
 ├─► [plan]
 │   1. Extract the specified employee data from the summary provided.
 │   2. Validate the role against the recognized roles list.
 │   3. Construct the SQL query using the extracted details.
 │   4. Report the constructed SQL query to the user.
 ├─► [current_plan_step] 4. Report the constructed SQL query to the user.
 ├─► [stop] Completed
 ├──● The SQL query has been successfully constructed using the extracted details:
 │
 │   ```
 │   SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
 │   FROM EmployeeRecords
 │   WHERE EmployeeID = 'E12345'
 │   AND Department = 'Development'
 │   AND ProjectCode = 'X987'
 │   AND Role = 'Software Engineer'
 │   AND Manager = 'J. Doe.'
 │   AND Location = 'Remote.'
 │   ORDER BY EmployeeID;
 │   ```
 ├─► [shared_memory_entry] Add the plan to the shared memory
 ╰──● sending message to User
 TaskWeaver ▶  The SQL query has been successfully constructed using the extracted details:

SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
FROM EmployeeRecords
WHERE EmployeeID = 'E12345'
AND Department = 'Development'
AND ProjectCode = 'X987'
AND Role = 'Software Engineer'
AND Manager = 'J. Doe.'
AND Location = 'Remote.'
ORDER BY EmployeeID;

The log is quite lengthy, but it is quite clear and easy to understand. For each step, the Planner always asks the Code Interpreter to generate the code. Finally, the task is completed successfully. However, it is not the most efficient way to solve the task because the Planner can leverage its reasoning capabilities to extract information and construct the SQL query directly.

We have tried to add instructions in the Planner's prompt to emphasize the reasoning power, but the Planner still calls the Code Interpreter to generate the code. For a long time, we don't quite understand the reason behind this behavior. After many discussions and experiments, we finally realized that it is caused by the design of the Planner's response schema. The Planner needs to talk to another Role in the system each time. But if we want the Planner to use reasoning to solve the current subtask, who should the Planner talk to? Ideally, it should continue until the subtask that requires the Code Interpreter to generate the code. But how can we design the schema to support this? Because the Planner is explicitly tracking the progress, i.e., the current_plan_step, it is confusing to let the Planner continue the reasoning process through future steps.

We value the progress tracking more than the efficiency of the reasoning process, because this is critical for complex multi-step tasks. But we also want to improve the reasoning process to make the agent more intelligent. So, our choice is to add a special Role called Recepta to listen to the Planner's reasoning process when the Planner wants to use reasoning to solve the current subtask. The Recepta always answers "Got it. Please go ahead!" to the Planner. The Planner can then continue the reasoning process until the subtask that requires the Code Interpreter to generate the code. The Recepta is not a real Role in the system, but a design pattern to support the reasoning process in the Planner.

Now, let's look at the output of TaskWeaver with Recepta:

╭───< Planner >
├─► [plan_reasoning] The user has provided a set of instructions to follow step by step. The task involves extracting employee data from a summary, validating the role, and constructing an SQL query. I will follow these steps sequentially.
├─► [init_plan]
│   1. Extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location`
│   2. Validate the role <interactively depends on 1>
│   3. Construct and print the SQL query using the extracted details <interactively depends on 1>
│   4. Check the execution result and report the result to the user <interactively depends on 1 and 3>
├─► [plan]
│   1. Extract information from the employee summary
│   2. Validate the role
│   3. Construct SQL query using the extracted details
│   4. Report the result to the user
├─► [current_plan_step] 1. Extract information from the employee summary
├─► [stop] InProcess
├──● Thought: I can read and comprehend the employee summary to extract the required information.
│   Result: the `Employee ID` is `E12345`, the `Department` is `Development`, the `Project Code` is `X987`, the `Role` is `Software Engineer`, the `Manager` is `J. Doe`, and the `Location` is `Remote`.
├─► [shared_memory_entry] Add the plan to the shared memory
╰──● sending message to Recepta
╭───< Recepta >
├──● Got it. Now, please go ahead!
╰──● sending message to Planner
╭───< Planner >
├─► [plan_reasoning] I have extracted the information from the employee summary. The next step is to verify if the role is one of the recognized roles.
├─► [init_plan]
│   1. Extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location`
│   2. Validate the role <interactively depends on 1>
│   3. Construct and print the SQL query using the extracted details <interactively depends on 1>
│   4. Check the execution result and report the result to the user <interactively depends on 1 and 3>
├─► [plan]
│   1. Extract information from the employee summary
│   2. Validate the role
│   3. Construct SQL query using the extracted details
│   4. Report the result to the user
├─► [current_plan_step] 2. Validate the role
├─► [stop] InProcess
├──● Thought: I can verify if the role is one of the recognized roles with my pattern matching capability.
│   Thought: The recognized roles are `Software Engineer, Data Analyst, Project Manager, DevOps Specialist, QA Tester`.
│   Result: The role is `Software Engineer`, which is recognized.
├─► [shared_memory_entry] Add the plan to the shared memory
╰──● sending message to Recepta
╭───< Recepta >
├──● Got it. Now, please go ahead!
╰──● sending message to Planner
╭───< Planner >
├─► [plan_reasoning] I can prepare and print the SQL query using the extracted details to the User.
├─► [init_plan]
│   1. Extract the following information from the employee summary: `Employee ID`, `Department`, `Project Code`, `Role`, `Manager`, `Location`
│   2. Validate the role <interactively depends on 1>
│   3. Construct and print the SQL query using the extracted details <interactively depends on 1>
│   4. Check the execution result and report the result to the user <interactively depends on 1 and 3>
├─► [plan]
│   1. Extract information from the employee summary
│   2. Validate the role
│   3. Construct SQL query using the extracted details
│   4. Report the result to the user
├─► [current_plan_step] 3. Construct and print the SQL query using the extracted details and report the result to the user
├─► [stop] Completed
├──● The role is `Software Engineer`, which is recognized.
│
│   Here is the SQL query using the extracted details:
│   ```sql
│   SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
│   FROM EmployeeRecords
│   WHERE EmployeeID = 'E12345'
│   AND Department = 'Development'
│   AND ProjectCode = 'X987'
│   AND Role = 'Software Engineer'
│   AND Manager = 'J. Doe'
│   AND Location = 'Remote'
│   ORDER BY EmployeeID;
│   ```
├─► [shared_memory_entry] Add the plan to the shared memory
╰──● sending message to User
 TaskWeaver ▶  The role is `Software Engineer`, which is recognized.

Here is the SQL query using the extracted details:

SELECT EmployeeID, Department, ProjectCode, Role, Manager, Location
FROM EmployeeRecords
WHERE EmployeeID = 'E12345'
AND Department = 'Development'
AND ProjectCode = 'X987'
AND Role = 'Software Engineer'
AND Manager = 'J. Doe'
AND Location = 'Remote'
ORDER BY EmployeeID;

The new log is not only shorter but also more efficient. It is also more reliable according to our observations. Writing code with regex to extract information from text is error-prone.

In summary, we have experimented the idea of adding a special Role called Recepta to support the reasoning process in the Planner. The Recepta is not a real Role in the system, but a design pattern to support the reasoning process in the Planner. This design pattern is simple and effective. It is also flexible and can be extended to support more complex reasoning processes in the Planner. We believe that this design pattern can be useful for building more intelligent agents in the future.

To enable it in TaskWeaver, the user needs to configure the roles in the configure file with the following content:

"session.roles": [
    "planner",
    "code_interpreter",
    "recepta"
]

The recepta role is added to the list of roles.