How Agents Plan Tasks with To-Do Lists


Planning is something we all do naturally and regularly. In our personal lives, we often keep to-do lists to organise holidays, errands, and everything in between.

At work, we rely on task trackers and project plans to keep teams aligned. For developers, it is also common to leave TODO comments in the code as reminders for future changes.

Unsurprisingly, LLM agents also benefit from clear to-do lists to guide their planning.

To-do lists help agents plan and track complex tasks more effectively, making them especially useful for multi-tool coordination and long-running operations where progress needs to be visible.

Coding agents like OpenAI Codex, Cline, and Claude Code (which I use regularly) are prime examples of this concept.

They break complex requests into an initial set of steps, organize them as a to-do list with dependencies, and update the plan in real time as tasks are completed or new information arises.

Example of a plan generated by Claude Code agent (Claude-4.5-sonnet) on the right-hand pane of Cursor | Image by author

This clarity enables agents to handle long sequences of actions, coordinate across different tools, and track progress in an understandable manner.

In this article, we dive into how agents use to-do lists, analyze the underlying components of the planning process, and demonstrate how it is implemented with LangChain.

(1) Example Scenario for Planning Agent
(2) Key Components of To-Do Capabilities
(3) Putting it together in a Middleware

The accompanying code is available in this GitHub repo.

(1) Example Scenario for Planning Agent

Let’s look at an example scenario to anchor our walkthrough.

We will set up a single agent to plan a travel itinerary and execute the booking tasks. The agent has access to a small set of travel tools.

In this example, these tools are mocked and do not perform real bookings; they are included to illustrate the agent’s planning logic and how it uses to-do lists.

Here is the code to implement our planning agent in LangChain:
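A minimal sketch of such a setup, assuming LangChain v1's `create_agent` and `TodoListMiddleware` imports and an OpenAI API key in the environment. The tool names (`search_flights`, `book_hotel`), their return strings, the system prompt wording, and the `"openai:gpt-5.1"` model identifier are illustrative assumptions:

```python
def search_flights(origin: str, destination: str) -> str:
    """Search for flights between two cities (mocked, no real lookup)."""
    return f"Mock result: 3 flights found from {origin} to {destination}"


def book_hotel(city: str, nights: int) -> str:
    """Book a hotel stay (mocked, no real booking is made)."""
    return f"Mock booking: hotel in {city} for {nights} night(s) confirmed"


def build_planning_agent():
    """Create the planning agent; needs `langchain` installed and an OpenAI key."""
    # Imported lazily so the mocked tools above can be tried without LangChain.
    from langchain.agents import create_agent
    from langchain.agents.middleware import TodoListMiddleware

    return create_agent(
        model="openai:gpt-5.1",  # assumed identifier for GPT-5.1
        tools=[search_flights, book_hotel],  # hypothetical mocked tools
        system_prompt="You are a travel assistant. Plan the trip, then book it.",
        middleware=[TodoListMiddleware()],  # adds the to-do planning capability
    )
```

The mocked tools are ordinary functions, so they can be checked on their own before any model is involved.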

We input a user query and view the to-do list resulting from the agent’s planning:

To-do list generated by the travel planning agent
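Viewing the plan can be sketched as follows, assuming the agent's result is a dict-like state with a `todos` key (for a real agent, it would come from something like `agent.invoke({"messages": [...]})`). The `format_todos` helper, its checkbox markers, and the example result are hypothetical and only illustrate the shape of the data:

```python
def format_todos(result: dict) -> str:
    """Render the `todos` list from an agent result as a checklist."""
    markers = {"pending": "[ ]", "in_progress": "[~]", "completed": "[x]"}
    lines = [
        f"{markers.get(todo['status'], '[?]')} {todo['content']}"
        for todo in result.get("todos", [])
    ]
    return "\n".join(lines)


# Hypothetical result, shaped like the state an agent run might return.
example_result = {
    "todos": [
        {"content": "Compare flight options from Singapore to Tokyo", "status": "completed"},
        {"content": "Book flights and hotels", "status": "pending"},
    ]
}
print(format_todos(example_result))
```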

The use of structured note-taking via to-do lists enables agents to maintain persistent memory outside the context window. This strategy improves an agent’s ability to manage and retain relevant context over time.

The code setup is straightforward: create_agent creates the LLM agent instance, and we pass in the system prompt, select the model (GPT-5.1), and link the tools.

What is noteworthy is the TodoListMiddleware() object that is passed into the middleware parameter.

Firstly, what is LangChain’s middleware?

As the name suggests, it is a middle layer that lets you run custom code before and after LLM calls.

Think of middleware as a programmable layer that allows us to inject code to monitor, adjust, or extend the agent’s behavior.

It gives us control and visibility over agents’ behaviors without changing their core logic. It can be used to transform prompts and outputs, manage retries or early exits, and apply safeguards (e.g., guardrails, PII checks).
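The idea of running code before and after a model call can be sketched in plain Python. This is a conceptual illustration only, not LangChain's middleware API; `with_logging`, `fake_model`, and the log format are hypothetical names chosen for the example:

```python
from typing import Callable

ModelCall = Callable[[str], str]


def with_logging(model_call: ModelCall, log: list[str]) -> ModelCall:
    """Wrap a model call so that custom code runs before and after it."""

    def wrapped(prompt: str) -> str:
        log.append(f"before: {prompt}")    # e.g. inspect or rewrite the prompt
        response = model_call(prompt)      # the unchanged core call
        log.append(f"after: {response}")   # e.g. validate or redact the output
        return response

    return wrapped


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call; it simply upper-cases the prompt."""
    return prompt.upper()


events: list[str] = []
guarded_model = with_logging(fake_model, events)
answer = guarded_model("plan my trip")
```

The wrapped function has the same signature as the original, which is why the core logic does not need to change.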

TodoListMiddleware is a built-in middleware that specifically provides to-do list management capabilities to agents. Next, we explore how the TodoListMiddleware works under the hood.

(2) Key Components of To-Do Capabilities

A planning agent’s to-do list management capabilities boil down to these four key components:

To-do task item
List of to-do items
A tool that writes and updates the to-do list
To-do system prompt update

The TodoListMiddleware brings these elements together to enable an agent’s to-do list capabilities.

Let’s take a closer look at each component and how it is implemented in the to-do middleware code.

(2.1) To-do task item

A to-do item is the smallest unit in a to-do list, representing a single task. It is represented by two fields: task description and current status.

In LangChain, this is modeled as a Todo type, defined using TypedDict:
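A sketch of such a type, based on the two fields and three status values described in this section; the docstring and the sample item are added for illustration:

```python
from typing import Literal, TypedDict


class Todo(TypedDict):
    """A single to-do item: what to do, and how far along it is."""

    content: str
    status: Literal["pending", "in_progress", "completed"]


# A sample item that satisfies the type.
item: Todo = {"content": "Compare flight options", "status": "pending"}
```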

The content field holds the description of the task that the agent needs to do, while the status field tracks whether the task has not been started (pending), is being worked on (in_progress), or is finished (completed).

Here is an example of a to-do item:

{
   "content": "Compare flight options from Singapore to Tokyo",
   "status": "completed"
}

(2.2) List of to-do items

Now that we’ve defined the structure of a Todo item, we explore how a set of to-do items is stored and tracked as part of the overall plan.

We define a State object (PlanningState) to capture the plan as a list of to-do items, which will be updated as tasks are performed:
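The shape of this state can be sketched with the standard library alone. In this sketch, `OMIT_FROM_INPUT` is a hypothetical stand-in for LangChain's OmitFromInput marker, and the plain `TypedDict` base with a simplified `messages` field is an assumption in place of LangChain's own agent state class:

```python
from typing import Annotated, Literal, TypedDict

try:
    from typing import NotRequired  # available from Python 3.11
except ImportError:  # older interpreters
    from typing_extensions import NotRequired


class Todo(TypedDict):
    content: str
    status: Literal["pending", "in_progress", "completed"]


# Hypothetical stand-in for LangChain's OmitFromInput marker.
OMIT_FROM_INPUT = "omit_from_input"


class PlanningState(TypedDict):
    """Agent state carrying the conversation and, optionally, the plan."""

    messages: list  # conversation history (simplified for this sketch)
    todos: Annotated[NotRequired[list[Todo]], OMIT_FROM_INPUT]


# A fresh state has no plan yet; the todos key is added once the agent plans.
state: PlanningState = {"messages": []}
state["todos"] = [{"content": "Compare flight options", "status": "pending"}]
```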

The todos field is optional (NotRequired), meaning it may be absent when the state is first initialized (i.e., the agent may not yet have any tasks in its plan).

OmitFromInput means that todos is managed internally by the middleware and should not be provided directly as user input.

State is the agent’s short-term memory, capturing recent conversations and key information so it can act appropriately based on prior context and information.

In this case, the to-do list remains within the state for the agent to reference and update tasks consistently throughout the session.

Here is an example of a to-do list:

todos: list[Todo] = [
    {
        "content": "Research visa requirements for Singapore passport holders visiting Japan",
        "status": "completed"
    },
    {
        "content": "Compare flight options from Singapore to Tokyo",
        "status": "in_progress"
    },
    {
        "content": "Book flights and hotels once itinerary is finalized",
        "status": "pending"
    }
]

(2.3) Tool to write and update to-dos

With the basic structure of the to-do list established, we now need a tool for the LLM agent to manage and update it as tasks get executed.

Here is the standard way to define our tool (write_todos):
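The shape of the tool can be sketched in plain Python. In this sketch, the returned dict is a stand-in for the Command object, and the message wording and the `tool_call_id` parameter are assumptions made for illustration:

```python
from typing import Any, Literal, TypedDict


class Todo(TypedDict):
    content: str
    status: Literal["pending", "in_progress", "completed"]


def write_todos(todos: list[Todo], tool_call_id: str) -> dict[str, Any]:
    """Replace the to-do list and record the change as a tool message.

    The returned dict is a plain-Python stand-in for a Command update.
    """
    return {
        "update": {
            "todos": todos,  # the new list replaces the old one wholesale
            "messages": [
                {
                    "role": "tool",
                    "content": f"Updated todo list to {todos}",
                    "tool_call_id": tool_call_id,
                }
            ],
        }
    }


command = write_todos(
    [{"content": "Compare flight options", "status": "in_progress"}],
    tool_call_id="call_1",
)
```

Note that the tool takes the full list each time rather than a single item, so the agent rewrites the whole plan whenever a status changes.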

The write_todos tool returns a Command that instructs the agent to update its to-do list and append a message recording the change.

While the write_todos structure is straightforward, the magic lies in the description (WRITE_TODOS_TOOL_DESCRIPTION) of the tool.

When the agent decides whether and how to call the tool, the tool description serves as a critical additional prompt, guiding the agent on how to use the tool correctly and what inputs to provide.

Here is LangChain’s (pretty lengthy) expression of the tool description:

We can see that the description is highly structured and precise, defining when and how to use the tool, task states, and management rules.

It also provides clear guidelines for tracking complex tasks, breaking them into clear steps, and updating them systematically.

Feel free to check out Deepagents’ more succinct interpretation of a to-do tool description.

(2.4) System prompt update

The final element of setting up the to-do capability is updating the agent’s system prompt.

It is done by injecting WRITE_TODOS_SYSTEM_PROMPT into the main prompt, explicitly informing the agent that the write_todos tool exists.

It guides the agent on when and why to use the tool, provides context for complex, multi-step tasks, and outlines best practices for maintaining and updating the to-do list:

(3) Putting it together in a Middleware

Finally, all four key components come together in a single class called TodoListMiddleware, which packages the to-do capabilities into a cohesive flow for the agent:

Define PlanningState to track tasks as part of a to-do list
Dynamically create write_todos tool for updating the list and making it accessible to the LLM
Inject WRITE_TODOS_SYSTEM_PROMPT to guide the agent’s planning and reasoning

The WRITE_TODOS_SYSTEM_PROMPT is injected through the middleware’s wrap_model_call (and awrap_model_call) methods, which append it to the agent’s system message for every model call, as shown below:
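A simplified sketch of this injection step, using only the standard library. The `ModelRequest` dataclass, the `TodoPromptMiddleware` class name, the `echo_handler` function, and the one-line prompt text are hypothetical stand-ins; only the idea of appending the to-do prompt to the system message before delegating to the handler is taken from the description above:

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional

# Placeholder text for this sketch; the real prompt is considerably longer.
WRITE_TODOS_SYSTEM_PROMPT = "Use the write_todos tool to plan complex, multi-step tasks."


@dataclass(frozen=True)
class ModelRequest:
    """Simplified stand-in for the request object handed to the middleware."""

    system_prompt: Optional[str]
    user_message: str


class TodoPromptMiddleware:
    """Appends the to-do instructions to the system prompt on every model call."""

    def wrap_model_call(
        self, request: ModelRequest, handler: Callable[[ModelRequest], str]
    ) -> str:
        if request.system_prompt:
            new_prompt = f"{request.system_prompt}\n\n{WRITE_TODOS_SYSTEM_PROMPT}"
        else:
            new_prompt = WRITE_TODOS_SYSTEM_PROMPT
        # Pass a modified copy of the request on to the actual model call.
        return handler(replace(request, system_prompt=new_prompt))


def echo_handler(request: ModelRequest) -> str:
    """Stand-in for the model call; returns the system prompt it received."""
    return request.system_prompt or ""


seen_prompt = TodoPromptMiddleware().wrap_model_call(
    ModelRequest("You are a travel assistant.", "Plan a trip to Tokyo"),
    echo_handler,
)
```

Because the prompt is appended at call time, the agent's own system prompt stays untouched in its configuration.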

Wrapping it up

Just like humans, agents use to-do lists to break down complex problems, stay organized, and adapt in real time, enabling them to solve problems more effectively and accurately.

Through LangChain’s middleware implementation, we also gain deeper insight into how planned tasks can be structured, tracked, and executed by agents.

Check out this GitHub repo for the code implementation.