Single-tool agents are straightforward. The difficulty escalates when a task requires five or more sequential tool calls where each step depends on the last, the model may choose a wrong path, and recovery requires understanding why the previous step failed. This lesson builds progressively more capable multi-step agents.
ReAct from Scratch
The ReAct loop encodes reasoning and acting in alternating text blocks. Implementing it explicitly (rather than relying on a framework) gives you full control:
```python
import json
import re

# llm_call(system, user) and execute_tool(name, args) are assumed to be
# defined elsewhere; they wrap your model API and tool registry.

SYSTEM_PROMPT = """You are a research assistant. For each task:
1. Write a Thought explaining your reasoning.
2. Write an Action in the format: Action: tool_name({"arg": "value"})
3. Wait for the Observation.
4. Repeat until you have a Final Answer.
5. Write: Final Answer: <your answer>

Available tools: search_web, read_url, calculator, write_file"""

def parse_action(text: str) -> tuple[str, dict] | None:
    match = re.search(r"Action:\s*(\w+)\((\{.*?\})\)", text, re.DOTALL)
    if not match:
        return None
    tool_name = match.group(1)
    args = json.loads(match.group(2))
    return tool_name, args

def react_agent(task: str, max_steps: int = 8) -> str:
    history = f"Task: {task}\n"
    for step in range(max_steps):
        response = llm_call(system=SYSTEM_PROMPT, user=history)
        history += response + "\n"
        if "Final Answer:" in response:
            return response.split("Final Answer:")[-1].strip()
        action = parse_action(response)
        if not action:
            history += "Observation: No valid action found. Try again.\n"
            continue
        tool_name, args = action
        result = execute_tool(tool_name, args)
        history += f"Observation: {result}\n"
    return "Reached step limit without a final answer."
```
Reflection Step
Adding a reflection step after tool failures dramatically reduces stuck loops:
```python
import json

REFLECT_PROMPT = """The previous action failed. Analyse what went wrong and suggest a correction.
Failed action: {action}
Error: {error}
Task context: {context}

What should the agent do differently?"""

def reflect_on_error(action: dict, error: str, context: str) -> str:
    return llm_call(
        system="You are a debugging assistant for AI agents.",
        user=REFLECT_PROMPT.format(
            action=json.dumps(action),
            error=error,
            context=context,
        ),
    )
```
Insert the reflection output back into the agent's history as a synthetic "Observation: [reflection]" so it informs the next action.
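Concretely, the wiring can be a one-line helper that formats the reflection as an observation and appends it to the history (the `[reflection]` tag is an assumption; any marker the model can distinguish from a real tool observation works):

```python
def inject_reflection(history: str, reflection: str) -> str:
    # Append the reflection as a synthetic observation so the next
    # llm_call sees the corrective reasoning in context.
    return history + f"Observation: [reflection] {reflection}\n"

history = 'Task: fetch the report\nAction: read_url({"url": "http://example.com/x"})\n'
history = inject_reflection(history, "The URL returned 404; try search_web first.")
```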
Planner-Executor Pattern
For long tasks (10+ steps), decompose planning from execution. The planner generates a step-by-step plan; the executor runs each step:
```python
PLANNER_PROMPT = """Break this task into a numbered list of specific, executable steps.
Each step must be independently actionable. Output ONLY the numbered list.

Task: {task}"""

EXECUTOR_PROMPT = """Execute this single step as part of a larger task.
Overall task: {overall_task}
Completed steps: {completed}
Current step: {current_step}

Use the available tools to complete this step, then output: Step Result: <result>"""

def planner_executor_agent(task: str) -> list[str]:
    # Phase 1: plan
    plan_text = llm_call(system="You are a task planner.",
                         user=PLANNER_PROMPT.format(task=task))
    # Keep lines that start with a digit and strip the "N. " prefix.
    # The '". " in line' guard avoids an IndexError on malformed lines.
    steps = [
        line.split(". ", 1)[1]
        for line in plan_text.strip().split("\n")
        if line and line[0].isdigit() and ". " in line
    ]
    completed_steps = []
    # Phase 2: execute each step
    for step in steps:
        result = react_agent(
            task=EXECUTOR_PROMPT.format(
                overall_task=task,
                completed="\n".join(completed_steps) or "None",
                current_step=step,
            ),
            max_steps=4,
        )
        completed_steps.append(f"{step} → {result}")
    return completed_steps
```
Comparison of Agent Architectures
| Architecture | Steps | Error recovery | Cost | Best for |
|---|---|---|---|---|
| Single-loop ReAct | Unlimited | Low (no backtracking) | Low | Simple tool tasks |
| ReAct + reflection | Unlimited | Medium (local correction) | Medium | Tasks with unreliable tools |
| Planner-executor | Fixed plan | Low (re-plan if needed) | Medium | Long, structured tasks |
| Multi-agent | Per-agent | High (peer review) | High | Complex research or coding |
Summary
Implement ReAct explicitly when you need full control over the agent loop, stopping conditions, and history management.
Add a reflection step that fires on tool errors and injects corrective reasoning back into the conversation history.
Use the planner-executor pattern to decompose tasks with 10+ steps: planning and execution require different LLM behaviours.
Set hard step limits on every agent loop — runaway agents are both expensive and difficult to debug after the fact.
Log the full reasoning trace (Thoughts, Actions, Observations) for every production run; it is your only debugging tool when an agent fails silently.
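The trace logging in the last point can be a minimal sketch: append each Thought/Action/Observation event as one JSON object per line, assuming a local JSONL file is acceptable (the names `log_event` and `trace.jsonl` are illustrative):

```python
import json
import time

def log_event(path: str, run_id: str, kind: str, content: str) -> None:
    # One JSON object per line (JSONL) keeps the trace greppable and
    # easy to replay step by step when debugging a failed run.
    event = {"run_id": run_id, "ts": time.time(), "kind": kind, "content": content}
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")

log_event("trace.jsonl", "run-001", "thought", "Need to search the web first.")
log_event("trace.jsonl", "run-001", "action", 'search_web({"query": "..."})')
log_event("trace.jsonl", "run-001", "observation", "3 results returned.")
```

Tagging every event with a `run_id` lets you interleave many concurrent agent runs in one file and still reconstruct each trace with a single filter.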