ReAct in production: reasoning that survives sidetracks

ReAct is a clean idea: think, act, observe, repeat. In production, the loop is the part that breaks. The model thinks reasonably for the first few steps, then either over-explains, gets stuck second-guessing a tool call, or convinces itself the task is already done. The textbook diagram doesn’t show any of this.

Where the loop drifts

The most common failure isn’t a wrong tool call — it’s a redundant one. The model retries an action that already succeeded because it lost track of state in the conversation history. Step counts climb, latency climbs with them, and the user sees a slow response that’s secretly a chain of identical lookups. The next most common failure is premature termination: the model declares success on a partial result because it pattern-matched to “I have an answer” rather than “I have the right answer.”
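One way to see how much of a trace is redundant is to count calls that repeat an action which already succeeded. The sketch below assumes a hypothetical trace format (a list of dicts with `tool`, `args`, and `ok` keys); a real harness will log something different, so treat the field names as placeholders.

```python
import json
from collections import Counter


def redundant_calls(trace):
    """Count calls that repeat a (tool, args) pair after it already succeeded.

    `trace` is an assumed format: a list of dicts with "tool", "args", "ok".
    Retries after a failed call are not counted as redundant.
    """
    succeeded = set()
    repeats = Counter()
    for step in trace:
        # Canonicalize args so key order does not hide a duplicate.
        key = (step["tool"], json.dumps(step["args"], sort_keys=True))
        if key in succeeded:
            repeats[key] += 1
        elif step["ok"]:
            succeeded.add(key)
    return dict(repeats)


trace = [
    {"tool": "lookup", "args": {"id": 7}, "ok": True},
    {"tool": "lookup", "args": {"id": 7}, "ok": True},   # redundant
    {"tool": "search", "args": {"q": "x"}, "ok": False},
    {"tool": "search", "args": {"q": "x"}, "ok": True},  # legitimate retry
    {"tool": "lookup", "args": {"id": 7}, "ok": True},   # redundant
]
print(redundant_calls(trace))
```

Run over logged traces, a count like this separates slow-because-hard from slow-because-looping, which is hard to tell apart from latency numbers alone.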

Patches that keep it on the rails

A short scratchpad that summarizes prior actions (outcomes only, not the full history) cuts redundant calls more than any prompt rewrite. A separate “are we done?” check, run by the same model on a different prompt, catches premature termination far better than letting the loop judge itself. Cap the loop. Always cap the loop: a hard step limit turns a runaway chain into a bounded failure.
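The three patches fit in one small loop. What follows is a minimal sketch, not a reference implementation: `propose`, `is_done`, and `tools` are hypothetical callables standing in for the model call, the separate done-check prompt, and the tool registry, and `max_steps=8` is an arbitrary default.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    action: Optional[str]  # tool name, or None when the model proposes a final answer
    arg: str = ""
    answer: str = ""


@dataclass
class Result:
    answer: Optional[str]
    steps: int
    stopped_by_cap: bool
    scratchpad: List[str] = field(default_factory=list)


def run_loop(
    task: str,
    propose: Callable[[str, List[str]], Step],     # assumed: wraps the model call
    tools: Dict[str, Callable[[str], str]],        # assumed: tool name -> function
    is_done: Callable[[str, List[str], str], bool],  # assumed: separate done-check prompt
    max_steps: int = 8,
) -> Result:
    scratchpad: List[str] = []  # outcomes only, not the full history
    for step_no in range(1, max_steps + 1):
        step = propose(task, scratchpad)
        if step.action is None:
            # The loop does not judge itself; a separate check decides.
            if is_done(task, scratchpad, step.answer):
                return Result(step.answer, step_no, False, scratchpad)
            scratchpad.append(f"rejected answer: {step.answer}")
            continue
        observation = tools[step.action](step.arg)
        scratchpad.append(f"{step.action}({step.arg}) -> {observation}")
    # Hard cap reached: fail in a bounded way instead of looping on.
    return Result(None, max_steps, True, scratchpad)


# Usage with stand-in functions instead of a real model.
def fake_propose(task, pad):
    if not pad:
        return Step("lookup", "order-42")
    return Step(None, answer="shipped")


demo = run_loop(
    "Where is order 42?",
    fake_propose,
    {"lookup": lambda arg: "shipped"},
    lambda task, pad, answer: any(answer in line for line in pad),
)
print(demo.answer, demo.steps, demo.stopped_by_cap)
```

The model only ever sees the scratchpad, so a lookup that already succeeded stays visible as a one-line outcome. A rejected answer is written back to the scratchpad too, so the next proposal knows the previous one did not pass.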

ReAct works in production. The published version of the prompt is rarely the version that ships.
