How autonomous is too autonomous

Autonomy in agents is a slider, not a switch, and the right setting depends on the task more than the technology. The instinct to push the slider toward “fully autonomous” is real because it makes the demo look magical. The cost shows up later, in the support queue, when an agent has run an irreversible action under uncertainty and nobody was there to catch it.

The settings worth distinguishing

At the assistive end, the agent suggests, the human commits. At the supervised end, the agent acts but every consequential action requires confirmation. At the bounded-autonomous end, the agent acts within a sandbox — limited tool surface, limited scope, fully reversible. At the fully autonomous end, the agent runs without supervision and the burden of reliability is entirely on the system. Most production agents should live at “bounded autonomous” and most teams accidentally drift past it.
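The four settings can be expressed as a policy gate that decides whether a human must approve before an action executes. A minimal sketch in Python, where `Action`, `SANDBOX_TOOLS`, and the field names are illustrative assumptions, not any particular framework's API:

```python
from enum import Enum, auto


class Autonomy(Enum):
    ASSISTIVE = auto()   # agent suggests, human commits
    SUPERVISED = auto()  # agent acts, consequential actions need confirmation
    BOUNDED = auto()     # agent acts freely inside a reversible sandbox
    FULL = auto()        # agent acts without supervision


class Action:
    """Hypothetical action descriptor: which tool it uses, whether it can
    be undone, and whether it has consequences outside the agent."""
    def __init__(self, tool: str, reversible: bool, consequential: bool):
        self.tool = tool
        self.reversible = reversible
        self.consequential = consequential


# Example of a limited tool surface for the bounded setting.
SANDBOX_TOOLS = {"search", "draft_reply"}


def requires_human(level: Autonomy, action: Action) -> bool:
    """Return True if a human must approve before the agent may execute."""
    if level is Autonomy.ASSISTIVE:
        return True  # human commits everything
    if level is Autonomy.SUPERVISED:
        return action.consequential  # confirm consequential actions only
    if level is Autonomy.BOUNDED:
        # Outside the sandbox, or irreversible: escalate rather than act.
        return action.tool not in SANDBOX_TOOLS or not action.reversible
    return False  # FULL: reliability burden is entirely on the system
```

The useful property of a gate like this is that the autonomy level is one explicit value in one place, so "drifting past bounded" requires changing a line someone can review, not just adding a tool.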

The drift you don’t notice

The drift toward more autonomy happens one tool at a time. A new capability is added because a user asked for it; a confirmation step is removed because users complained about friction; a permission is broadened because edge cases didn’t fit the original boundary. Each change is reasonable; the cumulative drift produces an agent more autonomous than anyone designed.

The right level of autonomy is the one where your worst-case incident is still recoverable. If you cannot answer “what’s the worst-case incident,” your agent is probably more autonomous than it should be.
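One way to make that question answerable is a preflight audit over the agent's tool list: any tool whose worst case is neither reversible nor human-gated is an unrecoverable incident waiting to happen. A minimal sketch, assuming a hypothetical tool registry with illustrative `reversible` and `gated` fields:

```python
def unrecoverable_tools(tools: dict) -> list:
    """Return names of tools whose worst case is neither reversible
    nor behind a human confirmation gate.

    tools: mapping of tool name -> {"reversible": bool, "gated": bool}
    (field names are illustrative, not from any particular framework).
    """
    return sorted(
        name for name, t in tools.items()
        if not t["reversible"] and not t["gated"]
    )


# Example registry for an agent under review.
agent_tools = {
    "search_docs":    {"reversible": True,  "gated": False},
    "send_email":     {"reversible": False, "gated": True},
    "delete_records": {"reversible": False, "gated": False},
}

print(unrecoverable_tools(agent_tools))
```

Here the audit flags `delete_records`: it is the one tool that could produce an incident nobody can roll back and nobody had to approve. An empty result is a reasonable, if rough, operational definition of "the worst-case incident is still recoverable."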
