AI that drafts is an assistant. AI that can take actions is automation, meaning it can change systems, not just suggest.
Send. Update. Approve. Schedule. That is where an error stops being a bad answer and becomes a real change in the world.
Automation has a predictable weakness: it repeats. A wrong decision once is noise. The same wrong decision applied to 5,000 tickets becomes an incident.
“Good enough” is rarely a single accuracy number. In safety-, financial-, or reputation-sensitive workflows, what matters is consequence. EngX makes that point plainly in *Exploring the minds of machines: from handwritten digits to thinking in language*. A 97% success rate still implies a 3% error rate, and whether that is acceptable depends on what happens when it is wrong.
Action-taking AI forces the same test. Not only “is it usually right?” but “how expensive or harmful is wrong, and how quickly can wrong repeat?”
I work through it in this order:
- decide what the system is allowed to do,
- limit how much it can change and how quickly it can act,
- add checks and human judgment where it matters, and
- make recovery routine rather than heroic.
Advice versus action
Before talking about controls, draw one line. Advice is easy to review after the fact. Actions are harder to unwind.
The same model can be fine for drafting and unacceptable for acting. Separate “advice” from “action” explicitly.
| Advice (e.g., lower consequence) | Action (e.g., real consequence) |
|---|---|
| summarise | send |
| draft | approve |
| recommend | update |
| | trigger |
| | buy |
| | delete |
| | deploy |
These are examples, not a complete list.
“Draft an email” and “send an email” are different risks. So are “suggest a change” and “apply a change.”
A simple ladder makes the boundary operational:
- Suggest
- Draft
- Route (prepare the change request and send it to the right place for approval, either a person or a workflow step, without executing it)
- Execute
As you move up the ladder, you add permissions, limits, checks, human approval, and recovery. “Execute” is where small mistakes stop being small.
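The ladder can be made operational in code. A minimal sketch, assuming a flat model in which each rung subsumes the ones below it (the `Autonomy` type and its names are illustrative):

```python
from enum import IntEnum

class Autonomy(IntEnum):
    """The four rungs of the ladder, lowest to highest."""
    SUGGEST = 1
    DRAFT = 2
    ROUTE = 3
    EXECUTE = 4

def allowed(requested: Autonomy, granted: Autonomy) -> bool:
    """A request is permitted only if the system was granted
    at least that rung of autonomy."""
    return requested <= granted

# A drafting-only agent can suggest and draft, but never execute.
```

The point of the ordering is that moving a system up one rung is an explicit decision, made together with the extra permissions, limits, checks, and recovery that rung requires.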
Security teams describe the same failure mode in their own language: giving a system too much freedom to take actions without enough constraint. The Open Worldwide Application Security Project Top 10 for Large Language Model Applications calls this risk “Excessive Agency.”
Four non-negotiables
If AI is allowed to take actions, start with four basics before anything is scaled up:
Limit access. Limit scope (how much it can change). Limit speed. And have a safe mode.
If those four are missing, the rest tends to be reassurance rather than control.
Seven safety switches
1) Minimum access
Start with authority: only give AI the access it needs. If it drafts, it should not be able to send. If it reads, it should not be able to write.
Access defines the worst plausible outcome. Minimum access turns a serious mistake into a contained one.
Security teams call this least privilege. The US National Institute of Standards and Technology definition of least privilege is concise and applies equally to processes acting on behalf of users.
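A minimal sketch of least privilege, assuming a flat permission model with hypothetical permission names like `email:draft`:

```python
class Agent:
    """An agent holds only the permissions it was explicitly granted."""

    def __init__(self, permissions: set[str]):
        self._permissions = frozenset(permissions)

    def can(self, permission: str) -> bool:
        return permission in self._permissions

# A drafter reads and drafts; "email:send" was never granted, so the
# worst plausible outcome is a bad draft, not a bad send.
drafter = Agent({"email:read", "email:draft"})
```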
2) Limit the size of a change
Once authority is bounded, limit the impact. Limit how much it can change in one run: one record, not thousands; one asset, not a fleet.
Bulk errors are where most damage lives. A common pattern is a bulk update that quietly applies the wrong rule, or an outbound message sent to the wrong segment. The first few actions look fine. The thousandth is where the problem becomes visible.
Small batches are how you stay in control long enough to learn.
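A sketch of a hard batch cap, assuming a hypothetical `apply_one` callback that performs a single change:

```python
def apply_in_batches(items, apply_one, batch_limit=25):
    """Refuse any run larger than batch_limit outright; a small cap
    keeps a wrong rule from reaching thousands of records."""
    if len(items) > batch_limit:
        raise ValueError(
            f"run of {len(items)} items exceeds batch limit of {batch_limit}"
        )
    return [apply_one(item) for item in items]
```

Refusing outright, rather than silently truncating, forces someone to ask why the run was so large before it happens.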
3) Limit the speed of change
Size limits contain impact. Speed limits contain time. Put caps on how quickly it can act: actions per minute/day, spend caps, and batch-size limits (how many items per run), tightened for higher-impact actions.
Speed limits create time to notice and stop a problem before it becomes a cascade. The harder an action is to reverse (payments, deletions, customer communications), the more you want friction.
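A sketch of a speed cap using a simple fixed-window counter (production systems often prefer token buckets, but the effect is the same: excess actions are refused, which buys time to notice):

```python
import time

class RateLimiter:
    """Allow at most max_actions per window_seconds; refuse the rest."""

    def __init__(self, max_actions: int, window_seconds: float = 60.0):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._window_start = time.monotonic()
        self._count = 0

    def try_acquire(self) -> bool:
        now = time.monotonic()
        if now - self._window_start >= self.window_seconds:
            self._window_start, self._count = now, 0  # new window
        if self._count < self.max_actions:
            self._count += 1
            return True
        return False  # over the cap: queue, alert, or drop
```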
4) Human approval for high-impact steps
After you’ve bounded authority, size, and speed, decide where human judgment must remain in the loop.
Use human review where it changes the outcome: new action types, high-value actions, regulated or reputational areas, and large-batch effects (payments, bulk messages, deletions, configuration changes).
Human review does not scale to every action. If everything requires approval, approvals become routine, and routine approvals become a risk in their own right.
Make review fast by providing a short packet: proposed action, key inputs, reason, what happens next, and how to undo it.
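The packet can be a plain structure; the field names and example values below are illustrative:

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ReviewPacket:
    """Everything a reviewer needs on one screen."""
    proposed_action: str
    key_inputs: dict
    reason: str
    next_step: str
    undo: str

packet = ReviewPacket(
    proposed_action="refund order 1042 in full",
    key_inputs={"order_id": 1042, "amount": 120.00},
    reason="duplicate charge detected by reconciliation check",
    next_step="refund issued via payments provider",
    undo="re-invoice the order; reversible for 30 days",
)
```

If the reviewer cannot fill the `undo` field, that is itself a signal the action may not belong on an automated path.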
5) Basic checks before acting
Human approval is not a substitute for basic correctness checks. Add basic checks that block obvious bad actions: completeness, ranges, duplicates, invalid targets, and hard-rule violations.
Examples:
- setpoints: unit and range checks
- outbound messages: block “send to all” without explicit approval
- purchasing: vendor allow lists and threshold approvals
A large share of failures are preventable at this level of discipline.
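As an illustration, the setpoint example as a pre-action check (the units and thresholds here are hypothetical):

```python
def check_setpoint(value: float, unit: str,
                   lo: float = 0.0, hi: float = 100.0,
                   expected_unit: str = "celsius") -> list[str]:
    """Return a list of violations; an empty list means the
    action may proceed."""
    errors = []
    if unit != expected_unit:
        errors.append(f"unexpected unit {unit!r}, expected {expected_unit!r}")
    if not lo <= value <= hi:
        errors.append(f"value {value} outside allowed range [{lo}, {hi}]")
    return errors
```

Returning every violation, rather than failing on the first, gives reviewers the full picture in one pass.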
6) Clear records of actions
Checks and approvals only help if you can reconstruct what happened. Keep records a human can use: what happened, why, who approved it (a person or a pre-set rule under a threshold), what checks ran, what inputs mattered, and what happened next. A record like that is audit-friendly: the what, the why, and the approval are easy to review later, in one place.
If you cannot explain why an action happened, you will struggle to contain it and stop repeats.
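A sketch of one audit entry, written as one JSON line per action (the field names are illustrative):

```python
import datetime
import json

def record_action(log: list, action: str, approved_by: str,
                  checks_ran: list, inputs: dict, outcome: str) -> dict:
    """Append one audit entry covering the fields listed above."""
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "action": action,
        "approved_by": approved_by,   # a person, or a pre-set rule
        "checks_ran": checks_ran,
        "inputs": inputs,
        "outcome": outcome,
    }
    log.append(json.dumps(entry))     # append-only: never edit past entries
    return entry
```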
7) Safe mode, and practise it
All of the above reduces the chance of failure. Safe mode reduces the cost of failure.
Define safe mode in advance and rehearse it. A practical default is disabling actions while keeping low-risk assistance (pause sending and updates, keep summarising and drafting) and handing off to humans.
Decide who can trigger safe mode, and practise it like an incident drill.
In software resilience this maps closely to the circuit breaker pattern described by Martin Fowler: stop calling a failing path before it drags everything else down.
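A sketch of the default described above: when safe mode is on, actions are blocked while low-risk assistance keeps working (the capability names are illustrative):

```python
class SafeMode:
    """Manual kill switch: block actions, keep low-risk assistance."""

    LOW_RISK = {"summarise", "draft"}

    def __init__(self):
        self.active = False

    def trigger(self) -> None:
        """Callable by the people authorised to pull the switch."""
        self.active = True

    def allowed(self, capability: str) -> bool:
        if self.active:
            return capability in self.LOW_RISK
        return True
```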
Five questions before enabling actions
If you can answer these cleanly, you are usually in a controllable place:
- What actions can the AI take, in plain language?
- With its current access, what is the worst plausible outcome?
- How wide and how fast can a mistake spread?
- Which steps require a human decision, and what does that person see?
- If it starts behaving badly, how do you stop it and recover in minutes?
If those five questions are hard to answer, the system is moving faster than the controls around it.
Closing
Action-taking AI is automation with a conversational interface. The interface is new. The engineering discipline is not.
Keep mistakes small. Slow down repetition. Make recovery routine.
Before you allow AI to take actions in your environment, decide which two switches are non-negotiable for you (for many teams: minimum access and safe mode), and what you would add for your domain.
For a structured risk lens that aligns with this approach, the NIST AI Risk Management Framework is a solid reference.