What AI Agents Should Never Do on Their Own

Contents

- What the agent should never touch alone
- The categories that should always require a human
- AGENTS.md: write the contract
- The two-agent loop
- The final report
- The unglamorous work

Most of what gets written about AI agents focuses on what they can do.

Autonomy gets framed as the goal: give them tools, give them access, let them run.

The more freedom, the better the output.

That framing is mostly accurate. I use agents daily. They’ve genuinely increased my output. I’m a believer!

And I’ve also lost two hours of work through an agent that was doing exactly what I asked.

I was working on a feature branch cleanup.

The task description said “remove unused files and clean up the repo.” The agent interpreted “unused” broadly, deleted a config directory I hadn’t touched in months but still referenced from the deploy script, and kept going.

I caught it during the diff review. The config wasn’t in version control. Two hours reconstructing it from memory and git history.

The task was clear and the agent followed instructions. The only problem was that nothing told it where to stop.

Knowing which tasks to gate is part of running agents well. Give them full freedom on the wrong category and you’ll spend the afternoon undoing what took them thirty seconds.

Hey there! My name is Sara Nóbrega and I teach you how to become an AI power user on Learn AI. Free to subscribe!

What the agent should never touch alone

Some tasks are reversible. For example, a refactored function can be reverted or a new unit test can be removed. The cost of a mistake is low.

Recovery cost varies by task. Undoing a refactored function takes seconds: you revert the commit. A dropped production table might take your entire week, if recovery is even possible.

The question before you run a task: can this be undone?

If yes, let the agent move. If no, add a checkpoint before it runs.

Here’s the permission matrix I work from:

Table showing recommended agent autonomy levels and human review requirements by task type. Small refactors and unit tests can have high agent autonomy, while API changes, dependencies, migrations, security, infrastructure, and production deployment require increasing levels of human review. Image by Author and ChatGPT.
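The matrix can also live in code, so a task runner can look up the required level before an agent starts. This is a minimal sketch: the task types come from the matrix above, while the three-level scale, the level assigned to each type, and the names `ReviewLevel` and `required_review` are assumptions made for illustration.

```python
from enum import IntEnum


class ReviewLevel(IntEnum):
    """Assumed three-level scale; the original matrix may use different labels."""
    NONE = 0       # agent may proceed on its own
    REVIEW = 1     # a human reviews the diff before merge
    APPROVAL = 2   # a human approves before the agent runs anything


# Task types follow the matrix; the level given to each one is an assumption.
PERMISSION_MATRIX = {
    "small_refactor": ReviewLevel.NONE,
    "unit_tests": ReviewLevel.NONE,
    "api_change": ReviewLevel.REVIEW,
    "dependencies": ReviewLevel.REVIEW,
    "migration": ReviewLevel.APPROVAL,
    "security": ReviewLevel.APPROVAL,
    "infrastructure": ReviewLevel.APPROVAL,
    "production_deploy": ReviewLevel.APPROVAL,
}


def required_review(task_type: str) -> ReviewLevel:
    """Look up a task type; anything unknown falls back to the strictest level."""
    return PERMISSION_MATRIX.get(task_type, ReviewLevel.APPROVAL)
```

Defaulting unknown task types to the strictest level follows the same rule as the rest of this section: if you can't tell whether something can be undone, add a checkpoint.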

The categories that should always require a human

Some categories require a human checkpoint regardless of how well-specified the task is.

The risk of a mistake is too high, and the recovery cost too steep, to let an agent decide on its own.

What AI Agents should not tackle alone, part 1. Image by Author and ChatGPT, generated with DALL-E.

1. Destructive file operations

`rm -rf`, `git clean -fd`, `git reset --hard`.

These delete or discard work that may not be recoverable.

An agent will run them if the task description implies cleanup.

I’ve had one run `git clean -fd` in the middle of a refactor because the task said “clean up temporary files.”

My uncommitted work was gone. There was no malfunction, as the agent did exactly what the words said. The safeguard is an explicit block list with a confirmation step, not trusting the agent to infer where “clean up” ends.
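One way to wire up a block list with a confirmation step is a small gate between the agent and the shell. The sketch below is illustrative only: `is_destructive` and `run_gated` are hypothetical names, it covers just the three commands named above, and it inspects a single simple command, so chained commands (`a && b`) would need to be split before checking.

```python
import shlex


def is_destructive(command: str) -> bool:
    """Return True for rm -rf, git clean -f..., and git reset --hard."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    prog, args = tokens[0], tokens[1:]
    # Collect single-letter flags, so "-rf", "-fr" and "-r -f" are treated alike.
    short_flags = {
        ch
        for arg in args
        if arg.startswith("-") and not arg.startswith("--")
        for ch in arg[1:]
    }
    if prog == "rm":
        recursive = bool(short_flags & {"r", "R"}) or "--recursive" in args
        force = "f" in short_flags or "--force" in args
        return recursive and force
    if prog == "git" and args:
        if args[0] == "clean":
            return "f" in short_flags or "--force" in args
        if args[0] == "reset":
            return "--hard" in args
    return False


def run_gated(command: str, confirm=input) -> bool:
    """Return True if the command may run; destructive ones need a typed 'yes'."""
    if not is_destructive(command):
        return True
    answer = confirm(f"Agent wants to run '{command}'. Type 'yes' to allow: ")
    return answer.strip().lower() == "yes"
```

The `confirm` parameter defaults to `input`, so a human has to type the approval; passing a different callable makes the gate testable without a terminal.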

2. Database writes and migrations

Any DELETE without a WHERE clause, any DROP or TRUNCATE, any schema migration touching production data.

A typo in a WHERE clause can wipe a table. A migration that runs out of order can corrupt data that’s impossible to reconstruct. Always review before running.
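A coarse pre-check can at least catch the statements listed above before they reach a database. This is a sketch under clear limits: it uses regular expressions rather than a SQL parser, the function name `needs_review` is hypothetical, and it cannot tell whether a WHERE clause is correct, which is why the human review still matters.

```python
import re

# Statement-level patterns; this is not a SQL parser and can be fooled by
# comments, string literals, or subqueries.
_DROP_OR_TRUNCATE = re.compile(r"^\s*(DROP|TRUNCATE)\b", re.IGNORECASE)
_DELETE = re.compile(r"^\s*DELETE\s+FROM\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b", re.IGNORECASE)


def needs_review(statement: str) -> bool:
    """Flag DROP, TRUNCATE, and DELETE statements that have no WHERE clause."""
    if _DROP_OR_TRUNCATE.match(statement):
        return True
    if _DELETE.match(statement) and not _WHERE.search(statement):
        return True
    return False
```

A `DELETE` with a mistyped WHERE clause passes this check, so treat it as a first filter and not as a replacement for reading the statement.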

3. Cloud infrastructure

`terraform apply`, `kubectl delete`, `aws iam *`, `gcloud iam *`.

Infrastructure changes affect live systems and often other teams. Permissions changes are especially dangerous because the damage can be invisible until something fails.

What AI Agents should not tackle alone, part 2. Image by Author and ChatGPT, generated with DALL-E.

4. Production deployments

Any deployment to a production environment should go through a human review step, even if the code was agent-generated.

CI/CD pipelines can run agent output automatically, and that’s fine. The decision to deploy to production is yours.

You know what’s in flight, what incidents are open, what maintenance is scheduled. The agent doesn’t have any of that context, and it can’t ask for it mid-pipeline.

5. Auth and security logic

Authentication flows, authorization rules, token handling, session management.

Bugs here don’t show up in unit tests. They show up in incident reports, sometimes months later.

An agent writing auth logic will produce something that looks correct and passes the happy path.

The dangerous cases are the edge conditions: a token that doesn’t expire under a specific sequence of API calls, a route that bypasses middleware when a parameter is missing.

Those are exactly what unit tests miss and what security review catches. Every auth change needs a human who’s specifically looking for those gaps, not one who’s satisfied the happy path is covered.

6. Secrets, `.env` files, API keys

An agent reading or writing credentials creates exposure risk. Keep this category off-limits by default and handle it manually.
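If your agent tooling lets you intercept file access, "off-limits by default" can be a path check. The sketch below is an assumption-heavy illustration: `is_secret_path` is a hypothetical helper, and apart from `.env` files the file names in `EXTRA_SECRET_NAMES` are examples I picked, not a complete list.

```python
from pathlib import PurePosixPath

# Example names only (assumption); extend this for your own project.
EXTRA_SECRET_NAMES = {"credentials.json", "id_rsa"}


def is_secret_path(path: str) -> bool:
    """Return True for .env files (including .env.* variants) and listed names."""
    name = PurePosixPath(path).name
    if name == ".env" or name.startswith(".env."):
        return True
    return name in EXTRA_SECRET_NAMES
```

This deliberately flags harmless variants such as `.env.example` too; with secrets, a false alarm costs a confirmation click, while a miss costs a key rotation.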

`git push --force` sits in its own category because it rewrites history on the remote. Once pushed, other contributors’ local branches diverge. Recovery is painful and sometimes impossible.

Humans should be careful with all of these commands too. Agents just make them easier to trigger by accident, buried inside a longer sequence of otherwise safe steps.

AGENTS.md: write the contract

Give agents specific structure from the start. An AGENTS.md file at the root of your repo tells the agent what the project is, how to run it, and what it’s not allowed to touch without asking.

A vague AGENTS.md gets you an agent filling gaps with guesses. I learned this on a codebase that had no AGENTS.md at all.

The task was “organize the project structure.” The agent moved files across directories based on naming conventions that made sense to it. Everything that referenced those paths broke.

The task took the agent twenty minutes; the cleanup took me two hours. Three lines of scope constraints would have prevented it entirely.

Here’s the template I use:

````markdown
# AGENTS.md

## Project

[Brief description of the project and tech stack]

## Setup

```bash
# Install
npm install  # or pip install -r requirements.txt

# Run
npm run dev

# Test
npm test

# Lint
npm run lint
```

## Coding rules

- Make minimal changes. Don't refactor unrelated code.
- If behavior changes, add or update tests.
- Don't touch files outside the scope of the task.
- Keep diffs readable. One concern per commit.

## Safety rules

- Ask before running any command in blocked_commands.md.
- If you're unsure whether a command is safe, stop and ask.

## Definition of done

- Tests pass
- Diff is explainable in one sentence
- Final report provided (see below)

## Final report format

After every task, provide:

1. Summary of changes
2. Files changed
3. Tests run and result
4. Risks or assumptions
5. Anything not completed
````

The companion file, blocked_commands.md, lists exactly what needs human approval before running:

```markdown
# blocked_commands.md

## Destructive file operations

- rm -rf
- git clean -fd
- git reset --hard

## Git operations

- git push --force
- git push --force-with-lease

## Database operations

- DROP TABLE
- TRUNCATE TABLE
- DELETE without WHERE clause
- Any migration that alters a production schema

## Cloud / infrastructure

- terraform apply
- kubectl delete
- aws iam *
- gcloud iam *

## Secrets

- Any command reading or writing .env files
- Any command touching API keys or credentials
```

When the AGENTS.md is vague, the agent guesses. When it’s specific, the agent executes. The file is your contract: write it before you start the task, not after something breaks.
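Because blocked_commands.md is plain Markdown, a wrapper script can read it too, so the list the agent sees and the list your tooling enforces stay the same file. This is a rough sketch, not part of any agent framework: `parse_blocklist` and `matching_rule` are hypothetical helpers, and matching is a simple glob on the start of the command.

```python
from fnmatch import fnmatchcase


def parse_blocklist(markdown: str) -> dict[str, list[str]]:
    """Map each '## ' heading to the '- ' entries listed under it."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


def matching_rule(command: str, blocklist: dict[str, list[str]]):
    """Return (section, entry) for the first entry the command starts with."""
    for section, entries in blocklist.items():
        for entry in entries:
            # Appending "*" turns each entry into a prefix pattern;
            # entries such as "aws iam *" already contain a glob.
            if fnmatchcase(command, entry + "*"):
                return section, entry
    return None


SAMPLE = """\
# blocked_commands.md

## Destructive file operations
- rm -rf
- git clean -fd

## Git operations
- git push --force

## Cloud / infrastructure
- aws iam *
"""

blocklist = parse_blocklist(SAMPLE)
print(matching_rule("git push --force origin main", blocklist))
```

Entries written as prose, such as "DELETE without WHERE clause", will never match a real command this way; they still need a dedicated check or a human.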

Check out my two latest articles, where you can learn how to give your AI unlimited context and explore six common hard decisions AI Engineers need to make in production.

The two-agent loop

For anything medium-complexity or above, don’t use one agent. Use two.

Agent 1 implements. Agent 2 reviews. Then Agent 1 applies only the critical feedback.

Implementer prompt:

```text
You are a senior software engineer implementing a specific task.

Task: [describe the task]

Context: [link to AGENTS.md or paste relevant sections]

Rules:
- Make minimal changes.
- Stay in scope.
- Don't refactor unrelated code.
- Add tests if behavior changes.
- When done, provide a final report: summary, files changed,
  tests run, risks, anything incomplete.
```

Reviewer prompt:

```text
You are a code reviewer with no attachment to the implementation.

Review this diff: [paste diff]

Check for:
- Bugs and edge cases
- Missing tests
- Security issues
- Unintended behavior changes
- Anything outside the stated scope

Output:
- Critical issues (must fix)
- Minor issues (optional)
- Anything you'd flag for a human

Do not rewrite the code. Flag, don't fix.
```

The reviewer agent has no ego investment in the code. It looks for bugs, edge cases, test coverage, and security issues without trying to redo the work.

Code review is how you catch what you missed. The two-agent loop is the same process, automated.
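The control flow of the loop is small enough to sketch. The version below assumes each agent is wrapped in a plain Python callable; `ReviewResult`, `two_agent_loop`, and the callable signatures are hypothetical, and how you actually call your model is left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReviewResult:
    """Mirrors the three output buckets of the reviewer prompt."""
    critical: list[str] = field(default_factory=list)
    minor: list[str] = field(default_factory=list)
    flag_for_human: list[str] = field(default_factory=list)


# Assumed signatures: the implementer takes (task, critical feedback) and
# returns a diff; the reviewer takes a diff and returns a ReviewResult.
Implementer = Callable[[str, list[str]], str]
Reviewer = Callable[[str], ReviewResult]


def two_agent_loop(task: str, implement: Implementer, review: Reviewer):
    """Implement, review, then re-implement with only the critical issues."""
    diff = implement(task, [])
    result = review(diff)
    if result.critical:
        # Minor issues are deliberately not passed back to the implementer.
        diff = implement(task, result.critical)
    return diff, result
```

Returning the review alongside the final diff keeps the minor issues and the "flag for a human" items visible to you, even though the implementer never sees them.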

The final report

Require a final report for every agent task:

1. Summary of changes

2. Files changed

3. Tests run and result

4. Risks or assumptions

5. Anything not completed

This makes the agent accountable. If it can’t summarize what it did in clear terms, that’s a signal the task wasn’t clean.

It also builds up documentation without you writing it manually. The reports stack. When something breaks a week later, you can trace back exactly what changed and why.
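If the reports are going to stack up as documentation, it helps to give them a fixed shape and reject incomplete ones. This is a minimal sketch: the five fields mirror the list above, while the `FinalReport` class and the `missing_sections` helper are names I made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class FinalReport:
    summary: str
    files_changed: list[str]
    tests_run: str
    risks: str
    not_completed: str


def missing_sections(report: FinalReport) -> list[str]:
    """Return the names of sections the agent left empty."""
    return [f.name for f in fields(report) if not getattr(report, f.name)]


report = FinalReport(
    summary="Renamed helper and updated call sites",
    files_changed=["utils.py", "test_utils.py"],
    tests_run="pytest: 12 passed",
    risks="",              # left blank by the agent
    not_completed="none",
)
print(missing_sections(report))
```

Treating an empty field as missing forces the agent to write "none" explicitly, which is a small but useful signal that it actually considered the question.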

The unglamorous work

The hype around AI agents is here to stay, and mostly earned. They do increase your output.

The practitioners getting the most from them are the ones who did the setup work: wrote the AGENTS.md, thought through the permission levels, built the blocked commands list, set up the two-agent loop.

Agents work well when they have clear instructions. That part is on you.

Thanks for reading!

You can find me on LinkedIn and Substack, where I share more about AI and LLMs.