I made Claude Code my daily driver for infra work

Not a toy. The actual setup I use to run cloud and homelab work through an AI agent every day — what works, what to gate, and where it still bites.

For about six months, the terminal I keep open all day hasn’t been a shell — it’s been an agent. Claude Code went from “neat for one-off scripts” to the thing I drive most of my infrastructure work through. This is the setup, minus the hype.

The rule that makes it work

One principle underneath everything: the agent does the labour, I keep the judgement.

That sounds obvious until you watch someone hand an agent the keys and walk away. The teams getting burned are the ones treating it like a vending machine. The ones getting leverage treat it like a very fast, very literal junior who needs a clear brief and a review gate.

So I never let it touch anything irreversible without a checkpoint. Branch, not main. Plan, then execute. Show me the diff. That friction is the whole point — it’s where my experience gets injected, and it’s what keeps a confident-but-wrong suggestion from becoming an outage.
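Part of that checkpoint can be mechanised. What follows is a minimal sketch, assuming a Git repository; the `PROTECTED` set and the function names are hypothetical, chosen for illustration, and are not part of Claude Code. It refuses to proceed on a protected branch and otherwise returns the pending diff for a human to read.

```python
import subprocess

# Hypothetical set of branches the agent must never work on directly.
PROTECTED = {"main", "master", "production"}


def is_protected(branch: str) -> bool:
    """Return True if the branch name is one the agent must not commit to."""
    return branch.strip() in PROTECTED


def current_branch(repo_dir: str = ".") -> str:
    """Ask Git for the checked-out branch name ("HEAD" when detached)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def checkpoint(repo_dir: str = ".") -> str:
    """Fail on a protected branch; otherwise return the uncommitted diff."""
    branch = current_branch(repo_dir)
    if is_protected(branch):
        raise RuntimeError(f"refusing to work on protected branch {branch!r}")
    diff = subprocess.run(
        ["git", "diff", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return diff.stdout
```

Running something like `checkpoint()` before approving a change keeps "branch, not main" and "show me the diff" from depending on memory alone.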

What it’s genuinely good at

  • Boilerplate infra. Bicep modules, Ansible playbooks, the tenth variation of a deployment script. It writes the draft, I correct the 10% that matters.
  • Reading unfamiliar code fast. “What does this module actually do, and what depends on it?” across a repo I’ve never seen — minutes, not an afternoon.
  • The boring glue. Migrations, test scaffolds, wiring a CLI together. The stuff that’s not hard, just tedious, which is exactly where my attention leaks.

What I gate hard

  • Anything that deletes, migrates, or rewrites history. It proposes; I approve.
  • Anything touching secrets or production. The agent doesn’t get prod credentials, full stop.
  • “Clever” refactors. Nine times out of ten the simple version it skipped past was correct.
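The first two gates above can be roughed out as a pre-flight check on any command the agent proposes. This is a sketch under stated assumptions: the patterns are illustrative rather than exhaustive, the `gate` function and its three outcomes are hypothetical names, and a string match is no substitute for withholding production credentials in the first place.

```python
import re

# Illustrative patterns only; a real list would be tuned per environment.
DENY = [
    r"\bprod(uction)?\b",          # anything naming production
    r"\.env\b|secret|credential",  # anything that smells like secrets
]
ASK = [
    r"\brm\s+-[a-z]*r",                     # recursive delete
    r"\bgit\s+push\b.*(--force|\s-f\b)",    # rewrites remote history
    r"\bgit\s+(rebase|reset\s+--hard)\b",   # rewrites local history
    r"\b(drop|truncate)\s+table\b",         # destructive SQL
    r"\bmigrat(e|ion)\b",                   # schema or data migrations
]


def gate(command: str) -> str:
    """Classify a proposed command as 'deny', 'ask' or 'allow'."""
    text = command.lower()
    if any(re.search(pattern, text) for pattern in DENY):
        return "deny"
    if any(re.search(pattern, text) for pattern in ASK):
        return "ask"
    return "allow"


for cmd in ["ls -la", "rm -rf build/", "kubectl apply -f k8s/ --context prod"]:
    print(f"{gate(cmd):5}  {cmd}")
```

The deny list is checked first so that a command touching both production and a destructive verb is refused outright instead of being offered for approval.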

The setup, concretely

A CLAUDE.md in every project that encodes the conventions — stack, shell, what not to do — so I’m not re-explaining context every session. Skills for the workflows I repeat. A permission posture that’s permissive on reads and tight on writes. And a habit of asking for a plan before any multi-step change, so I can catch a bad approach before it’s 200 lines deep.

Where it still bites

It will state something wrong with total confidence. It will occasionally “fix” a failing test by deleting the assertion. It does not know what it doesn’t know. None of that is a dealbreaker; it’s just the reason the human gate is non-negotiable.

The honest summary: it didn’t replace me, it deleted my worst hours. The tedious 40% that used to eat the day is mostly gone, and what’s left is the part that actually needed a brain.

That trade is the whole game.
