Jul 30, 2025
I learned this the hard way: when someone else pays the cost of understanding your AI-generated content, it isn't real productivity.
I generated a Dagster pipeline plan in 6 minutes. My colleague Sam spent nearly an hour trying to review it. Here's what the AI gave me:
> Pipeline Overview: Implement staged rollout with A/B testing framework, targeting 5% initial traffic with automatic failover mechanisms and comprehensive monitoring across three deployment zones...
And here's what Sam actually needed to see:
- **What we're building:** Basic ETL pipeline to process user events
- **Unknown:** How do we want to handle failures? (needs an architecture decision)
- **Next step:** Sam, can you review the data schema before I implement?
Claude Code recently added sub-agents: specialized AI assistants you can create for specific workflows. Most of the examples I see, though, are generic role-based agents: `backend-architect`, `frontend-developer`, `payment-integrator`. They try to help with any task in their domain.
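Mechanically, a sub-agent is just a Markdown file with YAML frontmatter that Claude Code picks up (typically from `.claude/agents/`) and delegates to based on its description. A typical role-based agent looks roughly like this; the wording is illustrative, not a real published example:

```markdown
---
name: backend-architect
description: Helps with backend architecture, API design, database schemas, and performance work.
tools: Read, Grep, Glob
---
You are a senior backend architect. Help with any backend-related task:
API design, data modeling, caching strategy, deployment, and performance tuning.
Give thorough, expert-level recommendations.
```

The `description` is what Claude Code uses to decide when to hand work off to an agent, which is why narrow, trigger-like descriptions pay off later.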
But sub-agents become powerful when they solve specific human problems, not when they try to be universal assistants.
I built three agents that work together to solve the hidden cost problem. Here's how they collaborate when I type "ready for PR":
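In broad strokes, the orchestration lives in the pr-creator's definition. Here's a simplified sketch (the wording is illustrative, not my actual prompt, and the exact chaining depends on how you wire your agents together):

```markdown
---
name: pr-creator
description: Produces reviewer-ready PR descriptions. Use when the user says "ready for PR".
---
When the user says a change is ready for PR:
1. Summarize the diff in plain language: what we're building and why.
2. Run the strategic-advisor check: does this change match what we actually decided to build? Flag any drift.
3. Apply the reviewer-first format: what we're building, what's still unknown, and the single next step for the reviewer.
Never pad the description with generated detail the reviewer didn't ask for.
```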
When I used this system on a real bug fix, what came out wasn't three separate reports. This isn't three separate tools; it's one system. The pr-creator doesn't just generate descriptions; it uses strategic-advisor to ensure alignment and reviewer-first to format output that respects your teammates' time.
A `backend-architect` agent could theoretically help with database design, API patterns, performance optimization, deployment strategies—anything "backend-related."
But when an agent can help with "anything backend," it doesn't deeply solve any specific workflow friction.
Think about a Swiss Army knife. It has a blade, scissors, screwdriver, can opener, tweezers—everything you might need. But when you actually need to cut something important, you reach for a real knife. Need to tighten a screw? You grab a proper screwdriver. The Swiss Army knife's flexibility makes it mediocre at each individual task.
Generic scope means generic value.
The same applies to agents. A "coding assistant" that can "help with any development task" will be mediocre at all of them. But an agent that only creates PR descriptions? It can be exceptional at that one thing.
Wait. Isn't `backend-architect` already narrower than a foundational LLM?
Yes. And that's exactly the point.
Think of it as a spectrum of specificity:

- A foundational LLM can attempt anything, in any domain.
- A role agent like `backend-architect` narrows to one domain, but still tries to cover every task inside it.
- A task agent like a PR-description writer narrows to one workflow.
Each step down trades flexibility for excellence at specific problems. The magic happens when you start with real UX pain and work backward to the right level of narrowness.
My team's biggest friction wasn't "we need better backend architecture". It was "we spend an hour trying to understand AI-generated PR descriptions." So I built an agent that solves exactly that problem, exceptionally well.
I didn't build all three agents at once. I started with the problem that hurt most—teammates spending an hour trying to understand my AI-generated content.
Look for places in your workflow where:

- something is fast for you to generate but slow for someone else to understand
- the same small, annoying task comes back every week
- a handoff (a review, a status update, a decision) keeps creating confusion or rework
Start with one agent. Build it well. Then look for opportunities to compose.
My reviewer-first agent became more powerful when I added strategic-advisor to check alignment. The pr-creator became a system orchestrator when it learned to call both other agents.
Start tomorrow: Pick the most annoying 5-minute task your team does repeatedly. Build an agent that triggers on specific phrases and solves that exact problem. Don't try to build a "coding assistant". Build something that eliminates one recurring frustration.
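That first agent can be tiny. Here's a hypothetical starting point for a team that dreads writing changelog entries (the name, trigger phrase, and wording are all made up for illustration):

```markdown
---
name: changelog-entry
description: Writes a short changelog entry for the change in progress. Use when the user says "log this change".
---
Write a single changelog entry for the change in progress:
one sentence on what changed, one on who it affects, and one on anything they need to do.
No marketing language, no detail nobody asked for.
```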
Once it works, look for the next friction point. Build another focused agent. Then find ways to connect them.
The results? My team now spends 10 minutes reviewing PRs instead of an hour. Strategic decisions reference actual data instead of guesswork. And I stopped creating work for other people with my AI experiments.
Agent composition isn't just more powerful. It's more respectful of human attention.