My Agentic Engineering Stack

How I actually build with AI agents — the tools, the workflow, and why communication precision matters more than code.

The Shift

Most people think agentic engineering is about letting AI build your app. That’s not the shift. The real shift is learning how to think with the agent.

I use “agentic engineering” to mean the full loop: planning, communicating intent, executing with AI agents, verifying output. Not just “prompting better.”

The bottleneck is no longer code. It’s communication precision. How clearly you can express intent, constraints, taste, sequencing. The agent builds exactly what your thinking allows. Nothing more. Nothing less.

This is the stack I actually use. Not theoretical. This is how I ship.


Integration Layer — Agent Browser

I break the work into three layers: integration, taste, and backend logic. Each one has a different tool because each one demands a different kind of precision.

For frontend integration, I use Agent Browser from Vercel Labs.

Agent Browser lets the agent crawl the DOM directly. It sees what the user sees: rendered HTML, hydrated React components, visible data. Think of it as giving your agent eyes on the actual page. It lets the agent verify whether the API integrated correctly, whether the frontend rendered data as expected, whether the right output is showing up in the UI. For static outputs, pure React, clean HTML, it just works.
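The kind of check described here can be sketched without any browser tooling. The snippet below is a minimal illustration, not Agent Browser's interface: `VisibleText` and `missing_from_page` are hypothetical names, and the page is assumed to arrive as an already rendered HTML string. It collects the page's text nodes and reports which values from an API response never made it into the UI.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def missing_from_page(rendered_html, expected_values):
    """Return the expected values that do not appear in the page text."""
    parser = VisibleText()
    parser.feed(rendered_html)
    parser.close()
    page_text = " ".join(parser.chunks)
    return [str(v) for v in expected_values if str(v) not in page_text]


# Hypothetical rendered page and the values the API was supposed to supply.
html = "<div><h1>Ada</h1><script>var x='hidden'</script><p>42 orders</p></div>"
missing = missing_from_page(html, ["Ada", 42, "hidden"])
```

Here `missing` contains only `"hidden"`, because that string lives inside a script tag rather than in the rendered text. A check like this says nothing about CSS visibility or layout, which is where a real browser view matters.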

The limitation is taste. Agents can verify that data rendered. They can’t tell you whether the animation feels right, whether the spacing breathes, whether the interaction has the right weight. That’s still a human judgment call.

This is the gap most “AI replaces developers” takes miss entirely. Verification of correctness is solvable. Verification of taste is not.


Taste Layer — agentation.dev

This is where agentation.dev comes in.

agentation.dev lets you annotate design intent directly into the agent workflow: motion expectations, UI feel, visual hierarchy, interaction patterns. You’re essentially encoding your aesthetic instincts into something the agent can reason about.

And because of MCP integrations, the feedback loop into Claude Code is fast. Annotate, run, see the result, adjust. The cycle time between “this doesn’t feel right” and “try this instead” compresses dramatically.


The Core Engine — Claude Code

When I say agentic engineering, I mostly mean Claude Code workflows. That’s the environment everything plugs into.

Claude Code is where the actual engineering conversation happens. Agent Browser and agentation feed into it. GSD planning specs get consumed by it. It’s the orchestration point.

The key advantage isn’t code generation. It’s that you can treat engineering like a conversation about systems. You’re not writing code. You’re describing architecture. And the agent executes.

The frontend tools feed their output back into Claude Code: Agent Browser for integration verification, agentation for taste. It’s also where the backend workflow begins.


Backend Planning — GSD

For backend work, I don’t rush. I get obsessively granular.

I use GSD to generate planning specs, and I go as fine-grained as humanly possible.

GSD produces a planning folder with context files for every single phase in the product plan. A typical backend milestone might have 8–12 phases, each with its own detailed context, planning document, and validation criteria, plus execution instructions that leave no room for ambiguity.

Sometimes I spend 48 hours just on the planning. Not coding. Planning.

Why? Because once the agent understands the system clearly, execution becomes trivial. The planning folder becomes the agent’s mental model. Everything it needs to just build.
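One way to picture a planning folder that "leaves no room for ambiguity" is as data that can be checked before anything runs. The sketch below is an assumption for illustration only: `PhaseSpec` and `spec_gaps` are hypothetical names and do not reflect GSD's actual file layout. It flags any phase that lacks context, execution instructions, or validation criteria.

```python
from dataclasses import dataclass, field


@dataclass
class PhaseSpec:
    """Hypothetical in-memory view of one phase in a planning folder."""
    number: int
    title: str
    context: str = ""
    instructions: list = field(default_factory=list)
    validation: list = field(default_factory=list)


def spec_gaps(phases):
    """Return human-readable gaps that should block execution."""
    gaps = []
    numbers = sorted(p.number for p in phases)
    if numbers != list(range(1, len(phases) + 1)):
        gaps.append(f"phase numbers are not 1..{len(phases)}: {numbers}")
    for p in phases:
        label = f"phase {p.number} ({p.title})"
        if not p.context.strip():
            gaps.append(f"{label}: missing context")
        if not p.instructions:
            gaps.append(f"{label}: missing execution instructions")
        if not p.validation:
            gaps.append(f"{label}: missing validation criteria")
    return gaps


# Phase 1 is fully specified; phase 2 is only a title so far.
plan = [
    PhaseSpec(1, "schema", "Postgres tables for users",
              ["create migrations"], ["migrations apply cleanly"]),
    PhaseSpec(2, "api"),
]
gaps = spec_gaps(plan)
```

With this plan, `gaps` lists three problems, all for phase 2. An empty list is the signal that the spec is complete enough to hand over.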


Product Context — Robin

But the planning has to come from somewhere. That’s where Robin fits in.

Robin works as my second brain and PM. It aggregates client meetings, engineering notes, feature ideas, product vision, and architecture decisions, all organized into threads. When I feed that context into the agentic workflow, the agent doesn’t start cold from a blank prompt. It starts with the full mental model of the product.

I call this a warm start for agentic engineering. The agent isn’t guessing at requirements. It already has the conversation history, the constraints, the decisions that were already made. That’s the difference between generating code and building a system.
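Mechanically, a warm start amounts to assembling existing notes into a context preamble before the first instruction is given. The sketch below assumes the threads are available as a plain dictionary of note strings. That shape and the `build_warm_start` name are hypothetical, not Robin's export format.

```python
def build_warm_start(threads):
    """Join thread notes into one context preamble.

    threads: dict mapping a thread name to a list of note strings.
    Empty notes and empty threads are dropped.
    """
    sections = []
    for name in sorted(threads):
        notes = [n.strip() for n in threads[name] if n.strip()]
        if notes:
            body = "\n".join(f"- {n}" for n in notes)
            sections.append(f"[{name}]\n{body}")
    return "\n\n".join(sections)


# Hypothetical threads; the "ideas" thread has nothing in it yet.
threads = {
    "client meetings": ["Launch in Q3"],
    "architecture": ["Use Postgres", "  "],
    "ideas": [],
}
preamble = build_warm_start(threads)
```

The resulting `preamble` has one bracketed section per non-empty thread, sorted by name, and would be placed ahead of the phase instructions in the agent's context.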


Execution Principles

Three rules I don’t break.

Sequential, not parallel. I run phases one at a time. Phase 1, then Phase 2, then Phase 3. Each phase updates system state and context before the next one begins. People who try to parallelize agent execution wonder why the output is inconsistent. This is why.

It’s slower, yes. But the reliability is dramatically higher. Agents that execute in parallel can’t account for state changes from earlier phases. Sequential runs compound context.
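The difference is easy to see in a toy model. In the sketch below, each phase is a plain function that reads the shared context and returns its updates. `run_sequentially` and the two example phases are hypothetical stand-ins for real agent runs, under the assumption that a phase's output can be expressed as a dictionary.

```python
def run_sequentially(phases, initial_context):
    """Run phases in order, folding each phase's updates into the context.

    phases: list of (name, run_phase) pairs, where run_phase(context)
    returns a dict of updates.
    """
    context = dict(initial_context)
    completed = []
    for name, run_phase in phases:
        updates = run_phase(dict(context))  # each phase gets its own copy
        context.update(updates)
        completed.append(name)
    return context, completed


def schema_phase(context):
    # Phase 1 decides which tables exist.
    return {"tables": ["users"]}


def api_phase(context):
    # Phase 2 can only build endpoints for tables it knows about.
    return {"endpoints": [f"/{t}" for t in context.get("tables", [])]}


final, completed = run_sequentially(
    [("schema", schema_phase), ("api", api_phase)],
    {"project": "demo"},
)

# What the API phase would produce if launched alongside phase 1,
# seeing only the initial context.
parallel_view = api_phase({"project": "demo"})
```

Run in order, the API phase sees the `users` table and `final["endpoints"]` is `["/users"]`. Started from the initial context alone, the same phase returns an empty endpoint list, which is the inconsistency described above.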

Extreme granularity. I choose the library. I choose the functions. I choose the method execution order. Batch vs sequential jobs. Validation logic. Testing strategy. The agent writes the code, but the architecture decisions come from the conversation. Every single one.

This might sound like micromanagement, but it’s the opposite. The more precise the spec, the more autonomous the agent becomes during execution. Ambiguity is what causes agents to hallucinate architecture.

Gray area exploration. Before execution, I always tell the agent: explore the gray areas in this plan and ask me questions. Then I wait. The agent probes edge cases, architecture gaps, missing assumptions, validation failures. We go back and forth until no gray areas remain. Only then does the work move from research to execution.
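The back-and-forth can be framed as a loop with a hard gate at the end. This is a sketch under stated assumptions: `ask_agent` and `answer` are hypothetical callables standing in for the agent's questions and the human's replies, and `stub_agent` exists only so the example runs on its own.

```python
def resolve_gray_areas(plan, ask_agent, answer, max_rounds=10):
    """Loop until the agent has no open questions about the plan.

    ask_agent(plan, decisions) returns a list of open questions.
    answer(question) returns the human's decision as a string.
    """
    decisions = {}
    for _ in range(max_rounds):
        questions = [q for q in ask_agent(plan, decisions) if q not in decisions]
        if not questions:
            return decisions  # no gray areas left, safe to execute
        for q in questions:
            decisions[q] = answer(q)
    raise RuntimeError("gray areas remain after max_rounds; do not execute")


def stub_agent(plan, decisions):
    # The second question only surfaces once the first is settled.
    if "Which queue library?" not in decisions:
        return ["Which queue library?"]
    if "Retry policy?" not in decisions:
        return ["Retry policy?"]
    return []


answers = {"Which queue library?": "Celery", "Retry policy?": "3 attempts"}
decisions = resolve_gray_areas("backend plan", stub_agent, answers.get)
```

The stub shows why one pass is not enough: answering the first question exposes a second. Execution is gated on the loop returning, and it raises rather than proceeding if questions keep appearing.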


The Stack, Summarized

Agent Browser for integration verification. Agentation for taste. Claude Code as the engine. GSD for backend planning. Robin for product context. Sequential execution. Gray area exploration before every run.

The whole thing is lean. And I open-source my Claude files so I can spin up new machines and have this process running without hiccups.

The real lesson isn’t about the tools. It’s about thinking more clearly than the machine. The better your system thinking, the better your agents perform. The best agentic engineers I know spend far more time planning than coding. Because once the system is clear, the agents just build.
