Coding Agents Made Me Take Specs Seriously

Everything changed very quickly.

In a matter of days, my work at DataCamp went from coding and debugging mostly by hand to doing a lot of it through Claude Code. Those long sessions where implementation felt manual, almost artistic, did not disappear completely, but they changed.

Looking back, the shift had been building gradually; then it became suddenly obvious.

With Leo's arrival, a lot of things changed in my personal life too. If you do not know what this is about, take a look at the first post in the series. Sudden context switching has become a skill I have had to develop by force. With a baby at home, uninterrupted work sessions do not exist. There is always a diaper change, a bottle, a bath, or some play time in the middle.

And my energy, of course, is not the same every day.

Luckily, Leo sleeps pretty well, but just like during the day, the actual usable time is fragmented. I can no longer count on having several hours in a row to get into a problem, understand it, implement it, test it, and close it. Now I need the work to survive constant interruptions.

That is where specs stopped feeling like a formality and became a survival tool.

The useful part, if you do not care about my exact setup, is this: agents get much better when the work has already been turned into something testable. Not a perfect document. Not a corporate requirements process. Just enough shape that the agent does not have to guess the product, the constraints, and the finish line at the same time.

Using the moment

This is where agents have allowed me to keep working on my side projects with some continuity. Over time, what started as a collection of prompts and personal habits became Harness: my own development workflow packaged as skills, documents, hooks, subagents, and small guardrails.

It did not start as a product. It started because simply sitting in front of the computer and writing code was no longer enough.

In practice, Harness is a collection of Claude Code skills: slash commands that know which files to read, what questions to ask, and what documents to produce. Each one can run without you sitting there; you come back to the output, review it, and feed the next step. The repo is public if you want to look at how it is structured.

Everything starts with an idea. It usually comes from one of my own needs: either professional (some tool I miss as a developer) or personal (some tool that could help me in my day-to-day life).

At first, this step was manual, with some Claude Code help to settle the idea. Now the flow starts with /ideate, which researches competitors, pain signals, and viability before I fall too much in love with it. If it still makes sense, /product-plan turns the vision into audience, positioning, roadmap, UX direction, and risks.

The idea is simple: instead of jumping from a vague intuition straight into implementation, Harness forces me to explain what I want to build, who it is for, why it makes sense, and what should stay out.

The output lives in .harness/product/: idea.md, product.md, roadmap.md, competitors.md, and a CONTEXT.md with the domain vocabulary. That last file sounds small, but it matters. If a concept has a name in the product, I want the agent to use that name in specs, code, and docs.

The next step is /dev-plan. Product specs become architecture, stack decisions, an implementation plan, ADRs, and one feature spec per must-have feature. Each feature file includes the goal, scope, technical approach, edge cases, and acceptance criteria.

So, before writing a single line of code, the project already has a clear shape: what we are building, who it is for, which decisions constrain the work, and how we will know when a feature is done. It is not always perfect, but it is good enough for an agent to start with context.

The difference looks small, but it matters. As a very simplified example, this is the kind of loose prompt that leaves the agent guessing:

Build the export flow and make sure it works well.

And this is the kind of spec that gives the agent a chance:

Build the export flow for SP-404 packs.

Constraints:
- A pack has at most 16 samples.
- Files must use the hardware profile naming convention.
- Export should not mutate the source samples.

Acceptance criteria:
- I can choose a hardware profile before exporting.
- The exported folder contains the expected file names.
- Invalid packs show a clear error before any files are written.

It is still not a huge document. But now there is something to verify, something to reject, and something to review.
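To make "something to verify" concrete, here is a minimal Python sketch of how the constraints and acceptance criteria above could translate into code. Everything in it is an assumption made for illustration: the names `Pack`, `export_pack`, and `InvalidPackError` are hypothetical, and `target_name` is only a placeholder, since the real hardware profile naming convention is not described here.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path

MAX_SAMPLES = 16  # from the spec: a pack has at most 16 samples


class InvalidPackError(ValueError):
    """Raised before any file is written (hypothetical error type)."""


@dataclass(frozen=True)
class Pack:
    name: str
    samples: tuple[Path, ...]


def target_name(index: int, profile: str) -> str:
    # Placeholder convention, NOT the real hardware profile rules.
    return f"{profile}_{index + 1:02d}.wav"


def export_pack(pack: Pack, profile: str, out_dir: Path) -> list[Path]:
    # Validate first, so an invalid pack fails before anything touches the disk.
    if len(pack.samples) > MAX_SAMPLES:
        raise InvalidPackError(
            f"pack '{pack.name}' has {len(pack.samples)} samples, max is {MAX_SAMPLES}"
        )

    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, sample in enumerate(pack.samples):
        destination = out_dir / target_name(index, profile)
        # Copy instead of move or rewrite, so the source samples are not mutated.
        shutil.copyfile(sample, destination)
        written.append(destination)
    return written
```

Each acceptance criterion maps to something a reviewer or a QA step can check: the profile is an explicit argument, the returned paths can be compared against the expected file names, and the validation runs before the output folder is even created.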

This planning phase usually takes me one or two naps. I would rather settle the concepts properly in a long conversation than change the strategy halfway through implementation. When time is so fragmented, improvising gets expensive.

Once I have those documents, the next step begins: implementation.

Learning to delegate

For implementation, I have tried several strategies and skills. The ones that stuck are the ones now built into Harness.

/implement reads the specs, classifies which features can be done AFK (away from keyboard, with nobody watching) and which ones need human approval, and implements the current phase as vertical slices. That part matters: it does not send one agent to do "the frontend" and another to do "the backend." Each agent implements one feature end to end, from the user-facing entry point down to whatever data layer it needs.

/qa comes after that and tests those features against their acceptance criteria. It does not stop at running the test suite. For an API, it can use curl; for a CLI, it can run commands with fixtures; for a web app, it can use Playwright. The point is to verify visible behavior.

There are also continuity skills: /update-docs syncs docs with the real state of the code, /next-step inspects the repo and recommends what should happen next, /handoff leaves a summary so the next session does not start blind, and /task handles small changes on shipped products.

Worktrees are still the main piece for parallelization. Several agents can work on the same repo, on the same machine, in different branches, without stepping on each other.

Magic.
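For anyone who has not used worktrees, this is roughly what the setup looks like, sketched in Python. The only real interface here is `git worktree add -b <branch> <path>`, which creates a new branch and checks it out in a separate directory. The helper names and the sibling-directory naming scheme are my own assumptions for the example, not how Harness does it.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, branch: str) -> list[str]:
    # One sibling directory per branch, e.g. ../app-feature-export.
    # The naming scheme is arbitrary and only an assumption for this sketch.
    worktree_dir = repo.parent / f"{repo.name}-{branch.replace('/', '-')}"
    # `git -C <repo>` runs the command as if started inside the repo.
    return [
        "git", "-C", str(repo),
        "worktree", "add", "-b", branch, str(worktree_dir),
    ]


def create_worktrees(repo: Path, branches: list[str]) -> None:
    # Each branch gets its own checkout, so agents do not share a working tree.
    for branch in branches:
        subprocess.run(worktree_add_command(repo, branch), check=True)
```

Calling `create_worktrees(Path("~/projects/app").expanduser(), ["feature/export", "feature/import"])` would leave two extra checkouts next to the main repo, each one ready for its own terminal session and its own agent.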

So, once the tools are on the table, the only thing left is to put the agents to work. I choose two projects and two roadmap items, open a terminal session and a worktree for each one, and start them before Leo wakes up.

The important part here is not having many tools. The important part is that each tool receives something clearer than a loose prompt written in a hurry. It gets context, goals, constraints, acceptance criteria, and a reasonable idea of when to stop.

That, for me, is the big difference between asking an agent for something and working with specs.

You can copy the shape without copying Harness: write down the problem, the constraints, the user flow, the non-goals, the acceptance criteria, and the weird edge cases you already know about. That alone removes a lot of the ambiguity that agents otherwise fill with confidence.
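If you want a small guardrail for that shape, a check like the following could run before handing a spec to an agent. It is only a sketch under my own assumptions: the section names mirror the list above, the "Name:" heading format is borrowed from the export example earlier in this post, and `missing_sections` is a hypothetical helper, not part of Harness.

```python
REQUIRED_SECTIONS = (
    "Problem",
    "Constraints",
    "User flow",
    "Non-goals",
    "Acceptance criteria",
    "Edge cases",
)


def missing_sections(spec_text: str) -> list[str]:
    """Return the required sections that are absent or have no content."""
    headings = {name.lower(): name for name in REQUIRED_SECTIONS}
    filled = set()
    current = None
    for raw_line in spec_text.splitlines():
        line = raw_line.strip()
        key = line.rstrip(":").lower()
        if line.endswith(":") and key in headings:
            # A line like "Constraints:" opens a section.
            current = headings[key]
        elif line and current is not None:
            # Any non-empty line under a heading counts as content.
            filled.add(current)
    return [name for name in REQUIRED_SECTIONS if name not in filled]
```

Run against the export spec from earlier, it would report "Problem", "User flow", "Non-goals", and "Edge cases" as missing, which is a fair reminder that even the better version of that spec was deliberately simplified.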

The afternoon nap

During the first nap of the afternoon, it is time to review the code generated by the agents.

I run a code review in each worktree. If it flags something, I check it manually and test it, and if there is something I do not like, I guide the agent to fix it. When everything is in order, I merge directly into main: no PRs or anything, full YOLO style. They are personal projects where I am the only one working, after all.

That does not mean there is no review. I am more and more convinced that review is the central part of the work. It just does not always look like opening a PR and waiting for checks. In these projects, review is me testing the flow, reading the diff, spotting weird assumptions, and deciding whether the result still matches the original intent.

Finally, when everything is integrated, I run /update-docs so the project documents, the README, and the Harness specs do not drift away from the code.

One of the fastest ways to lose control of a project with agents is to let documentation fall behind. If the next session starts with old context, the agent will work on a reality that no longer exists.

And after that? Well, if it has not happened already, there is probably a diaper change waiting for me, some play time, some drool and babbling... honestly, I am completely in love with this baby.

It does not feel like programming

This new way of approaching software engineering does not feel like programming. And that is not necessarily bad.

Code has become cheaper, faster, and more accessible, but ideas, taste, instinct, and high-level knowledge are, from my point of view, the most valuable things right now. It is very easy to generate a lot of mediocre code quickly if you are not clear about what you are trying to build.

Working this way has allowed me to move forward in a way I could not have before, bring back projects that were practically dead, and at the same time enjoy time with my little one. Of course, it does not always go well. I have had bad experiences because of vague specs, hidden context, and incorrect implementations that had to be rolled back.

But that is where I have learned the most.

Taking specs seriously has not been an aesthetic decision or a trend. It has been a way to move one level up, wear product and architecture hats, and build something solid even when I only have one nap ahead of me.