Every AI Coding Tool: A Solo Developer's 14-Month Odyssey (Part 1)
The Commit Log Doesn’t Lie
I recently ran git shortlog -sn on my platform repo and stared at the output for longer than I should have:
3667 Filipe Estacio
13 Cursor Agent
6 OpenCode Assistant
2 tessl-app[bot]
3,667 commits over 14 months. One developer. And a rotating cast of AI pair-programming partners, each arriving with ceremony and departing without a goodbye.
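For anyone who wants to turn that listing into a percentage, here is a minimal sketch. It assumes the four lines above are the complete output; `parseShortlog` and `shareOf` are illustrative names, not part of git or of the platform repo.

```typescript
// Output of `git shortlog -sn`, copied from the listing above.
const shortlogOutput = `
3667 Filipe Estacio
13 Cursor Agent
6 OpenCode Assistant
2 tessl-app[bot]
`;

interface AuthorCount {
  author: string;
  commits: number;
}

// Parse "<count> <author>" lines. Real output is padded and tab-separated,
// so the pattern accepts any run of whitespace around the number.
function parseShortlog(output: string): AuthorCount[] {
  const result: AuthorCount[] = [];
  const lines = output.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const match = /^\s*(\d+)\s+(.+?)\s*$/.exec(lines[i]);
    if (match) {
      result.push({ author: match[2], commits: parseInt(match[1], 10) });
    }
  }
  return result;
}

// Fraction of all commits that belong to one author (0 if there are none).
function shareOf(counts: AuthorCount[], author: string): number {
  let total = 0;
  let mine = 0;
  for (let i = 0; i < counts.length; i++) {
    total += counts[i].commits;
    if (counts[i].author === author) mine += counts[i].commits;
  }
  return total === 0 ? 0 : mine / total;
}

const rows = parseShortlog(shortlogOutput);
const humanShare = shareOf(rows, "Filipe Estacio");
console.log((humanShare * 100).toFixed(2) + "% of commits are human-authored");
// prints "99.43% of commits are human-authored"
```

The denominator is all 3,688 commits in the listing, so the agents and the bot together account for a little over half a percent.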
This is the story of trying every AI coding tool I could get my hands on while building a SaaS platform from scratch. Not a review. Not a comparison chart. Just an honest account of what happened when a solo developer treated AI tools like dating apps - always convinced the next one would be The One.
Part 1 covers the early era, before Claude Code changed everything. Part 2 will cover the Claude Code adoption and where things stand today.
The Before Times (March - July 2025)
The project started on March 11, 2025, with the most optimistic commit message in software engineering: “Initial commit from Create Next App.”
For the first five months, it was just me and a basic copilot autocomplete. The commit messages from this period tell their own story:
add cdk backend
added middy to functions
cleanup
progress in solving DNS resolver bug
Terse. Human. Occasionally desperate. No AI agents, no specification systems, no structured workflows. Just a developer, a monorepo, and a growing suspicion that there had to be a better way to move faster.
By August, the platform had grown into a proper CDK-backed multi-stack architecture with DynamoDB, API Gateway, WAF, and a Lambda factory pattern. I was building faster than I could maintain context. Every time I switched from infrastructure to frontend to API design, I’d lose 20 minutes just remembering where I left off.
I needed a pair programmer. A tireless one. One that wouldn’t judge my variable names.
Cursor AI (August 2025)
Cursor arrived like a revelation. On August 26, 2025, I committed what felt like a turning point:
feat(infrastructure): complete CDK multi-stack Phase 1 and add Cursor rules system
I created a .cursor directory with comprehensive rules for infrastructure development. The rules file was detailed, opinionated, and - in hindsight - adorably naive. I was essentially writing a personality profile for my new AI colleague: here’s how we name things, here’s how we structure CDK constructs, here’s our testing philosophy.
The branch names from this era are museum pieces. Cursor auto-generated them with the model name embedded:
cursor/refactor-github-actions-workflow-for-readability-claude-4.5-sonnet-thinking-a8df
cursor/separate-user-portal-unit-tests-to-vitest-project-claude-4.5-sonnet-thinking-6cad
cursor/fix-marketing-app-linting-issues-claude-4.5-sonnet-thinking-60ad
Imagine explaining those to a future code archaeologist. “Yes, the branch name includes the AI model that wrote the code, and yes, we thought that was useful at the time.”
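For the future code archaeologist, a minimal sketch of how those names could be taken apart. The layout `<tool>/<task>-<model>-<4 hex chars>` is inferred from the three examples above and is not a documented Cursor format; the pattern also assumes the model part starts with `claude-`. `AgentBranch` and `parseAgentBranch` are hypothetical names.

```typescript
interface AgentBranch {
  tool: string;   // prefix before the slash, e.g. "cursor"
  task: string;   // kebab-case description of the work
  model: string;  // model identifier embedded in the branch name
  suffix: string; // trailing four-character hex tag
}

// Assumed layout: <tool>/<task>-<model>-<4 hex chars>.
// Branches from other model families would need a different model prefix.
const AGENT_BRANCH = /^([a-z]+)\/(.+)-(claude-.+)-([0-9a-f]{4})$/;

// Returns the parsed parts, or null for branches that do not fit the layout.
function parseAgentBranch(branch: string): AgentBranch | null {
  const m = AGENT_BRANCH.exec(branch);
  if (!m) return null;
  return { tool: m[1], task: m[2], model: m[3], suffix: m[4] };
}

const exampleBranch = parseAgentBranch(
  "cursor/fix-marketing-app-linting-issues-claude-4.5-sonnet-thinking-60ad"
);
console.log(exampleBranch);
```

Ordinary branches such as `main` return `null`, so the helper can be mapped over a full branch list to pick out only the agent-generated ones.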
On November 29, Cursor Agent went on a spree - 13 commits in a single day, all refactoring CI workflows and test configurations. The commit messages were fluent and descriptive, a stark contrast to my own terse style:
feat: Implement PR checks workflow and add testing
Refactor: Modularize NX affected pipelines workflow
fix: Correct GitHub Actions conditional expressions in PR checks workflow
docs: Add root cause analysis for PR checks workflow fix
fix: Add extensive debugging for NX JSON parsing issues
“Add extensive debugging for NX JSON parsing issues.” The agent was doing what I do at 2 AM, except it documented the process. That was genuinely useful. But it was also my first taste of a pattern I’d see repeatedly: the AI could produce a lot of work very quickly, but the review burden shifted entirely to me. Thirteen commits touching CI workflows meant thirteen diffs I needed to verify before merging.
What Cursor taught me: AI agents can absolutely do substantial refactoring work. But an agent that makes 13 commits in a day needs a human who can review 13 commits in a day. The bottleneck wasn’t code generation - it was code comprehension.
Kiro (September 2025)
While Cursor was handling the broad strokes, I discovered Kiro - AWS’s AI coding agent. I was building Herald, an internal changelog system backed by CodeBuild and CodeStar connections, and Kiro seemed like a natural fit for AWS-heavy work.
The evidence in the git log is slim - a single commit on September 3:
update kiro tasklist
That’s it. One commit. Kiro used a task-list approach to development: you’d define what you wanted, and it would work through the list methodically. The idea was sound, but in practice, it felt like managing a junior developer’s todo list. I was spending more time writing task descriptions than I would have spent writing the code.
Herald itself got built - the pipeline, the GitHub integration, the content schemas. But by mid-September, there’s no more trace of Kiro in the commit history. Herald was eventually removed entirely in November (“Remove Herald/Changelog and clean up GitHub connection from staging account”). So both the tool and the project it helped build are gone.
What Kiro taught me: Task-list driven development only works when the tasks are well-defined. For exploratory work - building something you’ve never built before - the overhead of specifying tasks precisely enough for an AI to execute is often higher than just doing it yourself.
Warp AI and the Birth of AGENTS.md (September - October 2025)
Around the same time, I started using Warp terminal’s AI features. On September 8:
feat: add agent configuration and enhance email DNS setup
This was the creation of WARP.md - a configuration file that told Warp’s AI agent about the project structure, conventions, and constraints. It was my first attempt at writing a “how to work on this codebase” document aimed at an AI reader rather than a human one.
But then something more interesting happened. On September 20:
refactor: replace direct communication with AI-mediated question system
This was a real product feature, not a developer workflow thing. Our platform is a marketplace, and we needed a way to mediate communication between parties. Instead of building a traditional messaging system, we built an AI-mediated question flow where requests between parties went through an AI layer that could structure, validate, and route them intelligently.
Looking back, this was one of the first times the AI tooling influenced the product itself. Building with AI every day made AI-mediated features feel natural rather than exotic.
On October 19, the WARP.md file got a significant upgrade:
refactor(project): Standardize agent workflow and rename WARP.md to AGENTS.md
This was a pivotal moment, even if I didn’t recognise it at the time. By renaming from WARP.md (tool-specific) to AGENTS.md (tool-agnostic), I was acknowledging something I hadn’t consciously decided yet: I was going to keep switching tools. The configuration shouldn’t be married to any single one.
What Warp taught me: The real value wasn’t the AI features - it was the act of writing down “here’s how this codebase works” in a machine-readable way. AGENTS.md would survive every tool change that followed. The tools were temporary; the documentation was permanent.
OpenSpec (October - November 2025)
By mid-October, I’d grown frustrated with the ad-hoc nature of AI-assisted development. Tasks would start without clear specifications, meander through half-formed ideas, and produce code that solved yesterday’s version of the problem.
Enter OpenSpec - a task-driven specification system. On October 17:
move to openspec
OpenSpec introduced a formal structure: you’d write a specification first (SPEC-1-1, SPEC-2-3, etc.), then implement against it. The commit messages from this period show the system in action:
feat: complete SPEC-1-1 Backend API Development for global user management
task: create openspec to fix react query periodic requests
There’s something satisfying about this approach. Write the spec. Implement the spec. Check the spec items off. It imposed discipline on a process that had been chaotic.
But discipline has a cost. In early November, I can see the overhead creeping in:
fix: improve token management and refresh error handling
- Update openspec tasks to mark Phase 3 completion
The fix itself was a few lines of token validation logic. But the commit message is half-implementation, half-bookkeeping. When you’re a solo developer, every minute spent updating task trackers is a minute not spent shipping features. By November 13, the verdict was in:
refactor: remove deprecated OpenSpec and old agent support
Four weeks. Installed on October 17, removed on November 13.
What OpenSpec taught me: Specifications are valuable. Specification systems that require you to maintain parallel tracking infrastructure are not - at least not when you’re a team of one. The specification discipline stayed with me; the tooling didn’t.
OpenCode (November 2025)
Fresh off removing OpenSpec, I immediately installed something new. On November 19:
feat: integrate Backlog.md with OpenCode configuration
OpenCode was a terminal-based AI coding tool that worked with a backlog-driven workflow. It had its own email identity (opencode@...), its own configuration directory, and its own way of doing things.
The six commits from OpenCode Assistant tell a focused story - all from November 9-10, all about enum standardisation:
fix: standardize infrastructure enums to lowercase format
fix: restore vitest-dynamodb-lite-config.ts and update test mocks
feat: complete Tasks 2.3-2.4 - standardize enum usage in Lambda handlers
feat: standardize account and user status enums across frontend components
feat: complete enum standardization for filters and Storybook stories
feat: complete comprehensive enum standardization across monorepo
Six commits across two days, systematically standardising enums across the entire monorepo. This was OpenCode’s specialty: mechanical, codebase-wide refactoring with a clear specification. Give it a well-defined task and it would methodically work through every file.
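To make "standardising enums to lowercase" concrete, here is a minimal sketch of the kind of change involved. The status values and the names `ACCOUNT_STATUSES` and `normalizeAccountStatus` are hypothetical; the real enums and the actual refactoring are not shown in the commit log.

```typescript
// Hypothetical status values; the real ones are not listed in the post.
const ACCOUNT_STATUSES = ["active", "suspended", "pending"] as const;
type AccountStatus = (typeof ACCOUNT_STATUSES)[number];

// Map legacy spellings such as "ACTIVE" or " Suspended " onto the lowercase
// canonical value, and reject anything that is not a known status.
function normalizeAccountStatus(raw: string): AccountStatus {
  const candidate = raw.trim().toLowerCase();
  for (let i = 0; i < ACCOUNT_STATUSES.length; i++) {
    if (ACCOUNT_STATUSES[i] === candidate) return ACCOUNT_STATUSES[i];
  }
  throw new Error("Unknown account status: " + raw);
}

console.log(normalizeAccountStatus("ACTIVE")); // prints "active"
```

Deriving the type from one `as const` array gives infrastructure code, Lambda handlers, and frontend components a single source of truth. That single shared definition is what makes a refactor like this easy to specify and tedious to carry out by hand.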
But by January 31, 2026, the traces disappear:
Remove .opencode and backlog folders (moved to workspace repo)
Add .opencode and backlog to .gitignore (moved to workspace)
Moved, then ignored. The pattern was becoming familiar.
What OpenCode taught me: Terminal-based AI tools have a lower context-switching cost than IDE-based ones. I didn’t have to leave my workflow to use it. That insight would prove important later.
The BMAD Method (November 2025 - January 2026)
And now we arrive at the most dramatic example of the install-remove-install-remove cycle. The BMAD Method - a structured approach to sprint planning with Epics, Stories, and a formal workflow system.
The timeline speaks for itself:
| Date | Event |
|---|---|
| Nov 27, 2025 | install bmad method |
| Nov 27, 2025 | Add backlog tasks for Epic 1-3 and update BMAD/OpenCode configs |
| Dec 7, 2025 | cleanup: remove BMAD workflow system and update test configurations |
| Jan 6, 2026 | Merge pull request from bmad-install |
| Jan 21, 2026 | remove BMAD Method |
Installed November 27. Removed December 7. Ten days.
Reinstalled January 6. Removed January 21. Fifteen days.
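The day counts are easy to check. A minimal sketch, using only the dates from the table above; `Stint`, `toUtcMs`, and `daysBetween` are illustrative names.

```typescript
interface Stint {
  label: string;
  installed: string; // ISO date, YYYY-MM-DD
  removed: string;   // ISO date, YYYY-MM-DD
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Split YYYY-MM-DD by hand and build a UTC timestamp, so local time zones
// and daylight-saving shifts cannot change the day count.
function toUtcMs(isoDate: string): number {
  const parts = isoDate.split("-");
  return Date.UTC(Number(parts[0]), Number(parts[1]) - 1, Number(parts[2]));
}

// Whole days from start to end (negative if end is earlier than start).
function daysBetween(start: string, end: string): number {
  return Math.round((toUtcMs(end) - toUtcMs(start)) / MS_PER_DAY);
}

// Dates taken from the timeline table above.
const bmadStints: Stint[] = [
  { label: "BMAD, first install", installed: "2025-11-27", removed: "2025-12-07" },
  { label: "BMAD, second install", installed: "2026-01-06", removed: "2026-01-21" },
];

for (let i = 0; i < bmadStints.length; i++) {
  const stint = bmadStints[i];
  console.log(stint.label + ": " + daysBetween(stint.installed, stint.removed) + " days");
}
// prints "BMAD, first install: 10 days"
// prints "BMAD, second install: 15 days"
```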
I can actually trace what happened during the second BMAD era. The commit messages from January are full of epic/story references:
feat(epic-5.1): implement impersonation token + context middleware
feat(epic-5.2): implement Act-As UI and integrate backend impersonation
chore(epic-5): mark 5-1 done and set 5-2 in-progress in sprint-status.yaml
docs: complete Epic 4 retrospective and plan Epic 5
There’s a sprint-status.yaml being maintained. There are retrospectives. For a solo developer. I was running sprints, retrospectives, and formal epic planning for a team of one.
On January 21, the second removal happened, and this time it stuck. No third installation. The BMAD Method taught its final lesson.
What BMAD taught me: Solo developers don’t need sprint ceremonies. The overhead of maintaining epic hierarchies, sprint statuses, and retrospective documents is justified when you need to coordinate across a team. When the team is just you, all that coordination overhead is pure waste. I didn’t need a sprint retrospective to know what went wrong - I was there for all of it.
The Pattern
Looking at this timeline from the other side, a clear pattern emerges:
- Encounter new tool (excitement, setup, configuration)
- Productive honeymoon (the tool handles its sweet spot well)
- Overhead creeps in (maintenance burden exceeds the productivity gain)
- Remove tool (usually within weeks, sometimes within days)
- Keep the lesson (the useful concept survives the tool)
Each tool left behind something valuable:
- Cursor proved that AI agents could do real refactoring work
- Kiro showed that task-list approaches need well-defined tasks
- Warp produced AGENTS.md, which would outlive every tool that followed
- OpenSpec taught me to specify before implementing
- OpenCode demonstrated the value of terminal-native AI workflows
- BMAD confirmed that process overhead must scale with team size
But the git log also reveals something less comfortable: I spent a non-trivial amount of my solo development time installing, configuring, learning, and removing tools. Every .cursor/rules file, every WARP.md, every openspec task definition, every sprint-status.yaml was time not spent shipping features.
The question I kept circling back to was this: is there a tool that provides the productivity gain without the configuration overhead? One that reads AGENTS.md and just… works?
In Part 2, I’ll cover what happened when Claude Code entered the picture - and why, for the first time, the install-remove cycle stopped.
Every AI Coding Tool I Tried (And Why I Kept Ditching Them): Part 1
The Numbers Don’t Lie
After 14 months of building our platform alone, I ran a quick command to count who had actually written the code. The result: 3,667 commits from me, 13 from a tool called Cursor, and 6 from something called OpenCode.
So yes, mostly me. But those smaller numbers hide a much messier story: installing, configuring, falling in love with, and then quietly deleting roughly a tool a month between August 2025 and January 2026.
This is part 1 of that story, covering the early era before I finally found something that stuck.
The First Five Months: Just Me
The project kicked off in March 2025. For the first five months, it was just me and basic autocomplete — the kind that finishes your sentences but doesn’t really understand what you’re building.
By August, I was drowning. The platform had grown into a sprawling system with cloud infrastructure, databases, APIs, and five different frontend portals. Every time I switched from working on the infrastructure to the frontend to the API, I’d lose 20 minutes just remembering where I’d left off.
I needed a tireless pair programmer who wouldn’t judge my variable names or my commit messages.
Cursor (August 2025)
Cursor was the first proper AI coding assistant I used seriously. It lives inside a code editor — you work with it like a colleague sitting next to you, reading the same files, suggesting changes, taking on tasks.
I set it up with a detailed rulebook: here’s how we name things, here’s how we structure our cloud infrastructure, here’s our testing philosophy. It was, in hindsight, adorably naive — like writing a personality profile for a new hire before they’ve started.
The branch names Cursor generated tell their own story. It would automatically create branches with names like:
cursor/fix-marketing-app-linting-issues-claude-4.5-sonnet-thinking-60ad
Imagine explaining that to someone reviewing your git history in three years.
But it could do real work. One day in November, Cursor’s agent went on a spree and produced 13 commits in a single day, all cleaning up our build pipeline configuration. The commit messages were polished, thorough, and occasionally included lines like “Add extensive debugging for NX JSON parsing issues” — which is exactly the kind of thing you write at 2am when something refuses to work.
The catch: 13 commits meant 13 things for me to review and understand. The AI had no problem generating the work. The bottleneck was me keeping up with it.
Lesson: AI agents can do substantial work quickly. But someone still has to understand and verify everything they produce.
Kiro (September 2025)
Around the same time, I tried Kiro, an AI tool built by Amazon for AWS-heavy projects. I was building an internal changelog system at the time, so an AWS-native tool seemed like a natural fit.
Kiro’s approach was task-list driven: write down exactly what you want done, and it works through the list methodically.
It left exactly one commit in the history: update kiro tasklist.
That’s it. The problem was that writing tasks precisely enough for an AI to execute them correctly took almost as long as just writing the code. For exploratory work — building something new where you don’t quite know what you want yet — that overhead kills you.
The changelog system got built. Kiro didn’t stick. Neither did the changelog system, actually — we removed it entirely a couple of months later.
Lesson: Task-driven AI tools work well when the tasks are well-defined. When you’re figuring things out as you go, the specification overhead isn’t worth it.
Warp and the Document That Outlived Everything (September 2025)
I also started using Warp, a terminal that has AI features built into it. Nothing dramatic happened, but something quietly important did.
To help Warp’s AI understand the project, I wrote a document called WARP.md — a “how this codebase works” guide aimed at an AI reader rather than a human one. It described the project structure, our conventions, our constraints.
Then in October, I renamed it to AGENTS.md.
That rename sounds trivial but it wasn’t. I was acknowledging, without quite having made the decision consciously, that I was going to keep switching tools. The documentation shouldn’t be married to any single one. Whatever came next, the document would survive.
Lesson: The most valuable thing Warp gave me wasn’t its AI features. It was the habit of writing down “here’s how this codebase works” in a structured way. That document is still in use today.
OpenSpec (October — November 2025)
By mid-October I was frustrated with how unstructured AI-assisted development felt. Tasks would start, meander through half-formed ideas, and produce code that solved a slightly different problem than the one I’d actually described.
Enter OpenSpec: a formal system where you write a specification first, then implement against it. The commits from this period have a satisfying rhythm: complete SPEC-1-1 Backend API Development, task: create openspec to fix react query periodic requests. Write the spec, check the spec items off.
But discipline has a cost. I found myself writing commit messages that were half implementation, half bookkeeping — updating task trackers to reflect what I’d just done. For a solo developer, every minute spent maintaining a tracking system is a minute not spent building.
About four weeks after installing it, I removed it.
Lesson: The habit of specifying before implementing is valuable and I kept it. The system for tracking specifications was overhead I couldn’t justify alone.
OpenCode (November 2025)
Fresh off removing OpenSpec, I immediately installed something new. (I see a pattern forming.)
OpenCode was a terminal-based AI tool. Its superpower was mechanical, codebase-wide refactoring: give it a well-defined task and it would work through every relevant file systematically. Its six commits to the project are all variations on one theme — standardising how a particular type of data was represented across the entire codebase. Six commits over two days, methodically fixing the same type of inconsistency in dozens of files.
It was genuinely good at this. The kind of tedious find-and-fix work that’s easy to describe and exhausting to do by hand.
It disappeared from the project by January 2026.
Lesson: Terminal-based AI tools have a lower cost of context-switching than editor-based ones. You don’t have to stop what you’re doing to use them. That insight proved important later.
The BMAD Method (November 2025 — January 2026)
And now we arrive at the most dramatic example of install-remove-reinstall-remove. The BMAD Method was a structured approach to organising development work: Epics, Stories, sprint planning, retrospectives. The full project management ceremony.
The timeline:
| Date | Event |
|---|---|
| Nov 27 | Installed BMAD |
| Dec 7 | Removed BMAD |
| Jan 6 | Reinstalled BMAD |
| Jan 21 | Removed BMAD again |
Ten days the first time. Fifteen the second. Never again.
During the second stint, I was maintaining a sprint-status.yaml file and writing formal retrospectives. For a team of one. I was running project ceremonies designed to coordinate multiple engineers — with no other engineers to coordinate.
Lesson: Process overhead must scale with team size. When you’re a team of one, you don’t need sprint retrospectives. You were there. You know what happened.
The Pattern
Looking back at this stretch from March to January, the same thing happened with every tool:
- Discover new tool, feel excited, set it up
- Productive honeymoon where it handles its obvious use case well
- Overhead creeps in — maintaining the tool takes more time than it saves
- Remove the tool
- Keep whatever habit or insight it taught me
Each tool left something useful behind even after it was gone. But the git log also reveals something uncomfortable: I spent a significant portion of my solo development time installing, learning, and removing tools rather than building the actual product.
The question I kept coming back to: is there a tool that delivers the productivity gain without all the configuration overhead? One that reads AGENTS.md and just works?
In Part 2, I’ll cover what happened when I finally found one — and why, for the first time, the cycle stopped.