What I learned splitting a solo project into microservices (so you don't have to learn it the hard way)

I didn't plan to write "microservices" on a personal project. It started as a wellness app with streaks and reminders, the kind of thing you can absolutely ship as one Express server. Then I added push notifications, payments, and video streaming, and one server started feeling less like a clean idea and more like a pile of unrelated problems sharing the same crash radius.

So I split it. Not because some blog told me to "start with a monolith and migrate later" — I skipped that step on purpose, because the parts of the app genuinely didn't behave like the same kind of software.

Why one big server didn't sit right

Auth has to basically never go down. Notifications are bursty and don't care if they're a few seconds late. Payments need to be boring and predictable, with zero room for a race condition. Video delivery is bandwidth-heavy and latency-sensitive in a way none of the rest is.

Those are four different jobs wearing the same hoodie. I kept hitting bugs where a slow notification batch would make the API feel sluggish for someone just trying to log in, which is a dumb reason to lose a user.

How it's put together now

Everything goes through one gateway first — every request, whether it's someone opening their dashboard or Stripe pinging a webhook. That one decision saved me more headaches than anything else: one place for rate limiting, one place for token checks, and I can swap out a service behind it without the frontend ever knowing.
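To make the "one place for everything" idea concrete, here's a minimal sketch of a gateway as plain functions. It is not the project's actual code. The names (`createGateway`, `createRateLimiter`, `publicPrefixes`) and the request shape are made up for illustration, the rate limiter is a simple fixed-window counter, and `verifyToken` is a stand-in for a real auth check.

```typescript
// Hypothetical request/response shapes, kept tiny for illustration.
type GatewayRequest = { ip: string; path: string; token?: string };
type GatewayResponse = { status: number; body: string };
type Handler = (req: GatewayRequest) => GatewayResponse;

// Fixed-window rate limiter: at most `limit` requests per IP per window.
function createRateLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, { start: number; count: number }>();
  return (ip: string, now: number): boolean => {
    const w = windows.get(ip);
    if (!w || now - w.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= limit;
  };
}

function createGateway(opts: {
  allow: (ip: string, now: number) => boolean;
  verifyToken: (token: string) => boolean; // stand-in for a real auth check
  routes: Record<string, Handler>; // path prefix -> service behind the gateway
  publicPrefixes: string[]; // routes that skip the token check (e.g. webhooks)
}) {
  return (req: GatewayRequest, now: number): GatewayResponse => {
    // 1. Rate limiting happens first, for every request.
    if (!opts.allow(req.ip, now)) return { status: 429, body: "slow down" };

    // 2. Token check, except on routes that authenticate some other way.
    const isPublic = opts.publicPrefixes.some((p) => req.path.startsWith(p));
    if (!isPublic && !(req.token && opts.verifyToken(req.token))) {
      return { status: 401, body: "unauthenticated" };
    }

    // 3. Route by path prefix; the caller never sees which service answered.
    for (const [prefix, handler] of Object.entries(opts.routes)) {
      if (req.path.startsWith(prefix)) return handler(req);
    }
    return { status: 404, body: "no such service" };
  };
}
```

Swapping a service then means changing one entry in `routes`. The webhook prefix is listed as public here only because a webhook sender can't carry a user token; it would still need its own signature check downstream.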

Auth runs on Firebase. I went back and forth on this — rolling your own auth feels like a rite of passage, but honestly, unless auth is the actual product, it's not where a solo dev should be spending nights. Tradeoff is real lock-in; if I ever leave Firebase that's a proper migration, not a config flag. I decided that was a future-me problem worth having.

Notifications run through FCM, and this is where I learned the most, honestly more than I wanted to. Push delivery is not a guarantee. Tokens go stale, phones go offline, and if you treat sending a notification like a normal API call you will eventually fire off a batch job that just quietly fails for a chunk of users and you won't notice for days. I ended up rebuilding this as a queue with retries instead of a loop that fires and hopes.
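For a rough picture of "queue with retries" versus "loop that fires and hopes", here's a minimal in-memory sketch. It is an illustration under assumptions, not the real implementation: the names (`createRetryQueue`, `Outcome`, `deadLetters`) are hypothetical, the three outcome categories are a simplification of whatever errors the push provider actually returns, and a real version would persist jobs somewhere durable and do the actual send asynchronously.

```typescript
// Outcome of one delivery attempt. "invalid-token" means the device token is
// dead and retrying is pointless; "transient" covers timeouts and server errors.
type Outcome = "ok" | "invalid-token" | "transient";

type Job = { userId: string; token: string; attempts: number; runAt: number };

function createRetryQueue(maxAttempts = 5, baseDelayMs = 1000) {
  let jobs: Job[] = [];
  const staleTokens: string[] = []; // tokens to delete from the database
  const deadLetters: Job[] = []; // gave up; worth an alert, not silence

  return {
    staleTokens,
    deadLetters,
    pending: (): number => jobs.length,

    enqueue(userId: string, token: string, now: number): void {
      jobs.push({ userId, token, attempts: 0, runAt: now });
    },

    // Remove and return every job whose scheduled time has arrived.
    takeDue(now: number): Job[] {
      const due = jobs.filter((j) => j.runAt <= now);
      jobs = jobs.filter((j) => j.runAt > now);
      return due;
    },

    // Record what happened to a job and decide what to do with it next.
    record(job: Job, outcome: Outcome, now: number): void {
      if (outcome === "ok") return;
      if (outcome === "invalid-token") {
        staleTokens.push(job.token);
        return;
      }
      const attempts = job.attempts + 1;
      if (attempts >= maxAttempts) {
        deadLetters.push({ ...job, attempts });
        return;
      }
      // Exponential backoff: base, 2x base, 4x base, ... after each failure.
      const delay = baseDelayMs * 2 ** (attempts - 1);
      jobs.push({ ...job, attempts, runAt: now + delay });
    },
  };
}
```

A worker would call `takeDue(Date.now())` on a timer, attempt each send, and report the result through `record`. The point of `staleTokens` and `deadLetters` is that every failure ends up somewhere visible instead of disappearing.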

Payments sit on Stripe, deliberately walled off from everything else. I didn't want streak logic and billing logic anywhere near the same file — that's exactly the kind of setup where a harmless gamification bug turns into someone getting double charged. Webhooks land on their own route, get verified, then update state.

Video/audio content is served through a CDN provider rather than anything self-hosted. I briefly considered building my own streaming setup. I do not recommend this to anyone. It's a full-time job hiding inside what looks like a side project.

The stuff that actually broke

This is usually the part people skip, so here's what actually went wrong:

A Firebase token expiring mid-session used to just silently kill requests instead of prompting a clean re-auth — took embarrassingly long to figure out it wasn't a "random bug," it was just expiry timing.
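The fix for this class of bug is mostly about making expiry loud. Here's a minimal sketch of the idea, with assumptions flagged: `Claims` stands for token claims that have already been signature-verified elsewhere, `exp` is the standard JWT expiry claim in seconds, and the function names and error codes (`checkAuth`, `nextStep`, `token_expired`) are invented for illustration rather than taken from Firebase or from the project.

```typescript
// Claims as they might look after the token's signature has been verified.
// `exp` is the standard JWT expiry claim, in seconds since the epoch.
type Claims = { uid: string; exp: number };

type AuthResult =
  | { ok: true; uid: string }
  | { ok: false; status: 401; code: "token_missing" | "token_expired" };

// Server side: turn "expired" into its own explicit error code instead of
// letting the request die with a generic failure.
function checkAuth(claims: Claims | null, nowMs: number): AuthResult {
  if (!claims) return { ok: false, status: 401, code: "token_missing" };
  if (claims.exp * 1000 <= nowMs) {
    return { ok: false, status: 401, code: "token_expired" };
  }
  return { ok: true, uid: claims.uid };
}

// Client side: an expired token gets one silent refresh-and-retry;
// anything else, or a second failure, sends the user to the login screen.
function nextStep(
  code: string,
  alreadyRetried: boolean,
): "refresh-and-retry" | "show-login" {
  return code === "token_expired" && !alreadyRetried
    ? "refresh-and-retry"
    : "show-login";
}
```

With a distinct code for expiry, the "random bug" shows up in logs as what it is, and the client has something specific to react to.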

A reminder job that looped through users one by one instead of queuing them ended up tripping FCM's rate limits and taking the whole notification service down with it. Classic "works fine with 10 test users, falls over at 2000."
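The alternative to the one-by-one loop is to plan the work up front: chunk the recipients and space the chunks out. A minimal sketch, assuming nothing about FCM's real limits; `planBatches` is a hypothetical helper, and the batch size and gap used in the example afterwards are placeholder numbers that should come from the provider's documented quotas.

```typescript
type Batch = { sendAt: number; tokens: string[] };

// Split device tokens into fixed-size batches and give each batch its own
// send time, `gapMs` apart, so the whole job never fires at once.
function planBatches(
  tokens: string[],
  batchSize: number,
  gapMs: number,
  startAt: number,
): Batch[] {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const batches: Batch[] = [];
  for (let i = 0; i < tokens.length; i += batchSize) {
    batches.push({
      // i is always a multiple of batchSize, so this is the batch index.
      sendAt: startAt + (i / batchSize) * gapMs,
      tokens: tokens.slice(i, i + batchSize),
    });
  }
  return batches;
}
```

For example, `planBatches(tokens, 500, 1000, Date.now())` with 2000 tokens gives four batches, one second apart. Each batch then becomes a job on the queue rather than a blocking call inside a loop.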

Stripe retries a webhook when it doesn't get a successful response in time, and my first webhook handler wasn't idempotent, so a retried delivery processed the same event twice. It took a confusing afternoon of double-counted subscription states before I added an idempotency check at the top of that handler.
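The idempotency check itself is small. A minimal sketch, assuming the event has already passed signature verification and carries a unique `id` (Stripe events do); `createWebhookHandler` and `applyEvent` are hypothetical names, and the in-memory `Set` stands in for what should really be a database table with a unique constraint on the event ID.

```typescript
type WebhookEvent = { id: string; type: string };

function createWebhookHandler(
  applyEvent: (event: WebhookEvent) => void,
  seen: Set<string> = new Set(), // in production: a table with a unique key
) {
  return (event: WebhookEvent): "processed" | "duplicate" => {
    // A retried delivery carries the same event ID, so it can be skipped.
    // The route should still answer with a 2xx so the retries stop.
    if (seen.has(event.id)) return "duplicate";

    applyEvent(event);

    // Mark the event as done only after the update succeeded. If applyEvent
    // throws, the ID is not recorded and the next retry gets a fresh attempt.
    seen.add(event.id);
    return "processed";
  };
}
```

Recording the ID after the state update (rather than before) is a deliberate choice here: it trades a small chance of reprocessing after a crash for never losing an event. With a real database, doing both in one transaction removes that tradeoff.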

If I started over, I'd build the queue and the idempotency check on day one instead of bolting them on after they broke. Not because "best practices say so" but because I personally paid the debugging tax for skipping them.

Deployment had its own special pain. I started on DigitalOcean's App Platform and kept getting back-to-back failed builds with no useful detail: no clear signal whether the problem was Docker, the CI/CD pipeline, or something in my own YAML. Just generic failure logs, retry, fail again. After burning enough time on that black-box guessing, I moved to a plain Droplet, where I could SSH in, read the logs, and see what was actually breaking.

What I'd actually tell another solo dev

Microservices get a bad rap for solo projects, and most of the time that's fair: splitting things up because it looks impressive on a portfolio is a waste of your own time. But sometimes the parts of your app really are different shapes of problem, and forcing them into one process just means a bug in one of them can take down another that has nothing to do with it.

The skill isn't knowing how to draw the diagram. It's noticing which boundaries are actually load-bearing and which ones you're adding because it feels more "real engineer" to do so.

If you're building something similar or want to argue with any of this, I'm on X and the code's on GitHub.