"Three nines" sounds like the service is basically always up. It is not. 99.9% uptime gives you 43 minutes of downtime every month, and one bad deploy on a Tuesday can spend the whole budget before lunch.
Most people never do the arithmetic, so the percentages stay abstract. Here is every common availability target turned into actual time you are allowed to be down.
99.9% is 43 minutes a month. 99.99% is 4 minutes. Every extra nine costs you roughly 10x the effort.
The nines, in plain time
| Availability | Per year | Per month | Per week | Per day |
|---|---|---|---|---|
| 99% (two nines) | 3.65 days | 7h 12m | 1h 41m | 14m 24s |
| 99.5% | 1.83 days | 3h 36m | 50m 24s | 7m 12s |
| 99.9% (three nines) | 8h 46m | 43m 12s | 10m 5s | 1m 26s |
| 99.95% | 4h 23m | 21m 36s | 5m 2s | 43s |
| 99.99% (four nines) | 52m 34s | 4m 19s | 60s | 8.6s |
| 99.999% (five nines) | 5m 15s | 26s | 6s | 0.9s |

Values are rounded and assume a 365-day year and a 30-day month. The whole formula is (1 - availability) × length of the window. 99.9% means 0.1% of the window is allowed to be downtime, so over a 30-day month that is 0.001 × 43,200 minutes = 43.2 minutes.
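If you want to reproduce a row of the table yourself, the formula fits in a few lines of Python. This is only a sketch: `WINDOWS` and `downtime_budget` are names made up for this example, and the window lengths use the same 365-day year and 30-day month as the table.

```python
from datetime import timedelta

# Window lengths match the table's assumptions: 365-day year, 30-day month.
WINDOWS = {
    "year": timedelta(days=365),
    "month": timedelta(days=30),
    "week": timedelta(days=7),
    "day": timedelta(days=1),
}


def downtime_budget(availability_pct: float, window: timedelta) -> timedelta:
    """Allowed downtime: (1 - availability) * length of the window."""
    allowed_fraction = 1 - availability_pct / 100
    return timedelta(seconds=window.total_seconds() * allowed_fraction)


# Print the three-nines row of the table.
for name, window in WINDOWS.items():
    print(f"99.9% per {name}: {downtime_budget(99.9, window)}")
```

Swap `99.9` for your own target to get your own row.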
The actual times land harder than the percentages do.
99% looks respectable as a number and is genuinely bad as a service. More than three and a half days a year is a long weekend of your product being down. Nobody would tolerate that, yet "99%" reads fine in a slide.
The jump from three nines to four nines is brutal. You go from 43 minutes a month to just over 4 minutes a month. That last 0.09% costs you redundancy, failover, on-call rotations, and a lot of money. Each extra nine shrinks your room for error by 10x and multiplies the engineering effort by roughly as much.
Five nines is 26 seconds a month. At that point a single slow deploy or one bad health check blows your entire budget. Most teams that claim five nines are either measuring something narrow or not measuring honestly.
The trap: the number you advertise is not the number you measure
There are two different uptime numbers and they almost never match.
The first is the SLA you promise, a yearly figure, the one on the pricing page. The second is what you actually measured this month. They drift apart for reasons that have nothing to do with how reliable your service really was.
Window matters more than the percentage. A single 50-minute outage is a non-event against a yearly 99.9% budget of nearly 9 hours. That exact same outage blows straight through a monthly 99.9% budget of 43 minutes. Same outage, same service, two different verdicts, purely because of which window you measured. Vendors love quoting the yearly number because it hides the months that hurt.
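The two verdicts are easy to check. The sketch below takes the 50-minute outage and measures it against a 99.9% budget on both windows; `budget_minutes` and `budget_used` are hypothetical helper names, and the windows are again a 365-day year and a 30-day month.

```python
def budget_minutes(availability_pct: float, window_days: int) -> float:
    """Allowed downtime in minutes for a target over a window of whole days."""
    return (1 - availability_pct / 100) * window_days * 24 * 60


def budget_used(outage_minutes: float, availability_pct: float, window_days: int) -> float:
    """Fraction of the window's budget consumed by the outage (1.0 = all of it)."""
    return outage_minutes / budget_minutes(availability_pct, window_days)


OUTAGE_MINUTES = 50  # the single outage from the paragraph above

for label, days in [("yearly", 365), ("monthly", 30)]:
    used = budget_used(OUTAGE_MINUTES, 99.9, days)
    verdict = "within budget" if used <= 1 else "over budget"
    print(f"{label} 99.9%: {used:.0%} of budget used, {verdict}")
```

The outage uses under a tenth of the yearly budget and about 116% of the monthly one.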
What counts as "down" matters just as much. Was a single failed health check downtime, or noise? Did a 30-second blip in one region count against you? If your monitor records every flap as an outage, your measured uptime will look worse than the service actually was. If it averages everything into a smooth ratio, it will look better. Neither is the real number, and the gap between them is the difference between a monitor you trust and one you have learned to ignore.
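To see how much the definition of "down" moves the number, here is a rough sketch that scores the same day of checks under two rules. Everything in it is an assumption made for illustration: the one-minute check interval, the sample data, and the `confirm_after` threshold, which is not taken from any particular monitor.

```python
from itertools import groupby


def measured_uptime(checks: list[bool], confirm_after: int) -> float:
    """Uptime percentage when only runs of at least `confirm_after`
    consecutive failed checks count as downtime (hypothetical rule)."""
    if not checks:
        raise ValueError("need at least one check")
    down_checks = 0
    for is_up, run in groupby(checks):
        run_length = len(list(run))
        if not is_up and run_length >= confirm_after:
            down_checks += run_length
    return 100 * (1 - down_checks / len(checks))


# Made-up data: one day of one-minute checks with two isolated
# single-check flaps and one real five-minute outage.
checks = [True] * 1440
checks[100] = False
checks[700] = False
for minute in range(900, 905):
    checks[minute] = False

print(f"every flap counts:     {measured_uptime(checks, confirm_after=1):.3f}%")
print(f"3+ failures confirmed: {measured_uptime(checks, confirm_after=3):.3f}%")
```

Same checks, two different uptime figures. Whichever rule you choose, write it down, because it is part of what your number means.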
Treat your downtime allowance as a budget
Think of your monthly allowance as money you get to spend rather than a target to perfect.
99.9% monthly gives you 43 minutes. That is your error budget, and everything draws it down: planned maintenance, a failed deploy, a dependency that went slow for an afternoon. Frame it that way and the question stops being "how do we never go down" and becomes "what do we want to spend these 43 minutes on," which is a far more honest conversation and the one good SRE teams actually have.
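Treated as a ledger, the budget might look like the sketch below. The `Spend` entries and their durations are invented for illustration, and `remaining_budget` is a hypothetical helper, not part of any tool.

```python
from dataclasses import dataclass

# 99.9% over a 30-day month: about 43.2 minutes of allowed downtime.
MONTHLY_BUDGET_MIN = (1 - 99.9 / 100) * 30 * 24 * 60


@dataclass
class Spend:
    reason: str
    minutes: float


def remaining_budget(budget_min: float, spends: list[Spend]) -> float:
    """Minutes of error budget left after every recorded spend."""
    return budget_min - sum(spend.minutes for spend in spends)


# Made-up month: each entry draws the budget down.
spends = [
    Spend("planned maintenance", 15),
    Spend("failed deploy", 12),
    Spend("slow dependency", 10),
]

balance = MONTHLY_BUDGET_MIN
for spend in spends:
    balance -= spend.minutes
    print(f"{spend.reason:<20} -{spend.minutes:>4.0f} min, {balance:5.1f} min left")
```

With about six minutes left, the next risky deploy becomes a decision about spending rather than a matter of hope.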
It also kills the instinct to chase more nines than you need. A side project does not need four nines. An internal tool does not need four nines. Adding a nine you will never use is just paying for redundancy nobody asked for. Pick the target your users actually feel, then measure against it on the window they actually live in, which is usually the month.
So what is your number?
Find your real measured uptime for last month, not the aspirational figure from the pricing page. Then check it against the table. If you do not know it to the minute, that is the actual thing to fix first, because you cannot spend a budget you never counted.
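If you have a list of last month's incidents, a first pass at the number could look like this. It is a sketch under stated assumptions: incidents are (start, end) pairs that do not overlap, the dates and durations are made up, and `measured_availability` and `highest_target_met` are hypothetical names.

```python
from datetime import datetime

# The availability targets from the table.
TARGETS = [99.0, 99.5, 99.9, 99.95, 99.99, 99.999]


def measured_availability(incidents, window_start, window_end) -> float:
    """Availability percentage over the window, assuming the incidents
    do not overlap each other. Incidents are clipped to the window."""
    window_seconds = (window_end - window_start).total_seconds()
    down_seconds = 0.0
    for start, end in incidents:
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            down_seconds += (clipped_end - clipped_start).total_seconds()
    return 100 * (1 - down_seconds / window_seconds)


def highest_target_met(availability_pct: float):
    """Strictest target from the table that the measured figure satisfies."""
    met = [target for target in TARGETS if availability_pct >= target]
    return max(met) if met else None


# Made-up month: an 18-minute and a 31-minute incident in a 30-day window.
window_start = datetime(2024, 6, 1)
window_end = datetime(2024, 7, 1)
incidents = [
    (datetime(2024, 6, 4, 10, 0), datetime(2024, 6, 4, 10, 18)),
    (datetime(2024, 6, 19, 2, 30), datetime(2024, 6, 19, 3, 1)),
]

availability = measured_availability(incidents, window_start, window_end)
print(f"measured: {availability:.3f}%, highest target met: {highest_target_met(availability)}")
```

In this example 49 minutes of downtime misses the 43-minute allowance for three nines, so the month only clears the 99.5% row.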
I build uptimepage, an open-source, self-hostable uptime and status page monitor. It computes uptime from confirmed incidents over whatever window you pick, so the number you see is the one your users actually felt, not a ratio of green checks. The source is right there if you want to see how the math is done.