Synchronous billing is easy, and that's the problem — it makes you think all billing is easy.
When a request does its work inline, the billable number is in the response by the time you send it. The gateway meters from there — the meter write, retries and all, is its problem, not yours. From your side, synchronous billing is one number in the response.
Asynchronous work breaks that. The request submits a job; the work happens later, in a worker; the result comes back through a poll or a callback. And the thing you bill for — characters processed, pages converted — isn't known when the request arrives. It's known when the job finishes.
So you can't meter at the edge. The meter has to fire from the completion path. And the real difficulty is firing it exactly once per unit of completed work — because requests, polls, and retries all conspire to make that zero times or many times.
This is platform-agnostic. Every submit-process-poll API has it. I'll use the system I run as the example, but the shape is the same anywhere.
Three ways metering goes wrong
On arrival. Carry the synchronous habit over and you meter when the job is submitted. But you don't know the size yet, so you're forced into a crude flat fee — or you bill for work that hasn't happened and might fail. Wrong unit, wrong time.
On retrieval. The subtle one. You wire the meter to fire when the client fetches the result. Now a client who submits a job, lets it run — costing you real money downstream — and never bothers to poll is never billed. You did the work for free. "Completion" is not "the client picked up the result." It's the worker finishing.
Without a fixed quantity. Input characters or output characters? Pages before OCR or after? If you haven't decided exactly what you measure and where, invoices drift and customers argue. Decide once; measure there.
All three point the same way: meter on measured work-completion, with a fixed definition of the unit. Not on arrival. Not on retrieval.
The mechanism: a durable outbox
In synchronous billing the gateway took the numbers off the response and metered them for you. Async takes that away: the numbers exist only in the worker, after the request has returned. So completion itself has to become a durable event.
The completion path writes a metering task — the job's measured quantities — into a durable outbox: a table that is the source of truth for what still needs sending. Something drains it, sends each task to the meter, records the outcome; a failed send stays in the table and is retried until it lands. (In my system a once-a-minute batch does the draining. The interval doesn't matter; the durability does.)
It has a name — the transactional outbox pattern — though it's the sort of thing you'd build without the name. And it is, exactly, the one rule the rest of the system already runs on: when the job finishes, report it reliably — retry as much as possible, return the result. Metering is just one more result that has to be reported reliably. I didn't build a billing system. I pointed the engine's own discipline at billing.
Exactly once = at-least-once × at-most-once
The outbox gives me at-least-once. A meter event is never silently dropped, because a failed send leaves the task in place to retry.
But at-least-once, on its own, double-charges. The classic failure: the send succeeds, the acknowledgement is lost on the way back, the task looks failed, the next run resends — and now it is counted twice.
So at-least-once needs a partner: an idempotent sink. Send the same meter ID twice, it counts once. That is at-most-once.
exactly-once = outbox (at-least-once) × idempotent sink (at-most-once)
Neither half is enough alone. I learned the second one the hard way — the same outbox-and-retry code, pointed at two different metering backends. One deduplicated on the ID and the numbers stayed clean. The other didn't, and the retries double-charged. Same mechanism, different sink, different bill.
So the thing worth writing down isn't "this platform guarantees idempotency." Platforms change. The durable statement is: this pattern requires an idempotent sink. If yours doesn't deduplicate, your retries are a liability, not a safety net.
Bill on success, and survive retries
Two more places it bites.
Success, not completion. Fire the meter on successful completion — not on "the job reached a terminal state." A failed job must not emit a billable event. Wire it to the wrong terminal state and you charge people for errors, then spend your week on refunds.
Partial failure. What you bill on a half-finished job depends on whether half a result is worth anything. A text extraction fans out into many independent calls; if nine of ten succeed and one fails for good, you bill the nine — the successful work has standalone value. Document conversion is the opposite: a file that converts eight of ten pages and then dies isn't eighty percent of a document, it's a corrupted one. No charge, nothing returned. Bill at the granularity where partial output has value.
Retries. The engine retries aggressively — that is the point of it. Meter per attempt and every retry inflates the bill. So the meter is per successful job, fired once — which is exactly what the outbox and the idempotent sink already guarantee. It is not extra work; it falls out of the same design.
It all reduces to one sentence: the billable event is one successfully-completed unit, counted once.
The shape
In synchronous billing the meter is a property of a request arriving. In asynchronous billing it is a property of work finishing — and the discipline is firing it exactly once per successful unit.
It is worth separating what is hard from what is free. The completion wiring — the outbox, the retries — is yours to build. The at-most-once half is the sink's job, if you chose a sink that does it. Get both, and a client polling ten times, a worker retrying five, and a job that half-failed all resolve to the right number of credits.
That is the whole thing. It isn't much once it's drawn — but every line of it is a place I have watched a bill come out wrong.
Built with Claude (Opus).













