
Rate limiting sounds simple at first, until you try to make it work across multiple server instances without breaking consistency.

I wanted to build something beyond the usual “store a counter in memory and block after 5 requests” demo. So I built APIShield, a distributed rate limiting system that works across multiple backend instances while enforcing limits consistently using Redis, Lua scripting, and MongoDB-backed dynamic rules.

This project ended up being a lot more interesting than I expected. It pushed me into questions around atomic operations, shared state, rule prioritization, and how to make a system feel closer to a production service than a middleware experiment.

In this post, I’ll walk through the architecture, the rate limiting strategies I implemented, and why I ended up moving the sliding window logic into Redis using Lua.

GitHub Repository: APIShield

What is APIShield?

APIShield is a distributed rate limiting system that supports:

Fixed Window
Sliding Window
Token Bucket
Dynamic rule management
Violation tracking
Admin dashboard for rule configuration

The main goal was to build a rate limiter that could run across multiple backend instances while still applying limits consistently, even when requests hit different servers.

Architecture

The system looks like this:

Client
   │
   ▼
Backend Instance 1  ─┐
Backend Instance 2  ─┼──> Redis (Centralized Enforcement)
                     │
                     └──> MongoDB (Rule Storage)

Why this setup?

If each backend instance keeps its own counters in memory, the rate limit breaks the moment traffic is distributed across servers.

For example:

Backend 1 sees 3 requests
Backend 2 sees 3 requests
Limit is 5 requests

If both servers count independently, neither counter ever reaches the limit, so all 6 requests get through instead of 5. In the worst case, the client gets 5 requests per instance.

That’s why Redis acts as the centralized enforcement layer. Every backend instance checks and updates the same counters, so the limit stays consistent regardless of which instance receives the request.
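To make the difference concrete, here is a minimal in-memory sketch. `CounterStore` and `BackendInstance` are hypothetical stand-ins, not APIShield's actual classes; the store only imitates what a shared Redis counter would do.

```javascript
// Hypothetical stand-in for a counter store; in APIShield this role is played by Redis.
class CounterStore {
  constructor() {
    this.counts = new Map();
  }

  // Increment and return the new value, similar in spirit to Redis INCR.
  incr(key) {
    const next = (this.counts.get(key) || 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// Hypothetical backend instance that checks a limit against whatever store it is given.
class BackendInstance {
  constructor(store, limit) {
    this.store = store;
    this.limit = limit;
  }

  // Returns true if the request is allowed.
  handle(clientId) {
    return this.store.incr(clientId) <= this.limit;
  }
}

// Alternate requests across instances, like a round-robin load balancer.
function countAllowed(instances, clientId, totalRequests) {
  let allowed = 0;
  for (let i = 0; i < totalRequests; i++) {
    if (instances[i % instances.length].handle(clientId)) allowed++;
  }
  return allowed;
}

const LIMIT = 5;

// Each instance keeps its own private counters.
const isolated = [
  new BackendInstance(new CounterStore(), LIMIT),
  new BackendInstance(new CounterStore(), LIMIT),
];

// Both instances read and write the same counters.
const sharedStore = new CounterStore();
const centralized = [
  new BackendInstance(sharedStore, LIMIT),
  new BackendInstance(sharedStore, LIMIT),
];

const isolatedAllowed = countAllowed(isolated, 'client-a', 6);
const centralizedAllowed = countAllowed(centralized, 'client-a', 6);
console.log({ isolatedAllowed, centralizedAllowed });
// -> { isolatedAllowed: 6, centralizedAllowed: 5 }
```

With private counters each instance only ever sees 3 requests, so nothing is blocked. With one shared store the 6th request is rejected no matter which instance receives it.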

MongoDB stores dynamic rate limit rules, which means limits can be configured without redeploying the backend.

The algorithms I implemented

1. Fixed Window

This is the simplest strategy.

A counter is maintained for a fixed time window, such as:

100 requests per minute
1000 requests per hour

Once the window resets, the counter resets too.
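As a rough illustration, a fixed window counter can be sketched in a few lines. This version keeps counters in memory and the class name is made up for the example; in a distributed setup like APIShield the counter would live in Redis instead.

```javascript
// Minimal in-memory fixed window limiter (illustrative only).
class FixedWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.counters = new Map(); // key -> { windowStart, count }
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(key, now = Date.now()) {
    // Align the window to fixed boundaries (e.g. the start of each minute).
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;

    let entry = this.counters.get(key);
    if (!entry || entry.windowStart !== windowStart) {
      // A new window has started, so the counter resets.
      entry = { windowStart, count: 0 };
      this.counters.set(key, entry);
    }

    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Usage: 2 requests per minute.
const perMinute = new FixedWindowLimiter(2, 60_000);
console.log(perMinute.allow('user:1', 0));      // first request in the window
console.log(perMinute.allow('user:1', 1_000));  // second request in the window
console.log(perMinute.allow('user:1', 2_000));  // over the limit
console.log(perMinute.allow('user:1', 60_000)); // next window, counter reset
```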

Pros

Easy to implement
Low memory overhead
Fast

Cons

It has the classic boundary burst problem.

A client could send:

100 requests at 12:00:59
another 100 at 12:01:01

and get 200 requests through in about two seconds, twice the intended rate, without either window ever exceeding its limit.
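The burst is easy to reproduce with a small simulation. The helper below is a hypothetical sketch that replays a list of timestamps (in milliseconds relative to 12:00:00) through fixed one-minute windows and counts how many requests pass.

```javascript
// Replays timestamps through fixed windows and counts the allowed requests.
function countAllowedFixedWindow(timestamps, limit, windowMs) {
  const usedPerWindow = new Map(); // window index -> requests already allowed
  let allowed = 0;
  for (const t of timestamps) {
    const windowIndex = Math.floor(t / windowMs);
    const used = usedPerWindow.get(windowIndex) || 0;
    if (used < limit) {
      usedPerWindow.set(windowIndex, used + 1);
      allowed++;
    }
  }
  return allowed;
}

const MINUTE_MS = 60_000;
const burst = [
  ...Array(100).fill(59_000), // 100 requests at 12:00:59
  ...Array(100).fill(61_000), // another 100 at 12:01:01
];

const allowedInBurst = countAllowedFixedWindow(burst, 100, MINUTE_MS);
console.log(allowedInBurst); // -> 200
```

All 200 requests are accepted because the two batches fall on opposite sides of the window boundary, even though they are only two seconds apart.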

2. Sliding Window

To make enforcement more accurate, I implemented Sliding Window using Redis Sorted Sets.

Instead of counting requests in fixed buckets, the system stores request timestamps and continuously checks how many requests happened within the last N seconds.

At a high level, the logic is:

Remove timestamps older than the active window
Add the current request timestamp
Count the remaining timestamps
Block the request if the count exceeds the limit

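This gives much more accurate rate limiting than fixed windows because there is no boundary to burst across. The tradeoff is memory: one entry is stored per request in the window.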
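The four steps above can be sketched in memory before bringing Redis into it. This is an illustrative version where a plain array plays the role of the sorted set; the class name is hypothetical. Note that, following the order of the steps, a rejected request is still recorded and counts toward the window.

```javascript
// In-memory sliding window log (illustrative; a Redis sorted set replaces the array in practice).
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.logs = new Map(); // key -> array of request timestamps
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(key, now = Date.now()) {
    const cutoff = now - this.windowMs;

    // 1. Remove timestamps older than the active window.
    const log = (this.logs.get(key) || []).filter((t) => t > cutoff);

    // 2. Add the current request timestamp.
    log.push(now);
    this.logs.set(key, log);

    // 3 and 4. Count what remains and block if the count exceeds the limit.
    return log.length <= this.limit;
  }
}

// Usage: 2 requests per rolling second.
const rolling = new SlidingWindowLimiter(2, 1_000);
console.log(rolling.allow('ip:10.0.0.1', 0));
console.log(rolling.allow('ip:10.0.0.1', 100));
console.log(rolling.allow('ip:10.0.0.1', 200));   // third request inside the same second
console.log(rolling.allow('ip:10.0.0.1', 1_250)); // earlier entries have aged out
```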

The problem with a naive implementation

My first approach was the straightforward one: perform the sliding window logic using multiple Redis commands from Node.js.

That meant every request required a sequence like:

remove expired entries
add current request
count requests
set expiry

It works, but it costs one Redis round trip per command on every request. At small scale that’s acceptable. At higher traffic volumes it becomes wasteful, and more importantly, the whole sequence really needs to be treated as one unit.
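For reference, the multi-command version might look something like the sketch below. It assumes an ioredis-style client whose methods (`zremrangebyscore`, `zadd`, `zcard`, `pexpire`) return promises; the function name and the way the member is made unique are my own choices for the example, not necessarily what APIShield does.

```javascript
// Naive sliding window: four separate commands, four round trips.
// `client` is assumed to be an ioredis-style client (hypothetical wiring).
async function naiveSlidingWindow(client, key, limit, windowMs, now = Date.now()) {
  // Sorted set members must be unique, so add a random suffix to the timestamp.
  const member = `${now}-${Math.random().toString(36).slice(2)}`;

  await client.zremrangebyscore(key, '-inf', now - windowMs); // remove expired entries
  await client.zadd(key, now, member);                        // add current request
  const count = await client.zcard(key);                      // count requests
  await client.pexpire(key, windowMs);                        // set expiry so idle keys disappear

  return count <= limit;
}
```

Because other clients can run commands between any two of these awaits, two concurrent requests can interleave their steps, which is the gap the next section closes.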

That’s what led me to Lua.

Why I used Lua for Sliding Window

To reduce latency and make the sliding window logic atomic, I moved it into a Lua script executed directly inside Redis.

Instead of making multiple Redis calls from Node.js, the backend sends a single script to Redis, and Redis performs the entire rate-limiting workflow in one go.

That solved three problems at once:

1. Atomicity

There’s no gap between removing old requests, inserting the new one, and checking the count. That avoids race conditions when multiple requests arrive close together.

2. Fewer network round trips

The logic runs directly inside Redis, so the backend doesn’t need to orchestrate multiple calls for every request.

3. A more production-friendly implementation

It made the sliding window approach feel much less like a demo and much more like something I’d be comfortable using in a real distributed service.

This was probably my favorite part of the project because it changed the sliding window implementation from “technically correct” to “something that actually scales more gracefully.”
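A minimal sketch of the idea is shown below. The Lua body mirrors the four steps described earlier, and the wrapper assumes an ioredis-style `client.eval(script, numKeys, ...keysAndArgs)` call. The script, argument order, and function names here are illustrative assumptions, not a copy of APIShield's actual script.

```javascript
// Lua script run inside Redis: the whole check happens as one atomic unit.
const SLIDING_WINDOW_LUA = `
local key    = KEYS[1]
local now    = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit  = tonumber(ARGV[3])
local member = ARGV[4]

-- Remove expired entries, record this request, then count.
redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window)
redis.call('ZADD', key, now, member)
local count = redis.call('ZCARD', key)
redis.call('PEXPIRE', key, window)

if count > limit then
  return {0, count}
end
return {1, count}
`;

// `client` is assumed to be an ioredis-style client (hypothetical wiring).
async function slidingWindowAtomic(client, key, limit, windowMs, now = Date.now()) {
  // Sorted set members must be unique, so add a random suffix to the timestamp.
  const member = `${now}-${Math.random().toString(36).slice(2)}`;

  // One round trip: 1 key, followed by the script arguments.
  const [allowed, count] = await client.eval(
    SLIDING_WINDOW_LUA, 1, key, now, windowMs, limit, member
  );
  return { allowed: allowed === 1, count };
}
```

Redis runs a script to completion before serving other commands, so no other request can slip in between the cleanup, the insert, and the count. Passing `now` from the backend keeps the script deterministic, at the cost of relying on reasonably synchronized clocks across instances.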

3. Token Bucket

I also implemented Token Bucket to support smoother rate limiting behavior.

In this model:

a bucket has a maximum token capacity
tokens refill over time
every request consumes one token
if no tokens remain, the request is blocked

This is useful when you want to allow short bursts of traffic without removing limits entirely.

Compared to fixed windows, it feels much more natural for APIs where occasional bursts are acceptable but sustained abuse is not.
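Here is a small in-memory sketch of the model described above. The class and parameter names (`capacity`, `refillPerSecond`) are illustrative; a distributed version would keep the token count and last refill time in Redis rather than on the object.

```javascript
// In-memory token bucket (illustrative only).
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = now;
  }

  // Returns true if a token was available and consumed.
  tryConsume(now = Date.now()) {
    // Refill lazily based on elapsed time, capped at capacity.
    if (now > this.lastRefill) {
      const elapsedSeconds = (now - this.lastRefill) / 1000;
      this.tokens = Math.min(
        this.capacity,
        this.tokens + elapsedSeconds * this.refillPerSecond
      );
      this.lastRefill = now;
    }

    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Usage: burst of up to 3, refilling 1 token per second.
const bucket = new TokenBucket(3, 1, 0);
console.log(bucket.tryConsume(0), bucket.tryConsume(0), bucket.tryConsume(0)); // burst uses the full bucket
console.log(bucket.tryConsume(0));     // bucket is empty
console.log(bucket.tryConsume(1_000)); // one token has refilled
```

The capacity controls how large a burst can be, while the refill rate controls the sustained throughput.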

Dynamic Rule Engine

One of the things I didn’t want was a hardcoded rate limiter with a single global limit.

So I built a dynamic rule engine where rules are stored in MongoDB and managed through an admin dashboard.

Each rule can define things like:

target → ip or user
scope → global or endpoint
algorithm → fixed window / sliding window / token bucket
limit and window configuration
active / inactive state

That makes the system much more flexible because different limits can be applied to different scenarios instead of forcing one blanket rule across everything.

Rule Resolution Priority

When multiple rules could match the same request, I added a priority order:

Endpoint + User
Endpoint + IP
Global + User
Global + IP
Fallback Rule

This allows the system to support more realistic cases.

For example:

a premium user can have a higher limit on a specific endpoint
anonymous traffic can be limited by IP
everything else can still fall back to a global default rule
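The priority order can be expressed as a simple ordered lookup. The sketch below assumes rules shaped like `{ scope, target, endpoint, active, ... }` and a request shaped like `{ endpoint, userId, ip }`; those shapes, the `FALLBACK_RULE` values, and the function name are hypothetical.

```javascript
// Most specific combination first, matching the priority list above.
const PRIORITY = [
  { scope: 'endpoint', target: 'user' },
  { scope: 'endpoint', target: 'ip' },
  { scope: 'global', target: 'user' },
  { scope: 'global', target: 'ip' },
];

// Hypothetical safe default, used when nothing else matches.
const FALLBACK_RULE = {
  name: 'fallback',
  scope: 'global',
  target: 'ip',
  algorithm: 'fixed_window',
  limit: 60,
  windowSeconds: 60,
  active: true,
};

// Returns the first active rule that matches, walking the priority list in order.
function resolveRule(rules, request) {
  for (const { scope, target } of PRIORITY) {
    // A user-targeted rule cannot apply to an anonymous request.
    if (target === 'user' && !request.userId) continue;

    const match = rules.find(
      (rule) =>
        rule.active &&
        rule.scope === scope &&
        rule.target === target &&
        (scope === 'global' || rule.endpoint === request.endpoint)
    );
    if (match) return match;
  }
  return FALLBACK_RULE;
}

// Usage with two hypothetical rules.
const rules = [
  { name: 'login-by-ip', scope: 'endpoint', target: 'ip', endpoint: '/login', active: true },
  { name: 'global-by-user', scope: 'global', target: 'user', active: true },
];
console.log(resolveRule(rules, { endpoint: '/login', userId: 'u1', ip: '10.0.0.1' }).name);
console.log(resolveRule(rules, { endpoint: '/search', ip: '10.0.0.1' }).name);
```

In the first call the endpoint rule wins over the global user rule because it sits higher in the list. In the second, an anonymous request to an endpoint with no rule falls through to the fallback.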

Fallback Protection

I also didn’t want a missing rule to mean unlimited access.

So if no specific rule matches a request, APIShield applies a safe fallback limit. That acts as a default protection layer and prevents accidental gaps in enforcement.

Violation Tracking

Blocking requests is useful, but I also wanted visibility into who was repeatedly crossing limits.

So I added violation tracking in Redis for:

per-user violations
per-IP violations

This made it possible to surface analytics in the dashboard and inspect how the system was actually being used.
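A simple way to picture this is one counter per offender. The sketch below uses an in-memory `Map` in place of Redis, and the key naming scheme (`violations:user:<id>`, `violations:ip:<address>`) is an assumption for illustration rather than APIShield's actual layout.

```javascript
// In-memory stand-in for violation counters (Redis would hold these in practice).
class ViolationTracker {
  constructor() {
    this.counts = new Map(); // key -> number of blocked requests
  }

  // Record a blocked request against the user and/or IP that triggered it.
  record({ userId, ip }) {
    const keys = [];
    if (userId) keys.push(`violations:user:${userId}`);
    if (ip) keys.push(`violations:ip:${ip}`);
    for (const key of keys) {
      this.counts.set(key, (this.counts.get(key) || 0) + 1);
    }
    return keys;
  }

  // Top offenders for a given prefix, e.g. 'violations:ip:'.
  top(prefix, n = 5) {
    return [...this.counts.entries()]
      .filter(([key]) => key.startsWith(prefix))
      .sort((a, b) => b[1] - a[1])
      .slice(0, n)
      .map(([key, count]) => ({ id: key.slice(prefix.length), count }));
  }
}

// Usage: three blocked requests from two IPs.
const tracker = new ViolationTracker();
tracker.record({ userId: 'u1', ip: '10.0.0.1' });
tracker.record({ ip: '10.0.0.1' });
tracker.record({ ip: '10.0.0.2' });
console.log(tracker.top('violations:ip:'));
```

A dashboard can then read the top offenders per prefix without scanning request logs.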

Admin Dashboard

The project also includes an admin dashboard where rules can be:

created
updated
deleted
toggled on/off

and where rate limit violations can be monitored.

I liked this part because it turned the project from “a rate limiting middleware” into something that felt more like an internal platform.

Dockerized setup

The whole system runs with Docker Compose using separate services for:

redis
mongo
backend1
backend2
dashboard

This made it much easier to validate distributed behavior locally.

One of the most satisfying tests was applying a global limit of 5 requests, sending alternating requests to both backend instances, and watching the 6th request get blocked even though the requests were split across servers.

That was the point where the system actually felt distributed instead of just pretending to be.

What this project taught me

APIShield taught me a lot more than just how to implement rate limiting.

It forced me to think about:

how state should be shared across multiple backend instances
when in-memory logic stops being enough
how to reduce race conditions in distributed systems
when to move logic closer to the data store
how to design backend systems that are configurable instead of hardcoded

It also reminded me that some of the most interesting engineering problems aren’t always flashy user-facing features. Sometimes they’re the quiet infrastructure pieces that keep everything else stable.

Final thoughts

Rate limiting is one of those things users rarely notice when it works well, but systems definitely notice when it doesn’t.

Building APIShield gave me hands-on experience with distributed backend design, Redis scripting, and the tradeoffs behind different rate limiting strategies. It also made me appreciate how much engineering depth can hide behind a feature that, on the surface, sounds as simple as “block requests after a certain limit.”

If I continue iterating on this project, I’d love to explore:

richer analytics dashboards
per-plan throttling for SaaS-style use cases
Redis Cluster support
stronger observability and failure-mode handling
benchmarking different strategies under load

If you’ve built something similar, or would approach rate limiting differently, I’d love to hear your thoughts.

If you want to explore the implementation in more detail, the project is here:

GitHub: APIShield
