Your Kafka consumer is processing 800 events/sec.

The producer just hit 5,000 events/sec and it’s not slowing down.

Lag chart: 12 minutes behind and climbing. Consumer memory: 89% and rising. The on-call alert just fired. You have ~4 minutes before the JVM starts GC-thrashing and the pod gets OOM-killed.

Here’s the setup:

Producer → Kafka topic (5K events/sec, growing)

Consumer → spring-kafka @KafkaListener, batch=500, processes ~800 events/sec

Downstream → Postgres write + external HTTP call (the real bottleneck)

SLA → events must be processed, not silently dropped
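To put numbers on the gap, here is a back-of-the-envelope sketch using only the rates from the setup. The class and method names are made up for illustration, and the "consumers for parity" figure assumes throughput scales linearly per instance, which the shared Postgres and HTTP bottleneck may not allow.

```java
// Hypothetical helper: rough arithmetic for the scenario, not production code.
final class LagMath {
    // Net backlog growth in events/sec: what arrives minus what gets processed.
    static long lagGrowthPerSec(long produceRate, long consumeRate) {
        return produceRate - consumeRate;
    }

    // Instances needed just to match the producer.
    // ASSUMPTION: each instance sustains the same rate and scaling is linear.
    static long consumersForParity(long produceRate, long perConsumerRate) {
        return (produceRate + perConsumerRate - 1) / perConsumerRate; // ceiling division
    }

    public static void main(String[] args) {
        long growth = lagGrowthPerSec(5_000, 800);
        System.out.println("Backlog grows by " + growth + " events/sec");        // 4200
        System.out.println("Added over 4 minutes: " + growth * 240 + " events"); // 1008000
        System.out.println("Consumers for parity: " + consumersForParity(5_000, 800)); // 7
    }
}
```

At these rates the backlog grows by 4,200 events every second, so about a million more events pile up in the 4 minutes before the pod dies. Parity alone would need roughly 7 consumers at 800 events/sec each, and that still would not drain the existing lag.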

The consumer can’t keep up. The producer doesn’t know it. What do you do?

A) Drop events on the floor — fail fast, return early, let the lag burn down. The system stays alive.

B) Block the producer — make the consumer signal “slow down,” apply backpressure upstream until it catches up.

C) Buffer harder — bigger in-memory queue, larger batch size, scale the consumer to absorb the spike.

D) Rate-limit + load-shed — cap consumption rate, route the overflow to a DLQ or secondary topic for later replay.
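For readers who have not seen option D in code, here is a minimal sketch of the mechanism only. It is not a hint about the answer. It uses a plain token bucket; `LoadShedSketch`, its fields, and the 800/sec cap are assumptions chosen to match the numbers above, and in a real spring-kafka consumer the overflow list would presumably be published to a secondary topic rather than held in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: token-bucket rate limit with overflow diverted, never discarded.
final class LoadShedSketch {
    private final double ratePerSec; // tokens added per second
    private final double capacity;   // maximum tokens the bucket can hold
    private double tokens;
    private long lastNanos;

    LoadShedSketch(double ratePerSec, double capacity, long startNanos) {
        this.ratePerSec = ratePerSec;
        this.capacity = capacity;
        this.tokens = capacity; // start with a full bucket
        this.lastNanos = startNanos;
    }

    // Refill based on elapsed time, then take one token if available.
    boolean tryAcquire(long nowNanos) {
        double elapsedSec = (nowNanos - lastNanos) / 1e9;
        tokens = Math.min(capacity, tokens + elapsedSec * ratePerSec);
        lastNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    // Events that get a token go to toProcess; the rest go to overflow for later replay.
    void route(List<String> events, long nowNanos,
               List<String> toProcess, List<String> overflow) {
        for (String event : events) {
            (tryAcquire(nowNanos) ? toProcess : overflow).add(event);
        }
    }

    public static void main(String[] args) {
        LoadShedSketch limiter = new LoadShedSketch(800, 800, 0L);
        List<String> burst = new ArrayList<>();
        for (int i = 0; i < 5_000; i++) {
            burst.add("event-" + i);
        }
        List<String> toProcess = new ArrayList<>();
        List<String> overflow = new ArrayList<>();
        limiter.route(burst, 0L, toProcess, overflow);
        System.out.println(toProcess.size() + " processed, " + overflow.size() + " diverted");
    }
}
```

With one second's worth of the scenario's traffic arriving at once, 800 events pass the limiter and 4,200 are diverted. Every event lands in one of the two lists, which is the difference between this option and option A.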

Three of these are real production patterns. Only one of them actually fits this stack and this SLA.

Pick one — A, B, C, or D — and tell me why. Full breakdown in the comments (including why two of the wrong answers will fool engineers who’ve shipped Kafka before).

If your team argues about backpressure in standups, share this with them. The right answer is platform-specific, and most posts get it wrong.

Drop your answer 👇

#30DaysOfSystemDesign #Day17 #SystemDesign #DistributedSystems

17/30 Days System Design Questions!
