The Core Mechanism
Reactive replaces “thread waits for result” with “register a callback, thread moves on, callback fires when result arrives.”
The critical insight: the thread that starts the work and the thread that finishes it are typically different threads.
Blocking:
Thread-1: send to Kafka → wait → receive ACK → build response
(same thread, start to finish)
Reactive:
Thread-1: send to Kafka + register callback → goes to serve other requests
Thread-3: Kafka ACK arrives → callback fires → build response
(different threads)
This is possible because the pipeline context (what to do next, who’s waiting) lives on the heap as Subscription objects, not on any thread’s stack. Any thread can pick it up and continue.
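A minimal sketch of that handoff using the JDK's `CompletableFuture` (standing in here for a full reactive pipeline — the 100 ms sleep is a stand-in for the Kafka round-trip, not real broker code):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class CallbackDemo {
    public static void main(String[] args) throws Exception {
        // "Send to Kafka" stand-in: the result arrives later on a pool thread.
        CompletableFuture<String> ack = CompletableFuture.supplyAsync(() -> {
            sleep(100); // simulated broker round-trip
            return "ACK";
        });

        // Register the callback. The continuation lives on the heap inside the
        // CompletableFuture, not on this thread's stack.
        CompletableFuture<String> response = ack.thenApply(a ->
            "response built from " + a + " on " + Thread.currentThread().getName());

        // The caller does NOT wait here — it is free to serve other requests.
        System.out.println(Thread.currentThread().getName() + " moved on immediately");

        // Only block at the very end so the demo can print the result.
        System.out.println(response.get(2, TimeUnit.SECONDS));
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

Because the ACK has not arrived when `thenApply` registers the callback, the callback later fires on the pool thread that completed the future, not on the thread that started the request.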
The Paradox
Reactive uses more CPU per request but achieves higher CPU utilization overall.
Blocking (8 threads, 100 concurrent requests)
Thread-1: ██░░░░░░░░██░░░░░░░░██░░░░░░░░ (█ = working, ░ = parked/idle)
Thread-2: ██░░░░░░░░██░░░░░░░░██░░░░░░░░
Thread-3: ██░░░░░░░░██░░░░░░░░██░░░░░░░░
...
Thread-8: ██░░░░░░░░██░░░░░░░░██░░░░░░░░
CPU utilization: ~20% (threads spend 80% of time parked)
92 requests: stuck in queue, waiting for a free thread
Reactive (8 threads, 100 concurrent requests)
Thread-1: ████████████████████████████████
Thread-2: ████████████████████████████████
Thread-3: ████████████████████████████████
...
Thread-8: ████████████████████████████████
CPU utilization: ~95% (small overhead for heap lookups/scheduling)
All 100 requests: being processed concurrently
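The two diagrams can be reproduced in miniature: a hypothetical 8-thread blocking pool where each "request" parks its thread for the I/O wait, versus a single scheduler thread that only registers timed completions (the callbacks). The sleep and delay values are arbitrary stand-ins for I/O latency; what matters is the peak number of in-flight requests each model reaches.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrencyDemo {
    public static void main(String[] args) throws Exception {
        System.out.println("blocking peak = " + blockingPeak());
        System.out.println("reactive-style peak = " + asyncPeak());
    }

    // 8 worker threads, each parked inside its request for the full I/O wait:
    // at most 8 of the 100 requests can be in flight at once.
    static int blockingPeak() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        AtomicInteger inFlight = new AtomicInteger(), peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(100);
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> {
                peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                sleep(20);                       // thread parked on simulated I/O
                inFlight.decrementAndGet();
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return peak.get();
    }

    // One scheduler thread registers 100 timed completions; no thread is
    // parked, so all 100 requests are in flight at the same time.
    static int asyncPeak() throws Exception {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger inFlight = new AtomicInteger(), peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(100);
        for (int i = 0; i < 100; i++) {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            timer.schedule(() -> { inFlight.decrementAndGet(); done.countDown(); },
                           300, TimeUnit.MILLISECONDS);
        }
        done.await();
        timer.shutdown();
        return peak.get();
    }

    static void sleep(long ms) {
        try { Thread.sleep(ms); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```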
Per Request vs Overall
| | Per Request | Overall |
|---|---|---|
| Blocking | Cheap (no overhead) | Wasteful — threads parked, CPU idle |
| Reactive | Expensive (heap lookups, scheduling) | Efficient — threads always doing useful work |
Analogy
Blocking is like hiring 8 delivery drivers but each one waits at the restaurant doing nothing until the food is ready. Low effort per delivery, but most of the time they’re just standing around.
Reactive is like 8 drivers constantly picking up and dropping off orders — slightly more coordination overhead (check the dispatch board, figure out which order), but nobody is ever standing idle.
The reactive overhead (Subscription heap lookups, scheduler queue operations, context propagation) is real CPU work that blocking doesn’t need. But that cost is microseconds compared to milliseconds of idle thread time saved. You’re trading micro-overhead for macro-efficiency.
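A rough way to see that the overhead is real but small: run the same trivial computation directly on one thread, then again with every step hopped through a separate async stage (a heap-allocated continuation plus a queue handoff per step). The exact nanosecond figures vary by machine; the point is that both produce the same result and the hops are pure scheduling cost.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OverheadDemo {
    public static void main(String[] args) throws Exception {
        int hops = 10_000;

        // Direct: one thread does all the work, no scheduling in between.
        long t0 = System.nanoTime();
        int direct = 0;
        for (int i = 0; i < hops; i++) direct += 1;
        long directNanos = System.nanoTime() - t0;

        // Hopped: the same additions, but each one is a separate async stage
        // dispatched through an executor queue.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        long t1 = System.nanoTime();
        CompletableFuture<Integer> f = CompletableFuture.completedFuture(0);
        for (int i = 0; i < hops; i++) {
            f = f.thenApplyAsync(n -> n + 1, pool);
        }
        int hopped = f.get();
        long hoppedNanos = System.nanoTime() - t1;
        pool.shutdown();

        System.out.println("direct result=" + direct + " in " + directNanos + " ns");
        System.out.println("hopped result=" + hopped + " in " + hoppedNanos + " ns");
    }
}
```

With no I/O wait to reclaim, the hopped version is strictly slower — which is exactly why the overhead only pays off when the request spends most of its time parked on external systems.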
When To Use Which
Reactive wins: I/O-heavy work (cross-process boundary)
Your service → Kafka broker (network)
Your service → MongoDB (network)
Your service → another microservice REST call (network)
Your service → file system (disk)
Every arrow is a process boundary where a blocking thread would park and waste resources. More arrows = more idle time = more benefit from reactive.
Reactive loses: CPU-bound work (inside process boundary)
Parse JSON, calculate geospatial distance, compress image, encrypt data
No waiting. Thread is doing useful work the entire time. Reactive just adds overhead with zero idle time to reclaim.
Decision Rule
Is most of the request time spent waiting on external systems?
- YES → Reactive (free up threads during the wait)
- NO → Blocking (simpler code, no overhead)
Real Example: AdventureTube geospatial-service
POST /geo/save (reactive is ideal)
Client →[network]→ Controller →[network]→ Kafka →[network]→ Consumer →[network]→ MongoDB
Actual CPU work: serialize JSON (~0.1ms)
I/O waiting: Kafka + MongoDB (~10-100ms)
Ratio: 99% waiting, 1% computing → reactive is ideal
Image processing service (blocking is better)
Client → Controller → resize/compress image (50ms pure CPU) → return
Ratio: 95% computing, 5% I/O → blocking is simpler and faster
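The decision rule applied to these two endpoints can be sketched as a tiny helper. The helper name and the millisecond figures are illustrative (the rough estimates from this section, not measurements):

```java
public class DecisionRule {
    // Hypothetical helper: given profiled time per request, pick the model.
    static String choose(double ioWaitMs, double cpuMs) {
        double waitFraction = ioWaitMs / (ioWaitMs + cpuMs);
        return waitFraction > 0.5 ? "reactive" : "blocking";
    }

    public static void main(String[] args) {
        // POST /geo/save: ~0.1 ms serialization vs ~10-100 ms Kafka + MongoDB.
        System.out.println("/geo/save -> " + choose(55.0, 0.1));
        // Image processing: ~50 ms pure CPU vs a small amount of I/O.
        System.out.println("image resize -> " + choose(2.5, 50.0));
    }
}
```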
