Synchronous Code in Reactive Chains — Why, Where, and When It’s Safe

Date: 2026-02-27


Context

After converting auth-service from Servlet (Spring MVC) to WebFlux, roughly 80-90% of the code is reactive/async. But 10-20% is still synchronous, and MUST be: JWT signing, password hashing, object building, etc. This note explains why that's correct, how to handle it, and when it becomes dangerous.


Why Synchronous Code Must Live Inside .map() / .flatMap()

The reactive chain is lazy

A Mono is a declaration, not an execution. Nothing happens until something subscribes. This means sync code outside the chain runs at the wrong time:

// WRONG — breaks the reactive flow
public Mono<Response> login(Request req) {
    Mono<MemberDTO> member = serviceClient.getReactive(...);  // declares async I/O (not executed yet)

    String jwt = jwtUtil.generate(email);  // runs IMMEDIATELY, before member response arrives!

    return member.map(m -> buildResponse(jwt));
}

The sync code must be inside the chain to run at the correct moment:

// CORRECT — sync code runs after async I/O completes
public Mono<Response> login(Request req) {
    return serviceClient.getReactive(...)           // Step 1: call member-service (async I/O)
        .map(member -> {
            String jwt = jwtUtil.generate(           // Step 2: runs AFTER member response arrives
                member.getEmail(),
                member.getRole()
            );
            return buildResponse(jwt);               // Step 3: immediately after JWT
        });
}
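Reactor isn't required to see this laziness. As a rough JDK-only sketch, a Supplier behaves like a Mono in this one respect: defining the pipeline executes nothing until something asks for the value (the analogue of subscribing). The names below are invented for illustration, not from the real service:

```java
import java.util.function.Supplier;

public class LazyDemo {
    static int executions = 0;

    // Stand-in for serviceClient.getReactive(...): declares work, runs nothing yet.
    static Supplier<String> getMember() {
        return () -> { executions++; return "alice@example.com"; };
    }

    public static void main(String[] args) {
        Supplier<String> member = getMember(); // declaration only
        System.out.println(executions);        // 0 — nothing has executed
        member.get();                          // "subscribe": now it runs
        System.out.println(executions);        // 1
    }
}
```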

Two reasons sync code goes inside .map() / .flatMap()

  1. Timing — It runs at the correct moment in the sequence, after the previous async step completes
  2. Data flow — It receives the result from the previous step as input (member is the resolved value from the I/O call)

.map() vs .flatMap()

  • .map() — sync code that returns a plain value (String, DTO, etc.)
  • .flatMap() — code that returns another Mono (another async I/O operation)

.map(member -> jwtUtil.generate(member.getEmail()))  // returns String (plain value)
.flatMap(jwt -> serviceClient.postReactive(...))     // returns Mono<ServiceResponse> (async I/O)

Why .flatMap() “flattens”

If you use .map() where you should use .flatMap(), you get a nested Mono:

// .map() does NOT unwrap — broken
.map(member -> tokenRepository.save(token))
// Result: Mono<Mono<Token>>  ← Mono inside Mono

// .flatMap() unwraps — correct
.flatMap(member -> tokenRepository.save(token))
// Result: Mono<Token>  ← flat

That’s what “flat” means — .map() keeps the wrapper as-is, .flatMap() flattens the nested wrapper.
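The same nesting shows up with plain CompletableFuture, which makes for a runnable sketch of the distinction — thenApply plays the role of .map(), thenCompose the role of .flatMap(); the save method here is a made-up stand-in for tokenRepository.save:

```java
import java.util.concurrent.CompletableFuture;

public class FlattenDemo {
    // Made-up async "save" returning a future, like tokenRepository.save returning a Mono.
    static CompletableFuture<String> save(String token) {
        return CompletableFuture.completedFuture("saved:" + token);
    }

    public static void main(String[] args) {
        CompletableFuture<String> source = CompletableFuture.completedFuture("token");

        // thenApply (≈ .map()) does NOT unwrap — you get a future inside a future
        CompletableFuture<CompletableFuture<String>> nested = source.thenApply(FlattenDemo::save);

        // thenCompose (≈ .flatMap()) flattens — one layer
        CompletableFuture<String> flat = source.thenCompose(FlattenDemo::save);

        System.out.println(flat.join()); // saved:token
    }
}
```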

Simple rule

  • Sync job → .map() (returns a value)
  • Async job → .flatMap() (returns a Mono)

serviceClient.getReactive(...)              // async I/O
    .map(member -> jwtUtil.generate(...))   // sync → returns String
    .flatMap(jwt -> storeToken(jwt))        // async → returns Mono<Token>
    .map(token -> buildResponse(token))     // sync → returns Response

The Async Infection Rule

Once an async boundary exists, every piece of code that depends on its result must stay inside the reactive chain. You cannot extract a value from a Mono back into synchronous code:

// IMPOSSIBLE — can't get the value out of Mono
public Response login(Request req) {
    Mono<MemberDTO> memberMono = serviceClient.getReactive(...);

    MemberDTO member = ???  // No way to get this without .block()

    String jwt = jwtUtil.generate(member.getEmail());  // needs the value!
    return buildResponse(jwt);
}

The async boundary propagates upward through the entire call stack:

serviceClient.getReactive()  →  returns Mono
                                    ↓
                              service method must return Mono
                                    ↓
                              controller must return Mono
                                    ↓
                              WebFlux framework subscribes

This is why:

  • Method return types change from Response to Mono<Response>
  • Controller returns Mono<ResponseEntity<?>>
  • The entire call chain becomes reactive
  • All sync code that depends on async results must live inside .map() / .flatMap()

The only escape hatch is .block() — but it’s forbidden on Netty event loop threads (throws IllegalStateException) and defeats the purpose of WebFlux.


The Netty Event Loop — Why Time Matters

WebFlux runs on Netty with a small thread pool (~CPU cores x 2, typically 8-16 threads). These threads handle all concurrent requests:

Netty Event Loop Threads (e.g., 8 threads handling thousands of requests)

Thread-1: [Request A: quick work] → [send to DB, release] → ... → [pick up result] → [quick work] → done
Thread-2: [Request B: quick work] → [send HTTP, release] → ... → [pick up result] → done
Thread-3: [Request C: quick work] → [send to DB, release] → ...

Each thread processes a tiny bit of a request, releases during I/O, then picks up the result later. This is how WebFlux handles thousands of concurrent requests with just a few threads.

If sync code inside .map() takes too long, it holds the thread hostage — other requests waiting for that thread are stuck.
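The "thread hostage" effect is easy to reproduce with a single-thread executor standing in for one event loop thread (timings illustrative, everything else JDK-only):

```java
import java.util.concurrent.*;

public class StarvationDemo {
    // Returns how long a near-instant task waited behind a 300ms blocking task on one thread.
    static long run() throws Exception {
        ExecutorService loop = Executors.newSingleThreadExecutor(); // one "event loop" thread
        long start = System.nanoTime();

        // Request A: 300ms of "BCrypt" directly on the loop thread...
        loop.submit(() -> {
            try { Thread.sleep(300); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        // ...so Request B's near-instant step cannot even start until A finishes.
        Future<Long> waitedMs = loop.submit(() -> (System.nanoTime() - start) / 1_000_000);
        long waited = waitedMs.get();
        loop.shutdown();
        return waited;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Request B waited ~" + run() + "ms"); // ≈300ms, despite needing almost no CPU
    }
}
```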


The Three Caveats

1. Short CPU Work — SAFE

.map(member -> jwtUtil.generate(member.getEmail()))  // ~2ms

Thread is busy for ~2ms. Like a car briefly slowing down on a highway — doesn’t block traffic.

2. Expensive CPU Work — RISKY

.filter(user -> passwordEncoder.matches(password, user.getPassword()))  // 100-500ms!

BCrypt is intentionally slow (security feature). Holds the event loop thread for 100-500ms. If 16 requests hit this simultaneously, ALL event loop threads are blocked and the server freezes.

3. Blocking I/O — NEVER

.map(data -> restTemplate.getForObject("http://..."))  // 200ms+ BLOCKED

Thread held doing nothing (waiting for network). With only 8-16 threads, a few concurrent blocking calls can starve the entire server. This destroys the purpose of WebFlux.

Summary Table

| Inside .map() / .flatMap() | Time | Impact | OK? |
|---|---|---|---|
| JWT signing (HMAC) | ~2ms | Negligible | YES |
| Object building (DTO.builder) | ~0.1ms | None | YES |
| URI construction | ~0.1ms | None | YES |
| JSON parsing | ~1ms | Negligible | YES |
| BCrypt hashing/matching | 100-500ms | Blocks thread | RISKY |
| HTTP call (RestTemplate) | 200ms+ | Blocks thread | NO |
| JDBC query | 10-100ms | Blocks thread | NO |
| File I/O (disk read) | 10-100ms | Blocks thread | NO |

Auth-Service Sync Code Inventory

SAFE — Fast CPU work inside reactive chain

| Code | Location | Operator | Time |
|---|---|---|---|
| JWT generation (jwtUtil.generate) | AuthService lines 116-117, 170-171, 254-255 | .flatMap() | ~2ms |
| JWT extraction (jwtUtil.extractUsername) | AuthService lines 252-253 | .flatMap() | ~1ms |
| Object building (TokenDTO.builder) | AuthService lines 120-126, 173-179, 257-264 | .flatMap() | ~0.1ms |
| URI building | AuthController lines 92-94 | .map() | ~0.1ms |

RISKY — Expensive CPU work (NOT yet using Mono.fromCallable + boundedElastic)

| Code | Location | Operator | Time | Risk | Status |
|---|---|---|---|---|---|
| passwordEncoder.matches() | AuthServiceConfig line 47 | .filter() | 100-500ms | HIGH — every login blocks the event loop | TODO — offload to boundedElastic() |
| passwordEncoder.encode() | AuthService buildMemberDTO() line 362 | pre-chain (used in .flatMap()) | 100-500ms | HIGH — registration blocks the event loop | TODO — offload to boundedElastic() |

MEDIUM RISK — Potential blocking I/O (NOT yet using Mono.fromCallable + boundedElastic)

| Code | Location | Operator | Time | Risk | Status |
|---|---|---|---|---|---|
| verifier.verify() (Google token) | AuthService lines 336-340 | pre-chain | Variable | May make HTTPS calls to fetch Google JWKS keys | TODO — offload to boundedElastic() |

At the time this inventory was written, the codebase had zero instances of Mono.fromCallable() or Schedulers.boundedElastic(); the risky blocking operations above ran directly on Netty event loop threads. That works fine at low traffic, but under concurrent load it could starve the event loop. (See Implementation Status below.)


Key Takeaway

Synchronous code inside .map() / .flatMap() is not a workaround or compromise — it’s the correct and only way to sequence sync operations within a reactive pipeline. The reactive chain guarantees execution order and data flow. Putting sync code outside the chain breaks both.

The rule: sync code inside reactive operators must be fast enough that you’re not holding the event loop thread hostage. Pure computation (JWT, object mapping, string manipulation) is fine. Anything that waits (network, disk, intentionally slow crypto like BCrypt) is dangerous.


How to Handle Blocking Code — Mono.fromCallable() + Schedulers.boundedElastic()

When you have blocking code that cannot be replaced with a non-blocking alternative (like BCrypt — there is no async BCrypt), the pattern is:

// Wrap blocking code in Mono.fromCallable() and offload to a worker thread pool
.flatMap(data ->
    Mono.fromCallable(() -> blockingOperation())      // wrap blocking call into a Mono
        .subscribeOn(Schedulers.boundedElastic())      // run on worker pool, NOT Netty event loop
)

Why this works

  1. Mono.fromCallable() — wraps any synchronous/blocking code into a Mono (makes it part of the reactive pipeline)
  2. .subscribeOn(Schedulers.boundedElastic()) — moves execution to a separate thread pool designed for blocking work (grows on demand, capped by default at 10 × CPU cores)
  3. The Netty event loop thread is free immediately — it doesn’t wait
Netty Thread:  [request] → [flatMap: hand off to worker] → FREE → [handles other requests]
                                    ↓
Worker Thread:                [BCrypt/File I/O/etc.] → done → result flows back to pipeline

Example: BCrypt fix

// CURRENT (RISKY — BCrypt runs on Netty event loop, blocks 100-500ms)
.filter(user -> passwordEncoder.matches(password, user.getPassword()))

// FIXED — BCrypt runs on worker thread pool
.filterWhen(user ->
    Mono.fromCallable(() -> passwordEncoder.matches(password, user.getPassword()))
        .subscribeOn(Schedulers.boundedElastic())
)

Example: File I/O fix

// WRONG — Files.readString() blocks the Netty thread
.map(data -> Files.readString(path))

// WRONG — flatMap alone doesn't help, Files.readString() doesn't return Mono
.flatMap(data -> Files.readString(path))  // compile error!

// CORRECT — wrap and offload
.flatMap(data ->
    Mono.fromCallable(() -> Files.readString(path))
        .subscribeOn(Schedulers.boundedElastic())
)

Changing .map() to .flatMap() alone does NOT fix blocking — the blocking operation itself is the problem. .flatMap() is only needed because Mono.fromCallable() creates a Mono that needs to be unwrapped.
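The same hand-off can be sketched with the JDK alone: CompletableFuture.supplyAsync(task, pool) plays the role of Mono.fromCallable(task).subscribeOn(boundedElastic()) — the blocking work runs on a dedicated pool and the calling thread is free immediately. The slowMatch method is a made-up stand-in for a BCrypt check:

```java
import java.util.concurrent.*;

public class OffloadDemo {
    // Stand-in for Schedulers.boundedElastic(): a bounded pool reserved for blocking work.
    static final ExecutorService workers = Executors.newFixedThreadPool(4);

    // Made-up stand-in for passwordEncoder.matches(): deliberately slow.
    static boolean slowMatch(String raw, String hash) {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return hash.equals("hash:" + raw);
    }

    public static void main(String[] args) {
        // ≈ Mono.fromCallable(() -> slowMatch(...)).subscribeOn(Schedulers.boundedElastic())
        CompletableFuture<Boolean> match =
            CompletableFuture.supplyAsync(() -> slowMatch("pw", "hash:pw"), workers);

        System.out.println("caller thread is free while the check runs on a worker");
        System.out.println(match.join()); // true
        workers.shutdown();
    }
}
```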

When to use this pattern vs replacing the library

| Blocking code | Fix |
|---|---|
| BCrypt | Mono.fromCallable() + Schedulers.boundedElastic() (no async alternative exists) |
| RestTemplate | Replace with WebClient (ServiceClient already does this) |
| JDBC | Replace with R2DBC (member-service already does this) |
| File I/O | Mono.fromCallable() + Schedulers.boundedElastic() |
| Google verifier.verify() | Mono.fromCallable() + Schedulers.boundedElastic() (the Google library is blocking) |

The Complete Rule — What Goes Where

| Code type | Returns | What the thread is doing | Use |
|---|---|---|---|
| Fast CPU sync (JWT, object building) | Value | Computing (busy, fast) | .map() |
| Expensive CPU sync (BCrypt) | Value | Computing (busy, slow) | Mono.fromCallable() + boundedElastic() in .flatMap() |
| Blocking I/O sync (Files, JDBC) | Value | Waiting (idle) | Mono.fromCallable() + boundedElastic() in .flatMap() |
| Non-blocking async (WebClient, R2DBC) | Mono | Released (free) | .flatMap() |

“Sync job → .map()” is shorthand. The full rule: fast CPU sync → .map(). Anything slow or waiting → offload with Mono.fromCallable() + Schedulers.boundedElastic(), or replace with a non-blocking alternative.


Deep Dive: Netty Event Loop vs boundedElastic Worker Thread Pool

Both thread pools use the same CPU. The difference is pool design based on expected work duration, not what hardware they run on.

Both Use CPU — So How Does It Work?

The OS time-slices CPU access across all threads. No thread gets exclusive access to a core:

CPU Core 1 (time-slicing):
  [Event Loop Thread-1: 0.1ms] → [BoundedElastic Thread-42: 0.5ms] → [Event Loop Thread-1: 0.1ms] → ...

Each thread gets a small time slice (~1-10ms). Even when boundedElastic threads are doing BCrypt, event loop threads still get their turns.

Why This Works — Event Loop Threads Are Mostly Idle

Event loop threads need very little CPU time. A typical request processing step takes ~0.1ms. In a 1ms time slice, an event loop thread can process ~10 request steps. They spend 99% of their time idle, waiting for I/O responses (WebClient, DB).

8-core machine, worst case: 16 event loop + 16 BCrypt threads = 32 threads on 8 cores

Event loop threads:
  - Need 0.1ms per task → get plenty done in each time slice
  - Spend most time IDLE (waiting for I/O responses)

BCrypt threads:
  - Need 300ms continuous CPU → spread across many time slices
  - BCrypt takes 300ms → maybe 350ms (slightly slower, negligible)

The Restaurant Analogy

Event Loop  = 2 waiters taking orders, delivering food (fast, always moving)
boundedElastic = 10 kitchen cooks doing slow work (chopping, grilling)

If a waiter stops to cook a steak (300ms BCrypt):
  → No one taking orders → customers wait → restaurant stalls

Better: waiter hands order to cook, goes back to serving
  → Cook is busy, but waiters keep moving → restaurant runs fine

The Key Comparison

| Property | Netty Event Loop | boundedElastic |
|---|---|---|
| Thread count | ~16 (fixed, small) | Capped, grows on demand (default 10 × CPU cores) |
| Expected task duration | Microseconds to low milliseconds | Tens to hundreds of milliseconds |
| Purpose | Handle ALL incoming requests | Handle only blocking work |
| If all threads busy | Entire server freezes | Only blocking ops queue up |
| CPU usage pattern | Tiny bursts, mostly idle | Sustained, expected to block |

Before vs After the Fix

BEFORE (BCrypt on event loop):
  16 event loop threads → 16 concurrent BCrypt → ALL threads blocked 300ms → SERVER DEAD

AFTER (BCrypt on boundedElastic):
  16 event loop threads → still processing requests (0.1ms tasks, mostly idle)
  16 boundedElastic threads → doing BCrypt (slightly slower, nobody cares)

The difference: server dead vs server responsive with slightly slower BCrypt.

When Would CPU Sharing Become a Problem?

If hundreds of concurrent BCrypt operations saturated all CPU cores at 100%, event loop threads would slow down. But in practice:

  • boundedElastic caps its thread count (10 × CPU cores by default)
  • Your server likely isn’t doing 100 simultaneous logins
  • If it is, you’d scale horizontally (more instances) rather than solving it on one machine

Implementation Status (2026-02-28)

All three blocking operations have been offloaded to Schedulers.boundedElastic():

  1. passwordEncoder.matches() — .filter() changed to .filterWhen() with Mono.fromCallable() (AuthServiceConfig)
  2. passwordEncoder.encode() — wrapped in Mono.fromCallable() at start of createUser() chain (AuthService)
  3. verifier.verify() — wrapped in Mono.fromCallable() at start of createUser() and issueToken() chains (AuthService)

Branch: feature/distributed-tracing
