Circuit Breaker Exception Handling: 4xx Bug Fix and Refactoring

Problem

When auth-service called member-service and got a 4xx error (e.g. USER_NOT_FOUND when deleting the same user a second time), the circuit breaker counted it as a failure and eventually opened the circuit. Legitimate business errors (user not found, duplicate, validation) were therefore tripping the circuit breaker even though member-service itself was healthy.

The same problem existed in web-service calling geospatial-service.

Root Cause: Two Separate Layers

Circuit breaker error handling has two independent layers, and both need to be configured:

Layer 1: YAML Configuration (Failure Counting)

resilience4j:
  circuitbreaker:
    instances:
      MEMBER-SERVICE:
        ignore-exceptions:
          - com.adventuretube.common.client.ServiceClient4xxException

This controls whether the circuit breaker counts the error toward the failure threshold. With ignore-exceptions, 4xx errors don’t increment the failure counter; Resilience4j records an ignored exception as neither a success nor a failure.
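
To make the counting behaviour concrete, here is a minimal sketch in plain Java. It is a toy model, not Resilience4j's algorithm: the real breaker works on a failure rate over a sliding window, while this one compares a simple counter with a threshold. ToyBreaker and Client4xx are hypothetical names introduced only for illustration.

```java
import java.util.Set;
import java.util.function.Supplier;

/** Toy model of failure counting. Not Resilience4j's real algorithm. */
final class CountingSketch {

    /** Hypothetical stand-in for ServiceClient4xxException. */
    static class Client4xx extends RuntimeException {
        Client4xx(String message) { super(message); }
    }

    /** Hypothetical breaker: opens once `threshold` counted failures are seen. */
    static final class ToyBreaker {
        private final int threshold;
        private final Set<Class<? extends Throwable>> ignored;
        private int failures;

        ToyBreaker(int threshold, Set<Class<? extends Throwable>> ignored) {
            this.threshold = threshold;
            this.ignored = ignored;
        }

        <T> T call(Supplier<T> action) {
            if (isOpen()) {
                throw new IllegalStateException("circuit open");
            }
            try {
                return action.get();
            } catch (RuntimeException ex) {
                boolean isIgnored = ignored.stream().anyMatch(type -> type.isInstance(ex));
                if (!isIgnored) {
                    failures++; // only non-ignored errors move the breaker toward OPEN
                }
                throw ex; // ignored or not, the caller still receives the exception
            }
        }

        boolean isOpen() { return failures >= threshold; }

        int failures() { return failures; }
    }

    public static void main(String[] args) {
        ToyBreaker breaker = new ToyBreaker(2, Set.of(Client4xx.class));
        for (int i = 0; i < 5; i++) {
            try {
                breaker.call(() -> { throw new Client4xx("USER_NOT_FOUND"); });
            } catch (Client4xx expected) {
                // the business error reaches the caller but is not counted
            }
        }
        System.out.println("failures=" + breaker.failures() + " open=" + breaker.isOpen());
    }
}
```

Five ignored errors leave the counter at zero, while two non-ignored errors would open this toy breaker.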

Layer 2: Java Fallback (Error Propagation)

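return circuitBreaker.run(call, throwable -> {
    // The fallback receives every error, including those listed under ignore-exceptions.
    if (throwable instanceof ServiceClient4xxException) {
        return Mono.error(throwable);  // Pass the business error through unchanged
    }
    // Anything else is reported to the caller as an unavailable service.
    log.error("Circuit breaker open for {}: {}", serviceName, throwable.getMessage());
    return Mono.error(new ServiceClient5xxException(
            serviceName, "CIRCUIT_OPEN",
            serviceName + " circuit breaker is open", 503));
});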

circuitBreaker.run(call, fallback) routes ALL errors to the fallback, including the ignored ones. Without the instanceof check, 4xx errors get swallowed and replaced with CIRCUIT_OPEN, even though they were never counted as failures.

Key Lesson: YAML ignore-exceptions only controls counting. The Java fallback controls propagation. Both must be updated together.
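
The propagation half of the lesson can be reproduced without Spring or Reactor. The sketch below is an assumption-laden stand-in: run(...) only mimics the shape of circuitBreaker.run(call, fallback) with a synchronous Supplier, and Client4xx / Client5xx are hypothetical substitutes for the real exception classes.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Synchronous imitation of run(call, fallback); not the Spring Cloud API. */
final class FallbackSketch {

    /** Hypothetical stand-in for ServiceClient4xxException. */
    static class Client4xx extends RuntimeException {
        Client4xx(String errorCode) { super(errorCode); }
    }

    /** Hypothetical stand-in for ServiceClient5xxException. */
    static class Client5xx extends RuntimeException {
        Client5xx(String errorCode) { super(errorCode); }
    }

    /** Every failure of the call is handed to the fallback, whatever its type. */
    static <T> T run(Supplier<T> call, Function<Throwable, T> fallback) {
        try {
            return call.get();
        } catch (RuntimeException ex) {
            return fallback.apply(ex);
        }
    }

    /** Fallback without the instanceof check: every error becomes CIRCUIT_OPEN. */
    static String withoutCheck(Supplier<String> call) {
        return run(call, error -> { throw new Client5xx("CIRCUIT_OPEN"); });
    }

    /** Fallback with the check: 4xx errors pass through, the rest become CIRCUIT_OPEN. */
    static String withCheck(Supplier<String> call) {
        return run(call, error -> {
            if (error instanceof Client4xx) {
                throw (Client4xx) error;
            }
            throw new Client5xx("CIRCUIT_OPEN");
        });
    }
}
```

Calling withoutCheck with a supplier that throws Client4xx("USER_NOT_FOUND") surfaces CIRCUIT_OPEN to the caller, whereas withCheck surfaces the original USER_NOT_FOUND.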

Solution: 4xx/5xx Exception Split

New Exception Hierarchy (common-api module)

ServiceClientException (abstract)
├── ServiceClient4xxException   ← business errors, ignored by circuit breaker
└── ServiceClient5xxException   ← infrastructure failures, trips circuit breaker
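
A minimal sketch of how this hierarchy might look in Java follows. The constructor argument order (service name, error code, message, status) mirrors the ServiceClient5xxException call in the fallback above, and getErrorCode() is used elsewhere in this post; extending RuntimeException, the field names, the other getters, and the countsAsFailure helper are assumptions. The classes are nested in a wrapper only to keep the example in one unit.

```java
/** Wrapper class used only to keep the sketch self-contained. */
final class ExceptionHierarchySketch {

    /** Common base; RuntimeException as the parent is an assumption. */
    abstract static class ServiceClientException extends RuntimeException {
        private final String serviceName;
        private final String errorCode;
        private final int status;

        protected ServiceClientException(String serviceName, String errorCode,
                                         String message, int status) {
            super(message);
            this.serviceName = serviceName;
            this.errorCode = errorCode;
            this.status = status;
        }

        String getServiceName() { return serviceName; }

        String getErrorCode() { return errorCode; }

        int getStatus() { return status; }
    }

    /** Business errors: listed under ignore-exceptions, passed through the fallback. */
    static class ServiceClient4xxException extends ServiceClientException {
        ServiceClient4xxException(String serviceName, String errorCode,
                                  String message, int status) {
            super(serviceName, errorCode, message, status);
        }
    }

    /** Infrastructure failures: counted by the circuit breaker. */
    static class ServiceClient5xxException extends ServiceClientException {
        ServiceClient5xxException(String serviceName, String errorCode,
                                  String message, int status) {
            super(serviceName, errorCode, message, status);
        }
    }

    /** Hypothetical helper: the type alone decides whether the breaker should count it. */
    static boolean countsAsFailure(ServiceClientException ex) {
        return ex instanceof ServiceClient5xxException;
    }
}
```

Splitting by type means both the YAML ignore list and the fallback check can refer to a single class name instead of inspecting status codes.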

ServiceClient Changes

All three reactive methods (postServiceResponseReactive, getServiceResponseReactive, getRawReactive) were updated:

  • 4xx handler → throws ServiceClient4xxException
  • 5xx handler → throws ServiceClient5xxException
  • Network/timeout errors → throws ServiceClient5xxException
  • Circuit breaker fallback → instanceof ServiceClient4xxException check to pass through 4xx
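
The classification rules in the list above can be sketched as a small pure function. This is an illustration only: the Outcome enum and both method names are hypothetical, and treating every status below 400 as a success is a simplification.

```java
import java.io.IOException;

/** Illustrative classification of call results; all names are hypothetical. */
final class StatusMappingSketch {

    enum Outcome { SUCCESS, CLIENT_4XX, SERVER_5XX }

    /** Maps an HTTP status to the exception family the client would raise. */
    static Outcome fromStatus(int httpStatus) {
        if (httpStatus >= 400 && httpStatus < 500) {
            return Outcome.CLIENT_4XX; // business error, ignored by the breaker
        }
        if (httpStatus >= 500 && httpStatus < 600) {
            return Outcome.SERVER_5XX; // infrastructure failure, counted
        }
        return Outcome.SUCCESS; // simplification: 1xx to 3xx are not distinguished
    }

    /** Network and timeout errors carry no status; they are treated like a 5xx. */
    static Outcome fromTransportError(IOException error) {
        return Outcome.SERVER_5XX;
    }

    public static void main(String[] args) {
        System.out.println(fromStatus(404)); // CLIENT_4XX
        System.out.println(fromStatus(503)); // SERVER_5XX
        System.out.println(fromTransportError(new IOException("connection refused"))); // SERVER_5XX
    }
}
```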

ServiceClient Method Naming Convention

Renamed all methods to follow: post/get + Raw/ServiceResponse + Reactive/NonReactive

| Method | Returns | Blocking? |
| --- | --- | --- |
| postServiceResponseReactive() | Mono<ServiceResponse<T>> | No |
| getServiceResponseReactive() | Mono<ServiceResponse<T>> | No |
| getRawReactive() | Mono<T> | No |
| postServiceResponseNonReactive() | ServiceResponse<T> | Yes |
| getServiceResponseNonReactive() | ServiceResponse<T> | Yes |
| getRawNonReactive() | T | Yes |

  • ServiceResponse = inter-service calls that return ServiceResponse<T> wrapper
  • Raw = calls to services not using ServiceResponse (e.g. geospatial-service returns raw JSON)
  • Reactive = returns Mono<> for WebFlux callers
  • NonReactive = .block() wrapper for Spring MVC callers
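
The Reactive/NonReactive pairing can be illustrated without Reactor. In the sketch below, CompletableFuture stands in for Mono and join() stands in for .block(); that substitution, the method bodies, and the JSON string are assumptions made so the example runs on the JDK alone.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Naming-convention sketch: CompletableFuture replaces Mono, join() replaces block(). */
final class NamingSketch {

    /** "Reactive" variant: returns immediately with a deferred result. */
    static CompletableFuture<String> getRawReactive(String path) {
        // Hypothetical body: a real client would perform an HTTP call here.
        return CompletableFuture.supplyAsync(() -> "{\"path\":\"" + path + "\"}");
    }

    /** "NonReactive" variant: a thin blocking wrapper around the reactive one. */
    static String getRawNonReactive(String path) {
        try {
            return getRawReactive(path).join();
        } catch (CompletionException ex) {
            // join() wraps failures; unwrap so callers can catch the original type.
            if (ex.getCause() instanceof RuntimeException) {
                throw (RuntimeException) ex.getCause();
            }
            throw ex;
        }
    }

    public static void main(String[] args) {
        System.out.println(getRawNonReactive("/geo"));
    }
}
```

Keeping the blocking variant as a wrapper means the error handling lives in one place, the reactive method, and both kinds of caller see the same exceptions.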

Web-Service Error Handling (New)

Added full error handling for web-service calling geospatial-service:

New Files

  • WebErrorCode enum — DATA_NOT_FOUND(404), DUPLICATE_KEY(409), SERVER_NOT_AVAILABLE(503), SERVICE_CIRCUIT_OPEN(503), INTERNAL_ERROR(500)
  • BaseServiceException — abstract base with WebErrorCode + auto-captured origin
  • GeoServiceException — concrete exception for geospatial-service errors
  • GlobalExceptionHandler, a @ControllerAdvice that returns a structured ServiceResponse
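
As a rough sketch of how the first three pieces might fit together: the enum constants and their HTTP statuses come from the list above, while the field names, the getters, and the way the origin is captured (first stack frame of the exception) are assumptions, not the actual implementation.

```java
/** Wrapper class used only to keep the sketch self-contained. */
final class WebErrorCodeSketch {

    /** Error codes with the HTTP status each one maps to. */
    enum WebErrorCode {
        DATA_NOT_FOUND(404),
        DUPLICATE_KEY(409),
        SERVER_NOT_AVAILABLE(503),
        SERVICE_CIRCUIT_OPEN(503),
        INTERNAL_ERROR(500);

        private final int httpStatus;

        WebErrorCode(int httpStatus) { this.httpStatus = httpStatus; }

        int httpStatus() { return httpStatus; }
    }

    /** Abstract base carrying the error code and where the exception was created. */
    abstract static class BaseServiceException extends RuntimeException {
        private final WebErrorCode errorCode;
        private final String origin;

        protected BaseServiceException(WebErrorCode errorCode) {
            super(errorCode.name());
            this.errorCode = errorCode;
            // Assumption: "auto-captured origin" is modelled as the first stack frame.
            StackTraceElement[] frames = getStackTrace();
            this.origin = frames.length > 0
                    ? frames[0].getClassName() + "." + frames[0].getMethodName()
                    : "unknown";
        }

        WebErrorCode getErrorCode() { return errorCode; }

        String getOrigin() { return origin; }
    }

    /** Concrete exception for geospatial-service errors. */
    static class GeoServiceException extends BaseServiceException {
        GeoServiceException(WebErrorCode errorCode) { super(errorCode); }
    }
}
```

An exception handler can then read getErrorCode().httpStatus() to choose the response status instead of hard-coding it per exception type.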

GeoDataService Error Mapping

private GeoServiceException mapServiceClientException(ServiceClientException ex) {
    WebErrorCode errorCode = switch (ex.getErrorCode()) {
        case "DATA_NOT_FOUND", "USER_NOT_FOUND" -> WebErrorCode.DATA_NOT_FOUND;
        case "DUPLICATE_KEY" -> WebErrorCode.DUPLICATE_KEY;
        case "SERVER_NOT_AVAILABLE" -> WebErrorCode.SERVER_NOT_AVAILABLE;
        case "CIRCUIT_OPEN" -> WebErrorCode.SERVICE_CIRCUIT_OPEN;
        default -> WebErrorCode.INTERNAL_ERROR;
    };
    return new GeoServiceException(errorCode);
}

Web-service calls the blocking getRawNonReactive(), so it maps errors in a try/catch; auth-service is reactive and does the equivalent mapping with .onErrorMap().

Auth-Service Fix: Missing UserNotFoundException Handler

The custom com.adventuretube.auth.exceptions.UserNotFoundException had no @ExceptionHandler, so it fell through to the catch-all Exception handler, which returned INTERNAL_ERROR 500 instead of the proper USER_NOT_FOUND 404.

@ExceptionHandler(UserNotFoundException.class)
public ResponseEntity<ServiceResponse<?>> handleUserNotFoundException(UserNotFoundException ex) {
    return buildErrorResponse(ex.getErrorCode(), ex);
}

Verification

Delete User (second time — user already deleted)

  • Zipkin: 3 services, 13 spans, outcome: CLIENT_ERROR, status: 404
  • Response: structured ServiceResponse with errorCode: USER_NOT_FOUND
  • Circuit breaker: stays CLOSED, 0 failed calls
  • Log: MEMBER-SERVICE 4xx error: USER_NOT_FOUND (not CIRCUIT_OPEN)

Commits

  1. 8ecb0a2 — Rename ServiceClient methods for clarity
  2. 18be88f — Add web-service circuit breaker error handling and rename exception classes
  3. 6f25630 — Fix circuit breaker fallback swallowing 4xx exceptions
  4. 7aa8e52 — Add missing UserNotFoundException handler

Files Changed

| File | Change |
| --- | --- |
| common-api/.../ServiceClient.java | Method rename + 4xx/5xx split + fallback instanceof check |
| common-api/.../ServiceClient4xxException.java | New: 4xx business errors |
| common-api/.../ServiceClient5xxException.java | New: 5xx infrastructure failures |
| web-service/.../exceptions/code/WebErrorCode.java | New: error codes enum |
| web-service/.../exceptions/base/BaseServiceException.java | New: abstract base |
| web-service/.../exceptions/GeoServiceException.java | New: geospatial errors |
| web-service/.../exceptions/GlobalExceptionHandler.java | New: exception handler |
| web-service/.../service/GeoDataService.java | Added try/catch + mapServiceClientException |
| auth-service/.../GlobalExceptionHandler.java | Added UserNotFoundException handler |
| config-service/.../auth-service.yml | Added ignore-exceptions for 4xx |
| config-service/.../web-service.yml | Added circuit breaker config for GEOSPATIAL-SERVICE |
