1. Why Kafka Over Other Message Brokers
AdventureTube uses Apache Kafka for asynchronous messaging between microservices. Here’s why Kafka was chosen over alternatives like RabbitMQ or JMS.
Kafka vs RabbitMQ — Key Differences
| Feature | RabbitMQ | Kafka |
|---|---|---|
| Model | Queue-based (point-to-point) | Topic-based (pub/sub) |
| Message retention | Deleted after consumed | Stays in topic — multiple consumers read independently |
| Scaling | Clustering added later on top of single-node design | Designed from day one as a distributed system |
| Replication | Requires explicit queue mirroring config | Built-in topic replication across brokers |
| Partitioning | Not native | Topics split into partitions for parallel processing |
| Spring Boot support | spring-boot-starter-amqp | spring-kafka (auto-configures KafkaTemplate) |
The Pub/Sub Advantage
This is the biggest reason. In RabbitMQ, a message goes to one consumer and is removed from the queue:
```
Producer → Queue → Consumer A reads Message 1 (gone forever)
                   Consumer B reads Message 2
```
In Kafka, messages stay in the topic. Every consumer reads independently:
```
Producer → Topic → Consumer A reads Message 1, 2, 3
                   Consumer B reads Message 1, 2, 3
                   Consumer C reads Message 1, 2, 3
```
This matters for AdventureTube because when geospatial-service publishes an adventure-data-created event:
```
geo-service → Topic: adventure-data-created
              ├── Consumer 1: web-service (update cache)
              ├── Consumer 2: notification-service (send push)
              └── Consumer 3: analytics-service (track stats)
```
All services independently read the same event. With RabbitMQ you’d need fanout exchanges and multiple queues to achieve this. With Kafka it’s the default behavior.
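The two consumption models can be contrasted in a few lines of plain Java. This is an illustrative sketch only — no broker or Kafka client involved — showing why a queue delivers each message to exactly one consumer while a log lets every consumer read everything:

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Queue model (RabbitMQ-style): a message is removed once one consumer takes it.
// Log model (Kafka-style): the log is immutable; each consumer only advances
// its own offset, so every consumer sees every message.
public class ConsumptionSemantics {

    // Queue read: poll() removes the message for everyone.
    static String queueRead(Queue<String> queue) {
        return queue.poll();
    }

    // Log read: only this consumer's offset moves; the log itself is untouched.
    static String logRead(List<String> log, int[] offset) {
        return log.get(offset[0]++);
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>(List.of("m1", "m2"));
        System.out.println("Queue, consumer A: " + queueRead(queue)); // m1, now gone
        System.out.println("Queue, consumer B: " + queueRead(queue)); // m2 — B never sees m1

        List<String> log = List.of("m1", "m2");
        int[] offsetA = {0}, offsetB = {0};
        System.out.println("Log, consumer A: " + logRead(log, offsetA)); // m1
        System.out.println("Log, consumer B: " + logRead(log, offsetB)); // m1 — both see everything
    }
}
```

In real Kafka the per-consumer offset is tracked per consumer group, which is exactly how web-service, notification-service, and analytics-service each read the full event stream.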
Other Kafka Advantages
- Cluster-native scalability — Kafka was designed from the ground up to run across multiple nodes. Adding brokers automatically distributes load. RabbitMQ was designed as a single-node broker with clustering added later.
- Resilience through replication — Topics are replicated across brokers. If one node dies, no data is lost.
- Simpler API — `KafkaTemplate` is typed with generics and handles domain objects directly. No `convertAndSend()` needed.
- JMS limitation — JMS is Java-only. Kafka supports any language/platform.
2. Migrate from Zookeeper to KRaft Mode (Completed 2026-03-09)
What is Zookeeper?
Zookeeper is a separate service that Kafka historically needed to manage cluster metadata — which brokers are alive, which broker leads which partition, topic configurations, etc.
What is KRaft?
Starting from Kafka 3.3, Kafka can manage its own metadata without Zookeeper using the KRaft (Kafka Raft) protocol. One of the Kafka brokers acts as the metadata controller.
Why Migrate?
| Aspect | With Zookeeper | With KRaft |
|---|---|---|
| Processes | 2 (zoo1 + kafka1) | 1 (kafka1 only) |
| RAM savings | ~128-256 MB wasted on Zookeeper JVM | Freed up |
| Complexity | Two services to configure, monitor, debug | One service |
| Single point of failure | Zookeeper dies → Kafka can’t function | No external dependency |
| Future | Deprecated — will be removed in Kafka 4.0 | The standard going forward |
PI1 (Pi 5) already runs 20+ containers at 65% RAM. Removing the zoo1 container saves resources and simplifies the stack.
Gotcha: ACL Authorizer
The Zookeeper-based kafka.security.authorizer.AclAuthorizer crashes in KRaft mode because it tries to connect to Zookeeper. Must use the KRaft-native authorizer:
```
# Before (crashes in KRaft mode)
KAFKA_AUTHORIZER_CLASS_NAME: kafka.security.authorizer.AclAuthorizer

# After (KRaft-native)
KAFKA_AUTHORIZER_CLASS_NAME: org.apache.kafka.metadata.authorizer.StandardAuthorizer
```
Cluster ID
```
HLThU0HXQJegC_vT4NGpCw
```
Generated via `docker exec kafka1 kafka-storage random-uuid`. This same ID is used by both kafka1 (PI1) and kafka2 (PI2) — stored in env files as `KAFKA_CLUSTER_ID`.
Migration Steps
Step 1: Generate a cluster ID
```
docker exec kafka1 kafka-storage random-uuid
```
Step 2: Update docker-compose-kafka.yml — remove the zoo1 service and switch kafka1 to KRaft mode (see Section 4 for the actual compose file).
Step 3: Stop the old stack, start the new:
```
docker compose --env-file env.pi -f docker-compose-kafka.yml down
docker compose --env-file env.pi -f docker-compose-kafka.yml up -d
```
Step 4: Verify topics are intact:
```
docker exec kafka1 kafka-topics --bootstrap-server 192.168.1.199:29092 --list
```
3. Kafka 2-Broker Cluster (Completed 2026-03-09)
A Kafka cluster is not a separate product or special installation. It’s simply multiple Kafka brokers configured to work together using the same Cluster ID.
Architecture
```
PI1 (192.168.1.199): kafka1 (node.id=1, broker + controller)
PI2 (192.168.1.105): kafka2 (node.id=2, broker only)
```
kafka1 acts as both broker and KRaft controller. kafka2 is a broker-only node that connects to kafka1’s controller on port 9093.
Topic Replication
```
Topic: adventuretube-data (2 partitions, replication-factor=2)

PI1 kafka1: Partition 0 (follower), Partition 1 (leader)
PI2 kafka2: Partition 0 (leader),   Partition 1 (follower)
```
- Leader = handles all reads/writes for that partition
- Follower = keeps a backup copy. If leader dies, follower is promoted
- Every broker is a peer — no master/slave hierarchy
- ISR (In-Sync Replicas) = both brokers are in sync for both partitions
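The layout above follows Kafka's round-robin replica placement. Here is a small sketch (not Kafka's actual code) of the idea: partition p gets `replicationFactor` consecutive brokers starting from some offset, and the first replica in each list is the preferred leader. Real Kafka randomizes the starting offset, which is why partition 0's leader can be broker 2:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative round-robin replica placement. The startIndex parameter stands in
// for the random starting broker that real Kafka picks at topic creation.
public class ReplicaAssignment {

    static List<List<Integer>> assign(List<Integer> brokers, int partitions,
                                      int replicationFactor, int startIndex) {
        List<List<Integer>> assignment = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            List<Integer> replicas = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++) {
                // Walk the broker list circularly so copies land on different brokers.
                replicas.add(brokers.get((startIndex + p + r) % brokers.size()));
            }
            assignment.add(replicas); // replicas.get(0) is the preferred leader
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 2 brokers, 2 partitions, RF=2, starting at broker 2 —
        // reproduces the adventuretube-data layout above.
        List<List<Integer>> layout = assign(List.of(1, 2), 2, 2, 1);
        System.out.println("Partition 0 replicas: " + layout.get(0)); // [2, 1] — leader 2
        System.out.println("Partition 1 replicas: " + layout.get(1)); // [1, 2] — leader 1
    }
}
```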
Key Terminology
| Term | Meaning |
|---|---|
| Broker | A single Kafka server process |
| Cluster | A group of brokers working together (same Cluster ID) |
| Partition | A slice of a topic’s data — enables parallel consumption |
| Leader | The broker that handles reads/writes for a partition |
| Follower | A broker that keeps a backup copy of a partition |
| Replication factor | How many copies of each partition exist across brokers |
| ISR | In-Sync Replicas — the set of brokers fully caught up with the leader |
Listener Architecture (Critical for Cross-Host Brokers)
Each broker has multiple listeners for different purposes:
| Listener | Port | Purpose | Advertised As |
|---|---|---|---|
| INTERNAL | 19092 | Inter-broker replication | Host IP (e.g. 192.168.1.199:19092) |
| EXTERNAL | 29092 | Client connections (microservices) | Host IP (e.g. 192.168.1.199:29092) |
| CONTROLLER | 9093 | KRaft metadata (kafka1 only) | PI1 IP only |
Gotcha: Since kafka1 and kafka2 run on separate Docker hosts (PI1 and PI2), the INTERNAL advertised listener must use the host IP, not the container hostname. Using kafka2:19092 as the advertised address would fail because kafka1’s container cannot resolve the kafka2 hostname across Docker hosts. This caused replication to hang until fixed.
Infrastructure
| Pi | Spec | RAM | Current Role | Kafka |
|---|---|---|---|---|
| PI1 | Pi 5 + 1TB SSD | 8 GiB (65% used) | Main server: Kafka, DBs, monitoring, Jenkins, nginx | kafka1 (broker + controller) |
| PI2 | Pi 4 | 8 GiB (29% used, ~5.6 GiB free) | Cloud services + kafka2 | kafka2 (broker only) |
| PI3 | Pi 4 | 2 GiB (50% used, swapping) | Light workloads | Too small for Kafka |
4. Configuration Files
Env Variables (in Jenkins Credentials)
All env files (env.pi, env.pi2, env.mac, env.prod) contain:
```
# Pi cluster IP addresses
PI1_IP=192.168.1.199
PI2_IP=192.168.1.105
PI3_IP=192.168.1.144

# Kafka
KAFKA_CLUSTER_ID=HLThU0HXQJegC_vT4NGpCw
KAFKA_BOOTSTRAP_SERVERS=192.168.1.199:29092,192.168.1.105:29092
```
- `PI1_IP`, `PI2_IP`, `PI3_IP` — used by `docker-compose-kafka.yml` for listener addresses
- `KAFKA_CLUSTER_ID` — shared cluster identity, used by docker-compose
- `KAFKA_BOOTSTRAP_SERVERS` — used by Spring microservices (literal IPs, not `${PI1_IP}` references, because Spring Boot reads env values directly without shell interpolation)
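A quick illustration of why the literal IPs matter (hypothetical sketch, not Spring's actual code): the Kafka client receives the raw env value and splits it on commas — nothing expands `${PI1_IP}`-style placeholders at that point, so a placeholder would be treated as a hostname and fail to resolve:

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of bootstrap-server parsing: the value is split on commas
// exactly as received from the environment, with no shell-style expansion.
public class BootstrapServers {

    static List<String> parse(String raw) {
        return Arrays.asList(raw.split(","));
    }

    public static void main(String[] args) {
        // Literal IPs parse into usable host:port pairs.
        System.out.println(parse("192.168.1.199:29092,192.168.1.105:29092"));
        // A shell-style reference survives verbatim — the client would try to
        // resolve "${PI1_IP}" as a hostname and fail.
        System.out.println(parse("${PI1_IP}:29092"));
    }
}
```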
docker-compose-kafka.yml (PI1 — kafka1 + kafdrop)
```yaml
services:
  kafka1:
    image: confluentinc/cp-kafka:7.6.0
    hostname: kafka1
    container_name: kafka1
    ports:
      - "9092:9092"
      - "29092:29092"
      - "9999:9999"
      - "9093:9093"
      - "19092:19092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@${PI1_IP}:9093
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      CLUSTER_ID: ${KAFKA_CLUSTER_ID}
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:19092,EXTERNAL://0.0.0.0:29092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://${PI1_IP}:19092,EXTERNAL://${CLOUD_IP_ADDRESS}:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_AUTHORIZER_CLASS_NAME: org.apache.kafka.metadata.authorizer.StandardAuthorizer
      KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND: "true"

  kafdrop:
    image: obsidiandynamics/kafdrop
    container_name: kafdrop
    ports:
      - "9000:9000"
    environment:
      KAFKA_BROKERCONNECT: ${PI1_IP}:19092,${PI2_IP}:19092
    depends_on:
      - kafka1
```
docker-compose-kafka.yml (PI2 — kafka2)
```yaml
services:
  kafka2:
    image: confluentinc/cp-kafka:7.6.0
    hostname: kafka2
    container_name: kafka2
    ports:
      - "9092:9092"
      - "29092:29092"
      - "9999:9999"
      - "19092:19092"
    environment:
      KAFKA_NODE_ID: 2
      KAFKA_PROCESS_ROLES: broker
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@${PI1_IP}:9093
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      CLUSTER_ID: ${KAFKA_CLUSTER_ID}
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:19092,EXTERNAL://0.0.0.0:29092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://${PI2_IP}:19092,EXTERNAL://${PI2_IP}:29092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_AUTHORIZER_CLASS_NAME: org.apache.kafka.metadata.authorizer.StandardAuthorizer
      KAFKA_ALLOW_EVERYONE_IF_NO_ACL_FOUND: "true"
```
Key differences from kafka1:
- `KAFKA_NODE_ID: 2` — unique per broker
- `KAFKA_PROCESS_ROLES: broker` — not controller (PI1's kafka1 handles that)
- `KAFKA_CONTROLLER_QUORUM_VOTERS` points to PI1's controller
- No CONTROLLER listener (only the controller node needs it)
- Same `CLUSTER_ID` — this is what makes them part of the same cluster
5. How to Start the 2-Broker Cluster and Verify It Is Working
Step 1: Ensure env files are updated in Jenkins
Both env.pi and env.pi2 in Jenkins Credentials must contain PI1_IP, PI2_IP, PI3_IP, KAFKA_CLUSTER_ID, and KAFKA_BOOTSTRAP_SERVERS. See Section 4 for the exact values.
Step 2: Deploy via Jenkins
Run both Jenkins pipelines to pull the latest main branch. This updates docker-compose-kafka.yml on both Pis via git pull:
- adventuretube-clouds — deploys to PI2 (eureka, config, gateway)
- adventuretube-microservice — deploys to PI2 (auth, member, web, geospatial)
Wait for Spring Cloud services (eureka, config, gateway) to be healthy before proceeding.
Step 3: Start kafka1 on PI1
```
ssh strider@strider-pi.local "cd ~/adventuretube-microservice && docker compose --env-file env.pi -f docker-compose-kafka.yml up -d"
```
This starts kafka1 (broker + controller) and kafdrop (web UI).
Step 4: Start kafka2 on PI2
```
ssh strider@strider-pi2.local "cd ~/adventuretube-microservice && docker compose --env-file env.pi2 -f docker-compose-kafka.yml up -d"
```
kafka2 connects to kafka1’s controller on port 9093 and joins the cluster.
Step 5: Verify — Both brokers visible
```
ssh strider@strider-pi.local "docker exec kafka1 kafka-broker-api-versions --bootstrap-server 192.168.1.199:29092 2>&1 | grep '^[0-9]'"
```
Expected output:
```
192.168.1.199:29092 (id: 1 rack: null) -> (
192.168.1.105:29092 (id: 2 rack: null) -> (
```
If only one broker shows, check kafka2 logs: `docker logs kafka2`
Step 6: Verify — Topic replication
```
ssh strider@strider-pi.local "docker exec kafka1 kafka-topics --bootstrap-server 192.168.1.199:29092 --describe --topic adventuretube-data"
```
Expected output:
```
Topic: adventuretube-data  PartitionCount: 2  ReplicationFactor: 2
    Partition: 0  Leader: 2  Replicas: 2,1  Isr: 2,1
    Partition: 1  Leader: 1  Replicas: 1,2  Isr: 1,2
```
- Replicas should show both broker IDs (1,2) for each partition
- Isr should match Replicas (both brokers in sync)
- If Isr is missing a broker, replication is behind — check inter-broker connectivity
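The "Isr matches Replicas" check can be automated. The sketch below is a hypothetical helper (not part of the Kafka tooling) that parses a partition line from the `--describe` output and reports whether every replica is in sync:

```java
import java.util.Arrays;
import java.util.List;

// Parses partition lines like:
//   "Partition: 0 Leader: 2 Replicas: 2,1 Isr: 2,1"
// A partition is healthy when its Isr list contains every replica.
public class IsrCheck {

    // Extracts the comma-separated broker ids following a field name like "Isr:".
    static List<String> field(String line, String name) {
        String tail = line.substring(line.indexOf(name) + name.length()).trim();
        return Arrays.asList(tail.split("\\s+")[0].split(","));
    }

    static boolean inSync(String partitionLine) {
        return field(partitionLine, "Isr:").containsAll(field(partitionLine, "Replicas:"));
    }

    public static void main(String[] args) {
        String healthy = "Partition: 0 Leader: 2 Replicas: 2,1 Isr: 2,1";
        String lagging = "Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1";
        System.out.println("healthy in sync: " + inSync(healthy)); // true
        System.out.println("lagging in sync: " + inSync(lagging)); // false — broker 2 is behind
    }
}
```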
Step 7: Verify — Kafdrop web UI
Open http://192.168.1.199:9000 (or kafka.adventuretube.net).
You should see:
- 2 brokers listed on the main page
- Click on the `adventuretube-data` topic — shows 2 partitions, each with replicas on both brokers
- Partition leaders should be distributed across both brokers
Step 8: Verify — Produce and consume a test message
```
# Produce a test message
ssh strider@strider-pi.local "docker exec kafka1 bash -c 'echo \"test-message\" | kafka-console-producer --bootstrap-server 192.168.1.199:29092 --topic adventuretube-data'"

# Consume from kafka2 to verify cross-broker replication
ssh strider@strider-pi2.local "docker exec kafka2 kafka-console-consumer --bootstrap-server 192.168.1.105:29092 --topic adventuretube-data --from-beginning --max-messages 1"
```
If the message appears on kafka2, cross-broker replication is working.
Recreating a Topic with Replication
If adventuretube-data was created before the 2-broker cluster (replication factor 1), recreate it:
```
# Delete old topic
docker exec kafka1 kafka-topics --bootstrap-server 192.168.1.199:29092 --delete --topic adventuretube-data

# Wait a few seconds, then create with replication
docker exec kafka1 kafka-topics --bootstrap-server 192.168.1.199:29092 \
  --create --topic adventuretube-data --partitions 2 --replication-factor 2
```
Warning: If a microservice is connected, Kafka may auto-recreate the topic with replication factor 1 before you can recreate it. If that happens, use kafka-reassign-partitions instead:
```
# Create reassignment JSON — cover every partition of the topic, not just partition 0
echo '{"version":1,"partitions":[
  {"topic":"adventuretube-data","partition":0,"replicas":[1,2]},
  {"topic":"adventuretube-data","partition":1,"replicas":[1,2]}]}' > /tmp/reassign.json

# Execute reassignment
docker exec kafka1 kafka-reassign-partitions --bootstrap-server 192.168.1.199:29092 \
  --reassignment-json-file /tmp/reassign.json --execute

# Verify completion
docker exec kafka1 kafka-reassign-partitions --bootstrap-server 192.168.1.199:29092 \
  --reassignment-json-file /tmp/reassign.json --verify
```
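Hand-writing that JSON is error-prone once a topic has more than a couple of partitions. The sketch below is a hypothetical helper (names and structure are assumptions, not project code) that generates the reassignment file contents for every partition of a topic:

```java
import java.util.List;
import java.util.StringJoiner;

// Builds the JSON accepted by kafka-reassign-partitions, with one entry
// per partition so no partition is left at the old replication factor.
public class ReassignJson {

    static String build(String topic, int partitions, List<Integer> replicas) {
        StringJoiner entries = new StringJoiner(",");
        // List.of(1, 2).toString() is "[1, 2]"; strip spaces to get valid compact JSON.
        String replicaList = replicas.toString().replaceAll("\\s", "");
        for (int p = 0; p < partitions; p++) {
            entries.add("{\"topic\":\"" + topic + "\",\"partition\":" + p
                    + ",\"replicas\":" + replicaList + "}");
        }
        return "{\"version\":1,\"partitions\":[" + entries + "]}";
    }

    public static void main(String[] args) {
        // Both partitions of adventuretube-data replicated onto brokers 1 and 2.
        System.out.println(build("adventuretube-data", 2, List.of(1, 2)));
    }
}
```

Redirect the output to `/tmp/reassign.json` and run the `--execute` and `--verify` commands as above.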
6. Troubleshooting
kafka2 joins but replication hangs (Isr missing broker 2)
Cause: INTERNAL advertised listener uses container hostname (kafka2:19092) instead of host IP. Brokers on different Docker hosts cannot resolve each other’s container hostnames.
Fix: Use host IPs for INTERNAL advertised listeners:
```
# kafka1 on PI1
KAFKA_ADVERTISED_LISTENERS: INTERNAL://192.168.1.199:19092,...

# kafka2 on PI2
KAFKA_ADVERTISED_LISTENERS: INTERNAL://192.168.1.105:19092,...
```
And expose port 19092 on both hosts.
Topic auto-recreates after deletion
Cause: A connected microservice or kafka-exporter triggers auto-topic-creation.
Fix: Use kafka-reassign-partitions to change replication factor in-place (see Section 5), or stop all consumers before deleting.
kafka2 logs show “Connection to controller refused”
Cause: Port 9093 not exposed on PI1, or KAFKA_CONTROLLER_QUORUM_VOTERS points to wrong IP.
Fix: Ensure PI1’s docker-compose exposes port 9093 and the voters config uses PI1’s IP.
