How do we set up a Kafka cluster? Looking for guidance on multi-broker clusters and ZooKeeper coordination (though newer versions replace ZooKeeper with KRaft).
Kafka Cluster Setup: Setting up a Kafka cluster involves choosing between ZooKeeper-based (legacy) and KRaft-based (ZooKeeper-less) architectures. Below is a structured guide for both approaches, including best practices and key configurations.
1. ZooKeeper-Based Cluster Setup
Core Components
- ZooKeeper Ensemble: Coordinates broker metadata and leader elections (minimum 3 nodes for production).
- Brokers: Kafka servers handling data storage and client requests.
Configuration Steps
- Set Up ZooKeeper:
  - Configure zoo.cfg with server entries:

        server.1=zk1:2888:3888
        server.2=zk2:2888:3888
        server.3=zk3:2888:3888

  - Create a myid file on each node (e.g., 1 for zk1).
- Configure Brokers:
  - Set a unique broker.id in server.properties:

        broker.id=1
        listeners=PLAINTEXT://broker1:9092
        zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
        log.dirs=/var/lib/kafka/logs

  - Repeat with an incremented broker.id for each node.
- Validate the Cluster:
  - Create a topic with a replication factor matching the broker count:

        kafka-topics --create --bootstrap-server broker1:9092 \
          --topic test --partitions 3 --replication-factor 3

  - Verify replication status with kafka-topics --describe.
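The ZooKeeper and broker configuration steps above repeat the same few lines per node, so they can be generated rather than typed by hand. The following is a minimal Python sketch under that assumption; the helper names `zoo_cfg_server_lines` and `broker_properties` are hypothetical, and the host names, ports, and paths simply mirror the examples in this answer.

```python
# Illustrative helpers (hypothetical names): render the ZooKeeper-mode
# settings shown above. Hosts, ports, and paths mirror this guide's examples.

def zoo_cfg_server_lines(zk_hosts, peer_port=2888, election_port=3888):
    """Return (myid, 'server.N=host:peer:election') pairs, numbered from 1."""
    return [
        (i, f"server.{i}={host}:{peer_port}:{election_port}")
        for i, host in enumerate(zk_hosts, start=1)
    ]


def broker_properties(broker_id, host, zk_hosts, zk_client_port=2181,
                      log_dirs="/var/lib/kafka/logs"):
    """Return server.properties text for one ZooKeeper-mode broker."""
    zk_connect = ",".join(f"{h}:{zk_client_port}" for h in zk_hosts)
    return "\n".join([
        f"broker.id={broker_id}",
        f"listeners=PLAINTEXT://{host}:9092",
        f"zookeeper.connect={zk_connect}",
        f"log.dirs={log_dirs}",
    ])


if __name__ == "__main__":
    zk_hosts = ["zk1", "zk2", "zk3"]
    # The same server.N lines go into zoo.cfg on every ZooKeeper node;
    # only the myid file differs per node.
    for myid, line in zoo_cfg_server_lines(zk_hosts):
        print(f"{line}    (myid file on this node contains {myid})")
    for broker_id, host in enumerate(["broker1", "broker2", "broker3"], start=1):
        print(f"--- {host} ---")
        print(broker_properties(broker_id, host, zk_hosts))
```

Generating the files from one host list keeps zookeeper.connect identical on every broker, which is the part most often mistyped when editing by hand.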
2. KRaft-Based Cluster Setup
Core Components
- Controllers: Manage metadata via the Raft protocol (minimum 3 nodes for quorum).
- Brokers: Handle data storage and client I/O.
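The "minimum 3 nodes for quorum" figure follows from majority voting: a Raft quorum stays available only while more than half of its voters are reachable. A small Python sketch of that arithmetic (the function names are illustrative, not part of any Kafka tooling):

```python
def majority(voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can fail while a majority is still reachable."""
    return voters - majority(voters)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"{n} controllers: majority={majority(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Three controllers tolerate one failure, and four still tolerate only one, which is why odd quorum sizes (3 or 5) are the usual choice.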
Configuration Steps
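- Generate a Cluster ID:
  - Run once for the whole cluster:

        kafka-storage.sh random-uuid   # Example output: ABCDEFGHIJKLMNOPQRSTUV

  - Once each node's properties file exists, format its log.dirs with that same ID before the first start:

        kafka-storage.sh format -t <cluster-id> -c <path-to-properties-file>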
- Configure Controllers:
  - Edit controller.properties:

        process.roles=controller
        node.id=1
        controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
        controller.listener.names=CONTROLLER
        listeners=CONTROLLER://controller1:9093
        log.dirs=/var/lib/kafka/controller-logs

  - Repeat for the other controllers, giving each a unique node.id and its own host in listeners.
- Configure Brokers:
  - Edit server.properties:

        process.roles=broker
        node.id=101
        controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
        controller.listener.names=CONTROLLER
        listeners=PLAINTEXT://broker1:9092
        log.dirs=/var/lib/kafka/broker-logs

  - Start the controllers and then the brokers with kafka-server-start, passing each node's properties file.
- Validate the Cluster:
  - Check metadata quorum health through a broker (on Kafka 3.7+, --bootstrap-controller controller1:9093 can query a controller directly instead):

        kafka-metadata-quorum --bootstrap-server broker1:9092 describe --status

  - Produce/consume test messages to verify functionality.
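In the KRaft steps above, every node must carry exactly the same controller.quorum.voters string, and broker node IDs must not collide with controller IDs. A minimal Python sketch that derives both kinds of properties file from one controller map may help avoid drift; the helper names are hypothetical, the hosts and paths mirror this answer's examples, and the sketch also emits controller.listener.names, which KRaft nodes need in order to know which listener the controller quorum uses.

```python
# Illustrative helpers (hypothetical names) for rendering KRaft properties.
CONTROLLER_PORT = 9093


def quorum_voters(controllers):
    """controllers maps node.id -> host; returns the quorum voters value."""
    return ",".join(
        f"{node_id}@{host}:{CONTROLLER_PORT}"
        for node_id, host in sorted(controllers.items())
    )


def controller_properties(node_id, controllers,
                          log_dirs="/var/lib/kafka/controller-logs"):
    """Properties text for one dedicated controller."""
    host = controllers[node_id]  # KeyError if node_id is not a voter
    return "\n".join([
        "process.roles=controller",
        f"node.id={node_id}",
        f"controller.quorum.voters={quorum_voters(controllers)}",
        "controller.listener.names=CONTROLLER",
        f"listeners=CONTROLLER://{host}:{CONTROLLER_PORT}",
        f"log.dirs={log_dirs}",
    ])


def broker_properties(node_id, host, controllers,
                      log_dirs="/var/lib/kafka/broker-logs"):
    """Properties text for one broker-only node."""
    if node_id in controllers:
        raise ValueError(f"node.id {node_id} is already used by a controller")
    return "\n".join([
        "process.roles=broker",
        f"node.id={node_id}",
        f"controller.quorum.voters={quorum_voters(controllers)}",
        "controller.listener.names=CONTROLLER",
        f"listeners=PLAINTEXT://{host}:9092",
        f"log.dirs={log_dirs}",
    ])


if __name__ == "__main__":
    controllers = {1: "controller1", 2: "controller2", 3: "controller3"}
    print(controller_properties(1, controllers))
    print()
    print(broker_properties(101, "broker1", controllers))
```

Keeping broker IDs in a separate range (101 and up here, as in the example above) makes an accidental ID collision with a controller easy to spot.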
3. Key Considerations
Factor | ZooKeeper-Based | KRaft-Based |
---|---|---|
Architecture | Requires external ZooKeeper ensemble | Self-contained; no external dependencies |
Scalability | Limited by ZooKeeper performance | Improved metadata handling and scalability |
Production Readiness | Deprecated since Kafka 3.5; removed in Kafka 4.0 | Production-ready since Kafka 3.3; recommended for new deployments |
Setup Complexity | Higher (dual-system management) | Simplified (single-system management) |
4. Best Practices
- KRaft for New Deployments: Use KRaft unless legacy dependencies require ZooKeeper.
- Controller Nodes: Deploy 3+ dedicated controllers in production (avoid combined roles).
- Network Configuration:
- Use separate listeners for internal (controller) and external (client) traffic.
- Enable TLS/SSL for inter-node communication.
- Monitoring: Track metrics like ActiveControllerCount and metadata lag (for example, LastAppliedRecordLagMs) for KRaft clusters.
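As a rough illustration of how ActiveControllerCount is usually interpreted: summed across all controllers, it should be exactly 1. The Python sketch below assumes the per-node readings have already been collected by whatever monitoring system is in use (collection itself is out of scope), and the function name is hypothetical.

```python
def check_active_controller(counts):
    """counts maps controller name -> its ActiveControllerCount reading.

    Returns a short status string describing the cluster-wide total.
    """
    total = sum(counts.values())
    if total == 1:
        return "ok"
    if total == 0:
        return "no active controller"
    return "multiple active controllers"


if __name__ == "__main__":
    readings = {"controller1": 0, "controller2": 1, "controller3": 0}
    print(check_active_controller(readings))
```

A total of 0 that persists beyond a brief leader election, or any total above 1, is worth alerting on.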
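For migrations from ZooKeeper to KRaft, follow Kafka’s phased migration procedure: back up data, rehearse the migration in staging, and monitor the cluster while it is in the dual-write state. Note that there is no in-place switch between modes; a ZooKeeper-based cluster cannot simply be restarted in KRaft mode.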