How can we set up a Kafka cluster? Guidance on multi-broker clusters and ZooKeeper coordination (noting that newer Kafka versions replace ZooKeeper with KRaft).



Kafka Cluster Setup: Setting up a Kafka cluster involves choosing between ZooKeeper-based (legacy) and KRaft-based (ZooKeeper-less) architectures. Below is a structured guide for both approaches, including best practices and key configurations.


1. ZooKeeper-Based Cluster Setup

Core Components

  • ZooKeeper Ensemble: Coordinates broker metadata and leader elections (minimum 3 nodes for production).
  • Brokers: Kafka servers handling data storage and client requests.

Configuration Steps

  1. Set Up ZooKeeper:
    • Configure zoo.cfg with server entries:
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
    • Create a myid file in each node's ZooKeeper data directory containing its server number (e.g., 1 on zk1).
  2. Configure Brokers:
    • Set unique broker.id in server.properties:
broker.id=1
listeners=PLAINTEXT://broker1:9092
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
log.dirs=/var/lib/kafka/logs
    • Repeat with incremental broker.id for each node.
  3. Validate the Cluster:
    • Create a topic with replication factor matching broker count:
kafka-topics --create --bootstrap-server broker1:9092 \
  --topic test --partitions 3 --replication-factor 3
    • Verify replication status with kafka-topics --describe (a consolidated end-to-end sketch follows this list).
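Tying these steps together, a minimal end-to-end sketch is shown below. It assumes the Apache Kafka tarball layout (bin/ and config/ directories), a ZooKeeper data directory of /var/lib/zookeeper, and the example hostnames above; adjust paths and hosts for your environment.

  # On each ZooKeeper node: write its server number to myid (1 on zk1, 2 on zk2, 3 on zk3)
  echo "1" > /var/lib/zookeeper/myid
  # Bundled ZooKeeper (use zkServer.sh start for a standalone ZooKeeper install)
  bin/zookeeper-server-start.sh -daemon config/zookeeper.properties

  # On each broker, after editing server.properties:
  bin/kafka-server-start.sh -daemon config/server.properties

  # From any client host: create a replicated topic and check partition placement
  bin/kafka-topics.sh --create --bootstrap-server broker1:9092 \
    --topic test --partitions 3 --replication-factor 3
  bin/kafka-topics.sh --describe --bootstrap-server broker1:9092 --topic test

  # Quick produce/consume round trip
  echo "hello" | bin/kafka-console-producer.sh --bootstrap-server broker1:9092 --topic test
  bin/kafka-console-consumer.sh --bootstrap-server broker1:9092 --topic test --from-beginning --max-messages 1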

2. KRaft-Based Cluster Setup

Core Components

  • Controllers: Manage metadata via the Raft protocol (minimum 3 nodes for quorum).
  • Brokers: Handle data storage and client I/O.

Configuration Steps

  1. Generate a Cluster ID:
    kafka-storage.sh random-uuid
    # Output: ABCDEFGHIJKLMNOPQRSTUV
  2. Configure Controllers:
    • Edit controller.properties:
      process.roles=controller
      node.id=1
      controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
      controller.listener.names=CONTROLLER
      listeners=CONTROLLER://controller1:9093
      log.dirs=/var/lib/kafka/controller-logs
    • Repeat for the other controllers with a unique node.id and each controller's own hostname in listeners.
  3. Configure Brokers:
    • Edit server.properties:
      process.roles=broker
      node.id=101
      controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
      controller.listener.names=CONTROLLER
      listeners=PLAINTEXT://broker1:9092
      log.dirs=/var/lib/kafka/broker-logs
    • Format the storage directory on every controller and broker with the generated cluster ID (kafka-storage format), then start each node with kafka-server-start.
  4. Validate the Cluster:
    • Check metadata quorum health:
      kafka-metadata-quorum --bootstrap-server broker1:9092 describe --status
    • Produce and consume test messages to verify functionality (a consolidated bring-up sketch follows this list).
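Putting the KRaft steps together, the bring-up sequence is sketched below. It assumes the Apache Kafka tarball layout and the example hostnames used above (exact property-file paths such as config/kraft/... vary by version); the key point is that every controller and broker must format its storage directory with the same cluster ID before first start.

  # Once: generate the cluster ID and distribute the same value to every node
  CLUSTER_ID=$(bin/kafka-storage.sh random-uuid)

  # On each controller node:
  bin/kafka-storage.sh format -t "$CLUSTER_ID" -c config/kraft/controller.properties
  bin/kafka-server-start.sh -daemon config/kraft/controller.properties

  # On each broker node:
  bin/kafka-storage.sh format -t "$CLUSTER_ID" -c config/kraft/server.properties
  bin/kafka-server-start.sh -daemon config/kraft/server.properties

  # Verify quorum health from a broker
  bin/kafka-metadata-quorum.sh --bootstrap-server broker1:9092 describe --status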

3. Key Considerations

  • Architecture: ZooKeeper-based clusters require an external ZooKeeper ensemble; KRaft-based clusters are self-contained with no external dependencies.
  • Scalability: ZooKeeper-based clusters are constrained by ZooKeeper's metadata performance; KRaft improves metadata handling and scalability.
  • Production Readiness: ZooKeeper mode is deprecated (since Kafka 3.5) and removed in Kafka 4.0; KRaft is recommended for new deployments and production-ready since Kafka 3.3.
  • Setup Complexity: ZooKeeper-based setups are more complex (two systems to manage); KRaft simplifies operations to a single system.

4. Best Practices

  • KRaft for New Deployments: Use KRaft unless legacy dependencies require ZooKeeper.
  • Controller Nodes: Deploy 3+ dedicated controllers in production (avoid combining the broker and controller roles on the same node).
  • Network Configuration:
    • Use separate listeners for internal (controller) and external (client) traffic.
    • Enable TLS/SSL for inter-node communication (example listener and TLS settings follow this list).
  • Monitoring: Track metrics such as ActiveControllerCount and the brokers' metadata replication lag for KRaft clusters.
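To illustrate the listener-separation and TLS advice above, a broker's server.properties might contain settings along the following lines; the listener names, ports, and keystore paths are placeholders, not required values.

  # Separate listeners: INTERNAL for inter-broker traffic, EXTERNAL for clients
  listeners=INTERNAL://broker1:9094,EXTERNAL://broker1:9092
  advertised.listeners=INTERNAL://broker1:9094,EXTERNAL://broker1.example.com:9092
  listener.security.protocol.map=INTERNAL:SSL,EXTERNAL:SSL
  inter.broker.listener.name=INTERNAL
  # In KRaft mode, also map the controller listener, e.g. add CONTROLLER:SSL to the map above

  # TLS material for encrypted inter-node and client connections (paths and passwords are placeholders)
  ssl.keystore.location=/etc/kafka/ssl/broker1.keystore.jks
  ssl.keystore.password=changeit
  ssl.truststore.location=/etc/kafka/ssl/truststore.jks
  ssl.truststore.password=changeit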

For migrations from ZooKeeper to KRaft, follow Kafka's phased approach (back up data, test in staging, and monitor the dual-write migration state). Note that simply switching a cluster's configuration between modes is not supported; the documented migration procedure must be used, and it must be completed before upgrading to Kafka 4.0, where ZooKeeper mode is removed.
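For reference, the ZooKeeper-to-KRaft migration (KIP-866) is driven largely by configuration. The snippet below is a hedged sketch of the kind of settings involved, not a complete runbook; property names and required steps should be verified against the documentation for your exact Kafka version.

  # On the new KRaft controllers, started in migration mode (assumption-level sketch):
  zookeeper.metadata.migration.enable=true
  zookeeper.connect=zk1:2181,zk2:2181,zk3:2181

  # On each existing ZooKeeper-mode broker being migrated:
  zookeeper.metadata.migration.enable=true
  controller.quorum.voters=1@controller1:9093,2@controller2:9093,3@controller3:9093
  controller.listener.names=CONTROLLER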
