Core Concepts: Topics, Partitions & Brokers
Apache Kafka: Core Concepts
Apache Kafka is a distributed event streaming platform. Producers publish records to topics; consumers read from topics. Kafka stores records durably on disk with configurable retention. Throughput scales horizontally via partitions.
Architecture
Core components:
Broker
- A single Kafka server
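- A production cluster typically runs 3 or more brokers (for replication and high availability)
- One broker acts as the active controller (manages partition leadership); in KRaft mode this role belongs to a quorum of controller nodes, one of which is active
- KRaft mode: replaces ZooKeeper with built-in Raft consensus (early access in Kafka 2.8, production-ready from 3.3)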
Topic
- Named stream of records (like a DB table or log file category)
- Partitioned for parallelism
- Retained for a configurable time (default: 7 days) regardless of consumption
Partition
- Ordered, immutable sequence of records
- Each partition has one leader replica on one broker and, with replication factor N, N-1 follower replicas on other brokers
- Records get an incrementing offset within the partition
- Ordering guaranteed within a partition, NOT across partitions
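The partition rules above can be illustrated with a small in-memory model. This is only a sketch: Topic and partition_for are hypothetical names, and zlib.crc32 stands in for the murmur2 hash that Kafka's default partitioner applies to keyed records.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index (crc32 is a stand-in for Kafka's murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """Toy topic: one append-only list per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Append a record and return (partition, offset)."""
        partition = partition_for(key, len(self.partitions))
        log = self.partitions[partition]
        log.append((key, value))
        # The offset is the record's position within its own partition.
        return partition, len(log) - 1


orders = Topic(num_partitions=6)
print(orders.append("user-123", {"orderId": 1}))
print(orders.append("user-123", {"orderId": 2}))
```

Records with the same key land in the same partition and receive consecutive offsets there, which is why per-key ordering holds even though no ordering exists across partitions.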
Replication
- Replication factor: how many copies (typical: 3)
- In-Sync Replicas (ISR): replicas that are caught up with the leader
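- By default the leader handles all reads and writes; followers replicate asynchronously by fetching from the leader (since Kafka 2.4, consumers can optionally fetch from followers)
- If the leader fails, a new leader is elected from the ISR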
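Leader failover from the ISR can be sketched as follows. PartitionState and elect_leader are hypothetical names, and the model is deliberately simplified: it picks the first surviving replica in assignment order that is still in sync, and ignores details of the real controller.

```python
class PartitionState:
    """Toy view of one partition's replica set."""

    def __init__(self, replicas, isr, leader):
        self.replicas = list(replicas)  # broker ids in assignment order
        self.isr = set(isr)             # brokers currently in sync
        self.leader = leader            # broker id of the current leader


def elect_leader(state, failed_broker):
    """Drop the failed broker from the ISR and pick a new leader if needed."""
    state.isr.discard(failed_broker)
    if state.leader == failed_broker:
        candidates = [b for b in state.replicas if b in state.isr]
        # None models a partition with no in-sync replica left (offline).
        state.leader = candidates[0] if candidates else None
    return state.leader


state = PartitionState(replicas=[1, 2, 3], isr={1, 2}, leader=1)
print(elect_leader(state, failed_broker=1))
```

Broker 3 is a replica but not in the ISR, so it is never chosen here; only caught-up replicas are eligible.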
ZooKeeper (legacy) / KRaft (modern)
- Manages broker metadata, controller election
- KRaft: Kafka 3.3+ can run without ZooKeeper (use KRaft in production from 3.3+)
CLI: Topics & Messages
# Start (Docker Compose — easiest for dev)
# docker-compose.yml with confluentinc/cp-kafka or apache/kafka
# Create topic
kafka-topics.sh --create --bootstrap-server localhost:9092 --topic orders --partitions 6 --replication-factor 3 --config retention.ms=604800000 # 7 days
# List topics
kafka-topics.sh --list --bootstrap-server localhost:9092
# Describe topic (partition leaders, ISR)
kafka-topics.sh --describe --topic orders --bootstrap-server localhost:9092
# Delete topic
kafka-topics.sh --delete --topic orders --bootstrap-server localhost:9092
# Alter topic (add partitions: the count can only increase, never decrease; keyed records may map to different partitions afterwards)
kafka-topics.sh --alter --topic orders --partitions 12 --bootstrap-server localhost:9092
# Produce messages (CLI)
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic orders --property key.separator=: --property parse.key=true
# Type: user-123:{"orderId": 1, "amount": 99.99}
# Consume messages (CLI)
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic orders --from-beginning --property print.key=true --property print.timestamp=true
# Consumer group CLI
kafka-consumer-groups.sh --list --bootstrap-server localhost:9092
kafka-consumer-groups.sh --describe --group my-group --bootstrap-server localhost:9092
# Reset offsets (the group must have no active members)
kafka-consumer-groups.sh --reset-offsets --group my-group --topic orders --to-earliest --execute --bootstrap-server localhost:9092
Key Concepts
Offset: unique ID of a record within a partition. Consumers commit offsets to track progress.
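Committed offsets are also what consumer lag is computed from. A minimal sketch, assuming the committed offset is the next offset the group will read and that a partition with no commit starts at offset 0; consumer_lag is a hypothetical helper, not a Kafka API:

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: log end offset minus committed offset."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    }


# Partition 0 has 10 unread records; partition 1 is fully caught up.
print(consumer_lag({0: 100, 1: 50}, {0: 90, 1: 50}))
```

The same three quantities appear as the CURRENT-OFFSET, LOG-END-OFFSET and LAG columns of kafka-consumer-groups.sh --describe.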
Consumer group: set of consumers that share the work of reading a topic. Each partition is assigned to exactly one consumer in the group, so consumers beyond the partition count sit idle.
Throughput scales with partitions: more partitions means more consumer instances can work in parallel.
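A round-robin assignment makes the partition-to-consumer relationship concrete. This is a sketch only; assign_round_robin is a hypothetical helper, and Kafka's built-in assignors are more involved.

```python
def assign_round_robin(partitions, consumers):
    """Give each partition to exactly one consumer, cycling through the group."""
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


# Six partitions, three consumers: each consumer owns two partitions.
print(assign_round_robin(range(6), ["c1", "c2", "c3"]))
# Two partitions, three consumers: one consumer receives nothing.
print(assign_round_robin(range(2), ["c1", "c2", "c3"]))
```

The second call shows why the partition count caps parallelism within a group.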
Log compaction: instead of time-based retention, keep at least the latest record per key (set cleanup.policy=compact on the topic; useful for state/changelog topics).
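The effect of compaction on a single partition can be sketched in a few lines. compact is a hypothetical helper operating on (offset, key, value) tuples; the real log cleaner works on segments in the background and is not modeled here.

```python
def compact(log):
    """Keep only the latest record per key, preserving the original offsets."""
    latest_offset = {}
    for offset, key, _value in log:
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [record for record in log if latest_offset[record[1]] == record[0]]


log = [(0, "user-1", "a"), (1, "user-2", "b"), (2, "user-1", "c")]
print(compact(log))
```

Surviving records keep their original offsets, so offsets in a compacted partition can have gaps.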
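Producer acks: acks=0 (fire-and-forget), acks=1 (leader confirmed), acks=all (all in-sync replicas confirmed; strongest, and the default since Kafka 3.0). Pair acks=all with min.insync.replicas so that a write fails rather than succeeding against a shrunken ISR.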
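The three acks levels can be compared with a small model of how many replicas must hold a record before the producer sees success. required_acks is a hypothetical function; min_insync_replicas mirrors the min.insync.replicas setting, which is an addition here and applies only to acks=all.

```python
def required_acks(acks, isr_size, min_insync_replicas=1):
    """Return how many replicas must hold the record before success is reported."""
    if acks == "0":
        return 0  # producer does not wait for any broker response
    if acks == "1":
        return 1  # leader only
    if acks == "all":
        if isr_size < min_insync_replicas:
            # Models the broker rejecting the write when the ISR is too small.
            raise RuntimeError("not enough in-sync replicas")
        return isr_size
    raise ValueError("unknown acks value: " + repr(acks))


print(required_acks("all", isr_size=3, min_insync_replicas=2))
```

With acks=all but min_insync_replicas left at 1, an ISR that has shrunk to the leader alone still accepts writes, which is why the two settings are usually tuned together.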
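Exactly-once semantics (EOS): enable.idempotence=true on the producer prevents duplicates caused by retries; combine it with the transactional API (set transactional.id) and isolation.level=read_committed on consumers for end-to-end exactly-once processing.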
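Idempotent producing rests on the broker recognising retried sends. A simplified sketch, assuming each producer tags records with a producer id and an increasing sequence number; PartitionLog is a hypothetical class and omits epochs, batching and out-of-order detection.

```python
class PartitionLog:
    """Toy partition that drops retried records it has already appended."""

    def __init__(self):
        self.records = []
        self.last_sequence = {}  # producer id -> highest sequence appended

    def append(self, producer_id, sequence, value):
        """Append the record unless this sequence number was already written."""
        if sequence <= self.last_sequence.get(producer_id, -1):
            return False  # duplicate from a retry: acknowledge but do not store
        self.last_sequence[producer_id] = sequence
        self.records.append(value)
        return True


log = PartitionLog()
log.append(producer_id=7, sequence=0, value="order-1")
log.append(producer_id=7, sequence=0, value="order-1")  # retry of the same send
print(log.records)
```

The retry is absorbed, so the record is stored once even though it was sent twice.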
KIP-500 / KRaft: from Kafka 3.3, KRaft mode is production-ready and ZooKeeper is no longer required; use KRaft for new deployments. Kafka 4.0 removes ZooKeeper mode entirely.