Key Concepts
Consumer Group
A logical grouping of consumers that work together to consume messages from topics. Each partition is assigned to one consumer in the group.
Offset
A position marker indicating which messages have been consumed. Kafka stores offsets per partition per consumer group.
Lag
The difference between the latest message in a partition and the last consumed message. High lag indicates consumers are falling behind.
Coordinator
A broker responsible for managing the consumer group’s membership and partition assignments.
Required Permissions
| Action | Permission |
|---|---|
| View consumer groups | iam:project:infrastructure:kafka:read |
| Reset offsets | iam:project:infrastructure:kafka:write |
| Delete consumer groups | iam:project:infrastructure:kafka:delete |
Consumer Group States
| State | Description |
|---|---|
| Stable | Group is active with assigned partitions. Members are consuming normally. |
| Empty | Group exists but has no active members. Offsets are still retained. |
| PreparingRebalance | Group is preparing to reassign partitions due to member changes. |
| CompletingRebalance | Group is finalizing partition assignments after rebalance. |
| Dead | Group has been deleted or expired due to inactivity. |
Groups in Stable state cannot be deleted or have their offsets reset. Stop all consumers first.
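The rule above can be expressed as a small pre-flight check. This is only a sketch: the function name is hypothetical, and it makes the conservative assumption that a group is safe to modify only when it has no active members (Empty or Dead), so groups that are mid-rebalance are rejected as well.

```python
# States listed in the Consumer Group States table.
KNOWN_STATES = {
    "Stable",
    "Empty",
    "PreparingRebalance",
    "CompletingRebalance",
    "Dead",
}

# Assumption: only groups without active members may be deleted or reset.
INACTIVE_STATES = {"Empty", "Dead"}


def can_modify_group(state: str) -> bool:
    """Return True if a group in `state` may be deleted or have offsets reset."""
    if state not in KNOWN_STATES:
        raise ValueError(f"unknown consumer group state: {state!r}")
    return state in INACTIVE_STATES
```

Calling `can_modify_group("Stable")` returns `False`, which is the cue to stop all consumers and wait for the group to become Empty.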
How to View Consumer Groups
Filter by State
Use the state filter to show only groups in specific states (Stable, Empty, Rebalancing, Dead).
How to View Consumer Group Details
The detail page provides three tabs: Overview, Members, and Offsets.
Overview Tab
Shows subscribed topics and general group information:
- Group ID
- Current state
- Member count
- Coordinator broker
Members Tab
Lists all active consumers in the group:
- Client ID: Application identifier
- Client Host: IP address or hostname
- Member ID: Unique member identifier assigned by Kafka
- Assigned Partitions: Topic:partition pairs assigned to this member
Offsets Tab
Shows consumption progress for each partition:
- Current Offset: Last committed offset
- Log End Offset: Latest message position in the partition
- Lag: Messages waiting to be consumed
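The three columns are related by simple arithmetic: lag is the log end offset minus the current offset. The sketch below illustrates that with made-up values; the class and field names are hypothetical, and clamping negative results to zero is an assumption rather than documented behavior of the Offsets tab.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PartitionOffsets:
    topic: str
    partition: int
    current_offset: int  # last committed offset
    log_end_offset: int  # latest message position in the partition

    @property
    def lag(self) -> int:
        # Assumption: clamp at zero so a stale or recreated topic
        # never reports negative lag.
        return max(self.log_end_offset - self.current_offset, 0)


def total_lag(rows: Iterable[PartitionOffsets]) -> int:
    """Sum the per-partition lag for a whole consumer group."""
    return sum(row.lag for row in rows)


# Illustrative values only.
rows = [
    PartitionOffsets("orders", 0, current_offset=1500, log_end_offset=1500),
    PartitionOffsets("orders", 1, current_offset=900, log_end_offset=1250),
    PartitionOffsets("orders", 2, current_offset=0, log_end_offset=40),
]
```

With these values the per-partition lags are 0, 350, and 40, so `total_lag(rows)` is 390.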
Understanding Lag
Lag indicates how far behind a consumer group is from the latest messages.
| Lag Level | Meaning | Action |
|---|---|---|
| 0-1,000 | Healthy | Consumers are keeping up |
| 1,001-10,000 | Warning | Monitor closely, may need attention |
| > 10,000 | Critical | Consumers falling behind significantly |
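For alerting or dashboards, the thresholds in the table can be applied with a small helper. This is a sketch with a hypothetical function name; it treats a lag of exactly 1,000 as still Healthy and exactly 10,000 as still Warning, which is an assumption about how the boundary values are meant to be read.

```python
def classify_lag(lag: int) -> str:
    """Map a per-partition lag value to the level used in the table."""
    if lag < 0:
        raise ValueError("lag cannot be negative")
    if lag <= 1_000:  # assumption: 1,000 itself still counts as Healthy
        return "Healthy"
    if lag <= 10_000:  # "> 10,000" is Critical, so 10,000 is still Warning
        return "Warning"
    return "Critical"
```

The thresholds are absolute message counts, so they are best tuned per topic: 10,000 messages may be seconds of traffic on a busy topic and hours on a quiet one.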
Common Causes of High Lag
- Consumer processing is too slow
- Not enough consumers for the number of partitions
- Consumer application errors causing reprocessing
- Network issues between consumers and brokers
- Unbalanced partition distribution
How to Reset Consumer Offsets
Resetting offsets allows you to reprocess messages or skip ahead.
Choose Reset Method
Select how to reset offsets:
- Earliest: Reprocess all available messages from the beginning
- Latest: Skip to the end, only consume new messages
- Timestamp: Reset to a specific point in time
- Offset: Reset to a specific offset number
Reset Types
| Type | Use Case |
|---|---|
| Earliest | Reprocess all data (disaster recovery, data replay) |
| Latest | Skip backlog, only process new messages |
| Timestamp | Replay from a specific point in time |
| Offset | Precise control for specific partition offsets |
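To make the four reset types concrete, the sketch below computes the offset a single partition would land on for each of them. It is a simplified model, not the product's implementation: the function and parameter names are hypothetical, the timestamp case assumes the reset goes to the first record whose timestamp is at or after the requested time (falling back to the log end when there is none), and the offset case assumes out-of-range values are clamped to the available range.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def resolve_reset_offset(
    method: str,
    earliest: int,
    log_end: int,
    timestamps_ms: Sequence[int] = (),
    target_timestamp_ms: Optional[int] = None,
    target_offset: Optional[int] = None,
) -> int:
    """Return the offset one partition would be reset to.

    timestamps_ms[i] is the timestamp of the record at offset earliest + i
    and is assumed to be sorted in non-decreasing order.
    """
    if method == "earliest":
        return earliest
    if method == "latest":
        return log_end
    if method == "timestamp":
        if target_timestamp_ms is None:
            raise ValueError("timestamp reset needs target_timestamp_ms")
        # Index of the first record at or after the requested time;
        # equals len(timestamps_ms) when no such record exists.
        return earliest + bisect_left(timestamps_ms, target_timestamp_ms)
    if method == "offset":
        if target_offset is None:
            raise ValueError("offset reset needs target_offset")
        # Assumption: clamp to the range that still exists in the log.
        return min(max(target_offset, earliest), log_end)
    raise ValueError(f"unknown reset method: {method!r}")
```

For a partition holding offsets 100 to 104 (log end 105) with record timestamps `[1000, 2000, 3000, 4000, 5000]`, a timestamp reset to 2500 resolves to offset 102, and an offset reset to 50 is clamped to 100.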
How to Delete a Consumer Group
Troubleshooting
Consumer group stuck in rebalancing
- A consumer may be taking too long to process messages
- Check for consumer application errors or crashes
- Increase `session.timeout.ms` and `heartbeat.interval.ms`
- Reduce `max.poll.records` if processing is slow
- Check network connectivity between consumers and brokers
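When raising the two timeouts, it helps to keep them in proportion: Kafka's consumer configuration documentation recommends setting `heartbeat.interval.ms` no higher than one third of `session.timeout.ms`. The helper below is a hypothetical sanity check for a pair of values; it does not talk to a broker.

```python
from typing import List


def check_timeouts(session_timeout_ms: int, heartbeat_interval_ms: int) -> List[str]:
    """Return a list of warnings for a session/heartbeat timeout pair."""
    problems: List[str] = []
    if heartbeat_interval_ms >= session_timeout_ms:
        problems.append(
            "heartbeat.interval.ms must be lower than session.timeout.ms"
        )
    elif heartbeat_interval_ms * 3 > session_timeout_ms:
        problems.append(
            "heartbeat.interval.ms should be at most one third of "
            "session.timeout.ms"
        )
    return problems
```

For example, `check_timeouts(45000, 3000)` returns an empty list, while `check_timeouts(10000, 5000)` returns one warning.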
Consumer lag keeps increasing
- Add more consumer instances (up to partition count)
- Optimize message processing in your application
- Increase partition count to enable more parallelism
- Check if consumers are committing offsets correctly
- Verify no consumer application errors
Cannot delete consumer group
- Consumer group must be Empty or Dead, not Stable
- Stop all consumer applications first
- Wait for the session timeout to expire (`session.timeout.ms`; default 45 seconds in Kafka 3.0 and later, 10 seconds in earlier versions)
- Verify no lingering consumer connections
Cannot reset offsets
- Consumer group must not be in Stable state
- Stop all consumers before resetting
- You need write permission
- Verify the topic and partitions exist
Offsets show as 0 or negative
- Consumer group may not have committed offsets yet
- Consumer may be using manual offset management
- Topic may have been recreated
- Check if `enable.auto.commit` is configured correctly
Consumer group appears and disappears
- Consumers may be using dynamic group membership
- Short `session.timeout.ms` causes frequent disconnects
- Application may be crashing and restarting
- Check for resource constraints (memory, CPU) on consumer hosts
Partitions not evenly distributed
- Default partition assignment strategies may not balance evenly
- Consider using `StickyAssignor` or `CooperativeStickyAssignor`
- Ensure consumer count doesn’t exceed partition count
- Check if some consumers have different subscription patterns
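One way to see why an assignment strategy can leave partitions unevenly distributed is to simulate it. The sketch below compares a range-style strategy, which splits each topic separately, with a round-robin strategy, which deals out all partitions in turn. Both functions are simplified models written for illustration, not Kafka's assignor code, and they assume every consumer subscribes to every topic.

```python
from typing import Dict, List, Tuple

Assignment = Dict[str, List[Tuple[str, int]]]


def range_assign(consumers: List[str], topics: Dict[str, int]) -> Assignment:
    """Split each topic into contiguous ranges; earlier consumers get extras."""
    ordered = sorted(consumers)
    result: Assignment = {c: [] for c in ordered}
    for topic in sorted(topics):
        base, extra = divmod(topics[topic], len(ordered))
        start = 0
        for i, consumer in enumerate(ordered):
            size = base + (1 if i < extra else 0)
            result[consumer].extend((topic, p) for p in range(start, start + size))
            start += size
    return result


def round_robin_assign(consumers: List[str], topics: Dict[str, int]) -> Assignment:
    """Deal all partitions of all topics out to consumers one at a time."""
    ordered = sorted(consumers)
    result: Assignment = {c: [] for c in ordered}
    partitions = [(t, p) for t in sorted(topics) for p in range(topics[t])]
    for i, tp in enumerate(partitions):
        result[ordered[i % len(ordered)]].append(tp)
    return result


# Two topics with three partitions each, shared by two consumers.
topics = {"orders": 3, "payments": 3}
by_range = range_assign(["c0", "c1"], topics)
by_round_robin = round_robin_assign(["c0", "c1"], topics)
```

In this scenario the range-style split gives `c0` four partitions and `c1` two, because `c0` receives the extra partition of every topic, while round-robin gives each consumer three.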
FAQ
How many consumers should I have?
Maximum useful consumers equals the number of partitions across all subscribed topics. Extra consumers remain idle. For high availability, use N-1 consumers where N is partition count, so one consumer can handle failover.
What happens when a consumer crashes?
After `session.timeout.ms` (default 45 seconds in Kafka 3.0 and later, 10 seconds in earlier versions), the coordinator marks the consumer as dead and triggers a rebalance. Partitions are reassigned to the remaining consumers. Some messages may be reprocessed, depending on when offsets were last committed.
Should I use auto-commit or manual commit?
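Auto-commit is simpler, but it commits on a timer rather than when processing finishes. If the consumer crashes after processing but before the next commit, those messages are reprocessed; if offsets are committed before processing completes, messages can be lost. Manual commit gives precise control but requires careful implementation. Committing manually after processing gives at-least-once delivery; exactly-once semantics additionally require idempotent processing or Kafka transactions.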
What is a rebalance and why does it happen?
A rebalance redistributes partitions among consumers. It happens when: (1) Consumer joins or leaves, (2) Consumer fails heartbeat, (3) Topic partition count changes, (4) Subscription changes. During rebalance, consumption pauses briefly.
How long are offsets retained?
Controlled by the broker config `offsets.retention.minutes` (default 7 days). If a consumer group is inactive for longer than this period, all of its offsets are deleted. Consumers then restart from the position set by `auto.offset.reset`.
Can I have multiple consumer groups reading the same topic?
Yes. Each consumer group maintains its own offsets independently. This is useful for different applications processing the same data stream, or for fan-out patterns where multiple systems need the same messages.
What's the difference between lag and consumer lag?
Lag typically refers to the offset difference (messages behind). Consumer lag may also refer to time-based lag (how old the unprocessed messages are). Both indicate processing delays but from different perspectives.
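Time-based lag can be estimated from the timestamp of the oldest message the group has not yet processed. The helper below is a hypothetical sketch; how that timestamp is obtained (for example by reading the record at the committed offset) is left out and depends on your client.

```python
from typing import Optional


def time_lag_ms(now_ms: int, oldest_unprocessed_ts_ms: Optional[int]) -> int:
    """Age in milliseconds of the oldest message not yet processed.

    Pass None when the group is fully caught up on the partition.
    """
    if oldest_unprocessed_ts_ms is None:
        return 0
    # Clamp at zero in case producer and monitor clocks disagree.
    return max(now_ms - oldest_unprocessed_ts_ms, 0)
```

Two partitions with the same offset lag can have very different time lag: 100 pending messages may be two seconds old on a busy topic and an hour old on a quiet one.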
How do I handle message reprocessing after offset reset?
Implement idempotent processing: (1) Use unique message IDs to detect duplicates, (2) Use database transactions with deduplication, (3) Store processed message IDs in a cache or database, (4) Design operations to be safe when repeated.
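As an illustration of points (1) and (3), the sketch below wraps a message handler so that a message ID is only processed once. The class name and the `"id"` field are hypothetical, and the in-memory set stands in for the cache or database a real deployment would use so that the record of processed IDs survives restarts.

```python
from typing import Callable, Set


class IdempotentProcessor:
    """Skip messages whose ID has already been handled successfully."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        # Stand-in for a persistent store such as a cache or database table.
        self._seen: Set[str] = set()

    def process(self, message: dict) -> bool:
        """Handle the message once; return False if it was a duplicate."""
        message_id = message["id"]  # assumes each message carries a unique ID
        if message_id in self._seen:
            return False
        self._handler(message)
        # Record the ID only after the handler succeeds, so a failed
        # attempt is retried instead of being silently skipped.
        self._seen.add(message_id)
        return True
```

After an offset reset, replayed messages pass through `process` again, but the handler only runs for IDs it has not seen before.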
Best Practices
Consumer Configuration
Monitoring Strategy
Monitor these metrics continuously:
- Consumer lag per partition (should be stable or decreasing)
- Rebalance frequency (frequent rebalances indicate issues)
- Commit rate (should match processing rate)
- Consumer group state (should be Stable during normal operation)
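The "stable or decreasing" check on lag can be automated by comparing samples over a window. The function below is a deliberately simple sketch with a hypothetical name: it only compares the first and last sample, and the `tolerance` parameter is an assumption added to ignore small fluctuations.

```python
from typing import Sequence


def lag_trend(samples: Sequence[int], tolerance: int = 0) -> str:
    """Classify lag samples (oldest first) as increasing, decreasing, or stable."""
    if len(samples) < 2:
        return "unknown"
    change = samples[-1] - samples[0]
    if change > tolerance:
        return "increasing"
    if change < -tolerance:
        return "decreasing"
    return "stable"
```

A result of `"increasing"` across several consecutive windows is a stronger signal than any single lag value, because it shows consumers are not catching up.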
High Availability
- Run at least 2 consumers per group for failover
- Don’t exceed partition count with consumers
- Use rack-aware consumer assignment in multi-datacenter setups
- Configure appropriate timeouts for your network environment
Offset Management
- Commit offsets after successful processing, not before
- Use synchronous commits for critical data
- Consider storing offsets externally for exactly-once semantics
- Implement dead letter queues for failed messages
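The first point, committing after processing rather than before, is easiest to see in a small simulation. The sketch below uses a plain list as a stand-in for a partition and a hypothetical `run_consumer` function that "crashes" between the two steps at a chosen offset; it does not use a Kafka client.

```python
from typing import List, Optional


def run_consumer(
    log: List[str],
    committed: int,
    processed: List[str],
    commit_before_processing: bool,
    crash_at: Optional[int] = None,
) -> int:
    """Consume `log` starting at `committed`; return the committed offset.

    If `crash_at` is set, the consumer stops at that offset after the
    first of its two steps (commit or process) and before the second.
    """
    offset = committed
    while offset < len(log):
        if commit_before_processing:
            committed = offset + 1  # step 1: commit
        else:
            processed.append(log[offset])  # step 1: process
        if offset == crash_at:
            return committed  # simulated crash between the two steps
        if commit_before_processing:
            processed.append(log[offset])  # step 2: process
        else:
            committed = offset + 1  # step 2: commit
        offset += 1
    return committed


log = ["a", "b", "c"]

# Commit after processing: crash at offset 1, then restart.
after: List[str] = []
resume = run_consumer(log, 0, after, commit_before_processing=False, crash_at=1)
run_consumer(log, resume, after, commit_before_processing=False)

# Commit before processing: same crash, then restart.
before: List[str] = []
resume = run_consumer(log, 0, before, commit_before_processing=True, crash_at=1)
run_consumer(log, resume, before, commit_before_processing=True)
```

With commit-after-processing, `after` ends up as `["a", "b", "b", "c"]`: message `b` is handled twice but nothing is lost. With commit-before-processing, `before` ends up as `["a", "c"]`: message `b` is never handled. Duplicates can be absorbed by idempotent processing; lost messages cannot be recovered without an offset reset.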