2026-02-23 · 2 min read
ClickHouse Replication Lag: How to Diagnose & Fix
Replication lag in ClickHouse measures how far a replica is behind the other replicas of its shard. You can check it with a single query against system.replicas. As a rule of thumb, lag under 30 seconds is healthy and lag over 5 minutes indicates a serious problem. The usual first step is SYSTEM SYNC REPLICA database.table, which pulls pending entries into the replica's queue and waits until they have been processed.
Detecting Replication Lag
-- Current lag across all replicated tables
SELECT
database,
table,
replica_name,
absolute_delay AS lag_seconds,
queue_size,
inserts_in_queue,
merges_in_queue
FROM system.replicas
WHERE absolute_delay > 0
ORDER BY absolute_delay DESC;
Common Causes
1. Network congestion — High network utilization between ClickHouse nodes slows replication throughput.
2. Heavy merge activity — Large background merges consume disk I/O, starving replication of resources.
3. ZooKeeper/Keeper issues — ClickHouse uses ZooKeeper to coordinate replication. A degraded ZooKeeper session causes replication to stall.
4. Replica overloaded — If a replica is serving heavy query load, it may fall behind on replication.
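To narrow down which of these causes applies, the replica state and the running merges can be inspected from system tables. The queries below are a minimal sketch, assuming the system.replicas and system.merges columns found in recent ClickHouse releases. The interpretation given in the comments is a rule of thumb, not a firm diagnosis.

```sql
-- ZooKeeper/Keeper problems: a replica that is read-only or whose session
-- has expired cannot make replication progress. A log_pointer far below
-- log_max_index suggests the replica is not pulling new log entries.
SELECT
    database,
    table,
    is_readonly,
    is_session_expired,
    active_replicas,
    total_replicas,
    log_pointer,
    log_max_index
FROM system.replicas
WHERE is_readonly = 1
   OR is_session_expired = 1
   OR active_replicas < total_replicas;

-- Heavy merge activity: the longest-running merges on this node.
SELECT
    database,
    table,
    round(elapsed) AS elapsed_seconds,
    round(progress * 100, 1) AS progress_pct,
    num_parts,
    formatReadableSize(total_size_bytes_compressed) AS merge_size
FROM system.merges
ORDER BY elapsed DESC
LIMIT 10;
```

If the first query returns no rows and the second shows no large merges, network throughput and query load on the replica are the remaining causes to check, and both need host-level metrics rather than system tables.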
Fixing Replication Lag
-- Pull pending entries into the queue and wait until the replica has processed them (most common fix)
SYSTEM SYNC REPLICA database.table_name;
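-- Check the replication queue for stuck operations
-- (high num_tries or a non-empty last_exception points to a stuck entry)
SELECT type, create_time, required_quorum, source_replica, parts_to_merge,
       num_tries, last_exception, postpone_reason
FROM system.replication_queue
WHERE table = 'table_name'
ORDER BY create_time ASC
LIMIT 20;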
-- Reinitialize the table's ZooKeeper session state if replication is stuck
SYSTEM RESTART REPLICA database.table_name;
Monitoring Thresholds
- Green: absolute_delay < 30 seconds
- Warning: absolute_delay between 30–300 seconds
- Critical: absolute_delay > 300 seconds
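These thresholds can be applied directly in a query. The following is a minimal sketch using ClickHouse's multiIf function. The status labels and the lag_seconds and status aliases are illustrative choices, not built-in values.

```sql
-- Label each replicated table with a status based on the thresholds above.
SELECT
    database,
    table,
    replica_name,
    absolute_delay AS lag_seconds,
    multiIf(
        absolute_delay < 30,   'green',     -- under 30 seconds
        absolute_delay <= 300, 'warning',   -- 30 to 300 seconds
        'critical'                          -- over 300 seconds
    ) AS status
FROM system.replicas
ORDER BY absolute_delay DESC;
```

Adding a filter such as WHERE absolute_delay >= 30 would restrict the output to tables that need attention.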
Clustersight monitors replication lag continuously and sends a Slack alert with the SYSTEM SYNC REPLICA command pre-filled when lag exceeds your threshold.
Read more: How to Monitor ClickHouse in Production
Frequently Asked Questions
How do I check ClickHouse replication lag?
Query SELECT database, table, absolute_delay FROM system.replicas WHERE absolute_delay > 0 ORDER BY absolute_delay DESC. The absolute_delay column shows lag in seconds.
What is acceptable replication lag in ClickHouse?
Under 30 seconds is healthy. 30–300 seconds is a warning. Over 300 seconds (5 minutes) is a serious problem requiring investigation.
How do I fix replication lag in ClickHouse?
Run SYSTEM SYNC REPLICA database.table, which waits until the replica has processed its pending replication queue. If lag persists, check system.replication_queue for stuck operations.