2026-02-24

ClickHouse Observability Tools Compared: Clustersight vs Grafana vs Datadog vs HyperDX

Most ClickHouse teams start with Grafana. Some graduate to Datadog. A few try HyperDX. Each has genuine strengths and real limitations when it comes to ClickHouse-specific observability.

This hub covers the tradeoffs honestly so you can pick the right tool for your situation.

The Core Problem

ClickHouse has unique internals that general-purpose monitoring tools don't understand: merge queues, part health, ZooKeeper/ClickHouse Keeper dependencies, replication lag by shard, and query-level profiling via system.query_log. A tool that doesn't model these correctly gives you metrics without insight.

Tools Covered

Grafana

Clustersight vs Grafana — Grafana gives you full flexibility but requires significant setup and ongoing maintenance. Good if you have a dedicated infra/observability engineer.

Datadog

Clustersight vs Datadog — Datadog is the enterprise default. ClickHouse coverage is basic — you get process metrics but not ClickHouse-specific internals. Cost scales with cardinality.

HyperDX

Clustersight vs HyperDX — HyperDX is a strong open-source alternative for general observability. ClickHouse-specific depth is limited.

What to Look For

Merge Queue Visibility

ClickHouse merges data parts continuously in the background. A backed-up merge queue is one of the most common causes of performance degradation, and general-purpose tools rarely surface it unless someone builds a dashboard for it.
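As a rough illustration of what merge visibility involves, the sketch below pairs a query against system.merges (which lists merges currently executing) with a small Python helper that flags long-running ones. The helper name summarize_merges, the 600-second threshold, and the assumption that rows arrive as dictionaries from whichever client you use are all hypothetical; treat it as a starting point under those assumptions, not a recommended alert.

```python
from typing import Any, Dict, Iterable, List

# Columns below exist in system.merges; run the query with any ClickHouse client.
MERGES_QUERY = """
SELECT database, table, elapsed, progress, num_parts
FROM system.merges
WHERE is_mutation = 0
"""


def summarize_merges(
    rows: Iterable[Dict[str, Any]],
    slow_after_seconds: float = 600.0,  # illustrative threshold, not a ClickHouse default
) -> Dict[str, Any]:
    """Count running merges and list tables whose merge exceeds the threshold."""
    rows = list(rows)
    slow: List[str] = [
        f"{r['database']}.{r['table']}"
        for r in rows
        if r["elapsed"] > slow_after_seconds
    ]
    return {"active_merges": len(rows), "slow_merges": slow}
```

Sampling this on a schedule and charting active_merges over time is one way to see whether merges keep up with ingestion; a count that keeps climbing is the signal to look for.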

Part Health

Too many active parts per partition slow down both INSERTs and reads. Tracking active part counts from system.parts over time tells you whether your ingestion pattern is healthy.
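A minimal sketch of that check, assuming rows are fetched with the query shown and handed to Python as dictionaries. The function name partitions_over_limit and the limit of 300 are illustrative assumptions; in practice you would pick a limit relative to your own MergeTree settings such as parts_to_delay_insert and parts_to_throw_insert.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Counts active parts per partition; all columns exist in system.parts.
PARTS_QUERY = """
SELECT database, table, partition, count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table, partition
"""


def partitions_over_limit(
    rows: Iterable[Dict[str, Any]],
    limit: int = 300,  # illustrative limit, an assumption rather than a server default
) -> List[Tuple[str, str, int]]:
    """Return (table, partition, active_parts) above the limit, worst first."""
    flagged = [
        (f"{r['database']}.{r['table']}", str(r["partition"]), int(r["active_parts"]))
        for r in rows
        if r["active_parts"] > limit
    ]
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

Storing each run's result lets you see whether a partition's part count is trending up between merges, which usually points at many small inserts.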

Replication Lag

In replicated setups, a growing system.replication_queue or a rising absolute_delay in system.replicas is a leading indicator of problems before they become failures.
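One way to turn those signals into a check is sketched below. It assumes rows from system.replicas arrive as dictionaries; the function name lagging_replicas and both thresholds are hypothetical values chosen for illustration, not ClickHouse defaults.

```python
from typing import Any, Dict, Iterable, List, Tuple

# All selected columns exist in system.replicas.
REPLICAS_QUERY = """
SELECT database, table, is_readonly, absolute_delay, queue_size
FROM system.replicas
"""


def lagging_replicas(
    rows: Iterable[Dict[str, Any]],
    max_delay_seconds: int = 60,  # illustrative threshold (assumption)
    max_queue_size: int = 100,    # illustrative threshold (assumption)
) -> List[Tuple[str, List[str]]]:
    """Return (table, reasons) for each replica that trips at least one check."""
    findings: List[Tuple[str, List[str]]] = []
    for r in rows:
        reasons: List[str] = []
        if r["is_readonly"]:
            reasons.append("read-only")
        if r["absolute_delay"] > max_delay_seconds:
            reasons.append(f"delay {r['absolute_delay']}s")
        if r["queue_size"] > max_queue_size:
            reasons.append(f"queue {r['queue_size']}")
        if reasons:
            findings.append((f"{r['database']}.{r['table']}", reasons))
    return findings
```

Running the query on every replica, not just one node, matters here, since each replica reports its own delay and queue.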

Frequently Asked Questions

What is the best tool for monitoring ClickHouse?

It depends on your team. Grafana works if you have someone to maintain dashboards. Datadog covers ClickHouse at a basic level, but its cost scales with metric cardinality. Clustersight is purpose-built for ClickHouse, with automated health scoring and fix suggestions.

Does Datadog support ClickHouse monitoring?

Datadog has a ClickHouse integration that collects basic metrics. It does not provide ClickHouse-specific insights like merge queue health, part counts, replication lag, or query-level profiling.