Mar 22 · #observability #sre
Observability Is Not Monitoring
The two words get used interchangeably, but they describe different capabilities. The difference matters most at 3 AM.
Monitoring answers known questions
Monitoring is the dashboard you built because you knew CPU might spike. It's great — until the failure mode is one you never imagined.
Observability answers new questions
Observability is the property that lets you interrogate your system about behavior you didn't predict. It rests on three pillars:
- Metrics — cheap, aggregatable, great for alerting.
- Logs — high-cardinality detail, expensive at scale.
- Traces — the causal story of a single request across services.
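To make the three signals concrete, here is a minimal sketch of one request producing all of them, using only the Python standard library. Every name in it (`handle_request`, the field names, the in-memory stores) is a hypothetical stand-in, not any particular vendor's API; a real system would hand these to a metrics client, a log shipper, and a tracer.

```python
import json
import time
import uuid
from collections import Counter

# Hypothetical in-memory stand-ins for three telemetry backends.
request_count = Counter()  # metric: one counter per (route, status)
log_lines = []             # logs: one JSON line per event
spans = []                 # traces: timed units of work sharing a trace_id


def handle_request(route, user_id):
    trace_id = uuid.uuid4().hex
    start = time.perf_counter()

    status = 200  # placeholder for the real work

    duration_s = time.perf_counter() - start

    # Metric: low-cardinality labels only, so it stays cheap to aggregate.
    request_count[(route, status)] += 1

    # Log: per-event detail, including high-cardinality fields like user_id.
    log_lines.append(json.dumps({
        "msg": "request handled", "route": route, "status": status,
        "user_id": user_id, "trace_id": trace_id,
    }))

    # Span: names and times the work, and ties it to the rest of the trace.
    spans.append({
        "trace_id": trace_id, "name": f"GET {route}",
        "duration_s": duration_s, "attributes": {"user_id": user_id},
    })
    return trace_id


trace_id = handle_request("/checkout", user_id="u-123")
print(request_count[("/checkout", 200)])
print(json.loads(log_lines[0])["trace_id"] == trace_id)
print(spans[0]["name"])
```

The shared `trace_id` is the design choice worth noticing: it is what lets you jump from an aggregate number to the log line and the span behind it.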
Take a PromQL query for per-route p99 latency over a five-minute window:
histogram_quantile(0.99,
  sum(rate(http_request_duration_seconds_bucket[5m])) by (le, route)
)
That p99 query is monitoring. The ability to pivot from a slow trace into the exact span and tag that caused it — that's observability.
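A rough sketch of that pivot, assuming spans are available as plain records: pick the slowest trace, then find the span with the most self time (its own duration minus its children's) and read its tags. The span data, field names, and helper functions are invented for illustration, and the self-time arithmetic assumes child spans run one after another rather than in parallel.

```python
from collections import defaultdict

# Hypothetical span records; the field names are illustrative only.
spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "name": "GET /checkout",
     "duration_ms": 2300, "tags": {"service": "gateway"}},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "name": "charge_card",
     "duration_ms": 180, "tags": {"service": "payments", "region": "eu-west"}},
    {"trace_id": "t1", "span_id": "c", "parent_id": "a", "name": "SELECT inventory",
     "duration_ms": 2050, "tags": {"service": "inventory", "db.shard": "7"}},
    {"trace_id": "t2", "span_id": "d", "parent_id": None, "name": "GET /checkout",
     "duration_ms": 120, "tags": {"service": "gateway"}},
    {"trace_id": "t2", "span_id": "e", "parent_id": "d", "name": "SELECT inventory",
     "duration_ms": 90, "tags": {"service": "inventory", "db.shard": "2"}},
]


def slowest_trace_id(spans):
    """Return the trace whose root span took longest."""
    roots = [s for s in spans if s["parent_id"] is None]
    return max(roots, key=lambda s: s["duration_ms"])["trace_id"]


def culprit_span(spans, trace_id):
    """Return (span, self_time_ms) for the span with the most self time.

    Self time = own duration minus the summed duration of direct children.
    Assumes children do not overlap in time.
    """
    in_trace = [s for s in spans if s["trace_id"] == trace_id]
    child_total = defaultdict(int)
    for s in in_trace:
        if s["parent_id"] is not None:
            child_total[s["parent_id"]] += s["duration_ms"]
    self_time = {s["span_id"]: s["duration_ms"] - child_total[s["span_id"]]
                 for s in in_trace}
    worst = max(in_trace, key=lambda s: self_time[s["span_id"]])
    return worst, self_time[worst["span_id"]]


slow_id = slowest_trace_id(spans)
span, self_ms = culprit_span(spans, slow_id)
print(slow_id)        # t1
print(span["name"])   # SELECT inventory
print(self_ms)        # 2050
print(span["tags"])   # {'service': 'inventory', 'db.shard': '7'}
```

The p99 query could tell you that checkout was slow. The tags on the culprit span tell you where to look next, in this made-up case a single database shard.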
The test
If you can only answer questions you anticipated when you instrumented the system, you have monitoring. If you can ask brand-new questions during an incident and get answers, you have observability.
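One way to picture the test, sketched under the assumption that each request is recorded as a wide event with many fields: a question nobody planned for ("is the slowness tied to an app version? a region?") becomes a group-by on whichever field you are curious about. The events, field names, and `slow_rate_by` helper below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical wide events: one record per request, many fields each.
events = [
    {"route": "/checkout", "duration_ms": 120, "app_version": "4.1", "region": "us-east"},
    {"route": "/checkout", "duration_ms": 140, "app_version": "4.1", "region": "eu-west"},
    {"route": "/checkout", "duration_ms": 1900, "app_version": "4.2", "region": "us-east"},
    {"route": "/checkout", "duration_ms": 2100, "app_version": "4.2", "region": "eu-west"},
    {"route": "/checkout", "duration_ms": 110, "app_version": "4.1", "region": "us-east"},
    {"route": "/checkout", "duration_ms": 130, "app_version": "4.2", "region": "us-east"},
]


def slow_rate_by(events, field, threshold_ms=1000):
    """Fraction of events slower than threshold_ms, grouped by any field."""
    totals = defaultdict(int)
    slow = defaultdict(int)
    for event in events:
        key = event.get(field, "<missing>")
        totals[key] += 1
        if event["duration_ms"] > threshold_ms:
            slow[key] += 1
    return {key: slow[key] / totals[key] for key in totals}


# Neither question was anticipated at instrumentation time; both work
# because the events already carry the fields.
by_version = slow_rate_by(events, "app_version")
by_region = slow_rate_by(events, "region")
print(by_version)
print(by_region)
```

In this toy data every slow request is on version 4.2, while both regions see slow requests, which points at the release rather than the location. A pre-aggregated metric labeled only by route could not have answered either question after the fact.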