Understanding system behavior through logs, metrics, and traces
Observability is the ability to understand what is happening inside a system by examining its outputs: logs, metrics, and traces, often called the three pillars. Unlike monitoring, which checks for known failure modes, observability supports investigating unknown unknowns and debugging novel problems in production. In modern distributed systems it is what lets engineers follow a request across services, identify performance bottlenecks, and detect anomalies before users are affected.
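To make the three outputs concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not a recommended implementation: the names `handle_request`, `LOGS`, `METRICS`, and `SPANS` are hypothetical, the in-memory lists stand in for real backends, and the "work" is a placeholder. The point it tries to show is that one request can emit a log record, a metric increment, and a trace span that all share the same trace ID, which is what lets you pivot between them during an investigation.

```python
import json
import time
import uuid
from collections import Counter

# Hypothetical in-memory sinks, used here only for illustration.
LOGS = []            # logs: discrete events with context
METRICS = Counter()  # metrics: numeric aggregates over many events
SPANS = []           # traces: timed operations tied to one request


def handle_request(path, trace_id=None):
    """Handle one fake request and emit a log, a metric, and a span."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()

    # Placeholder for real work: only "/ok" succeeds.
    status = 200 if path == "/ok" else 500

    duration_ms = (time.perf_counter() - start) * 1000.0

    # Log: a structured record describing what happened in this request.
    LOGS.append(json.dumps({"trace_id": trace_id, "path": path, "status": status}))

    # Metric: a counter keyed by status, cheap to aggregate and alert on.
    METRICS[f"requests_total{{status={status}}}"] += 1

    # Span: how long this operation took, linked to the request by trace_id.
    SPANS.append({"trace_id": trace_id, "name": f"GET {path}", "duration_ms": duration_ms})
    return trace_id


if __name__ == "__main__":
    tid = handle_request("/ok")
    handle_request("/missing")
    print(dict(METRICS))
    print([s["name"] for s in SPANS if s["trace_id"] == tid])
```

In a real system the shared trace ID would typically be propagated between services in request headers, so that spans and logs from different processes can be joined into a single request flow.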