The most important thing about observability

Observability is a non-functional requirement (NFR) of an IT system.
The main goals are:

to know if the system works on production
to know if the system works properly
to know if and when the system will break

The idea behind observability is that, if I know that my system will break
tomorrow, I can fix it beforehand. So my users won’t notice any
interruptions.

But the most important thing about observability is automatization of
monitoring. Nobody will watch those system metrics and gauges 24/7. You
need the ability to automate the analysis and alerting. So you get some
warning emails (or SMSes) when something is going to happen.

Think about it, how you can design metrics and automation for your
monitoring. You can start with analyzing all those metrics before outages
(in your post-mortem analysis) and run some anomaly detection on those. For
example, Azure has a great tool for anomaly detection and Azure Monitor has a lot of
observability automation features.

Paweł

The most important thing about observability

Leave a Reply Cancel reply