How about the angle of “value to the business”? It doesn’t matter if you are monitoring infrastructure, real-time performance or end-user experience – or all three – if the business is not deriving value from that investment. DevOps especially is really about software/service quality. Monitoring is just a means of getting there.
So does infrastructure monitoring help me improve software/service quality? Nope. Just lets me measure the size (and frequency) of the crater. A lot of folks cluster applications not to get performance but because they can’t fix the memory leaks or other instabilities. So the gap is in getting code fixed – not measuring the craters!
So does real-time performance monitoring improve software/service quality? Nope. I get to see what is breaking, in great detail. But I don’t have a business-impact context to let me prioritize where I should be spending my spare software-fix budget. So I throw a few more instances into the mix… and hope I meet my SLA.
So does end-user-experience monitoring improve software/service quality? Nope. I finally get to see the business context, specifically which transactions and what business impact – but I don’t get much of a clue as to what is broken – other than it is transaction #47… and it’s costing us a ton!
To make sense, APM needs all of the info. So let’s throw in network traffic monitoring (different from infrastructure) and logs (for those apps we can’t instrument or that are using RPCs, whatever). And hopefully I’ll have visibility into what needs to get fixed. But then you have to have processes that put all these measurements to use, and a mechanism to correct the problems you catch. And this is something best done before you even hit production – but that’s a lesson for another day.
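To make the “all of the info” point concrete, here is a minimal sketch of what joining two of those views might look like: business impact per transaction (the end-user-experience side) combined with failing components per transaction (the real-time performance side), to rank where the fix budget should go. Everything in it is an assumption for illustration: the transaction IDs, component names, cost figures, and the `rank_fix_candidates` helper are all hypothetical, and splitting a transaction’s cost by error share is just one simple way to attribute it.

```python
from collections import defaultdict

# Hypothetical end-user-experience data: business cost per transaction.
business_impact = {
    "txn-47": 12000.0,
    "txn-12": 300.0,
    "txn-03": 4500.0,
}

# Hypothetical real-time performance data:
# (transaction, failing component, error count).
failures = [
    ("txn-47", "inventory-service", 40),
    ("txn-47", "payment-gateway", 10),
    ("txn-12", "search-index", 500),
    ("txn-03", "payment-gateway", 5),
]


def rank_fix_candidates(business_impact, failures):
    """Rank components by the business cost attributed to their errors.

    Each transaction's cost is split across its failing components in
    proportion to their error counts (an illustrative assumption).
    """
    errors_per_txn = defaultdict(int)
    for txn, _component, count in failures:
        errors_per_txn[txn] += count

    cost_per_component = defaultdict(float)
    for txn, component, count in failures:
        share = count / errors_per_txn[txn]
        cost_per_component[component] += business_impact.get(txn, 0.0) * share

    # Highest attributed cost first: this is the fix-priority order.
    return sorted(cost_per_component.items(), key=lambda kv: kv[1], reverse=True)


ranking = rank_fix_candidates(business_impact, failures)
for component, cost in ranking:
    print(f"{component}: {cost:.0f}")
```

In this made-up data, the component throwing the most errors lands at the bottom of the list, because the transaction it breaks costs the business very little. Neither view alone would tell you that.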