The alert fatigue dilemma: A call for change in how we manage on-call
It's time to address the elephant in the room: alerting tools have become nothing more than over-priced pagers, drowning you in a sea of notifications that may or may not be urgent — and it's taking a toll.
By Robert Ross on 1/18/2024
Once the unsung heroes of the digital realm, engineers are now caught in a cycle of perpetual interruptions thanks to alerting systems that haven't kept pace with evolving needs. A constant stream of notifications has turned on-call duty into a source of frustration, stress, and poor work-life balance.
In 2021, 83% of software engineers surveyed reported feelings of burnout from high workloads, inefficient processes, and unclear goals and targets. Add in an economically fractured environment, widespread layoffs, and a general shortage of engineers, and things haven’t gotten better.
There’s the sheer number of alerts, of course. The State of SRE in 2024 Report said 71% of SREs report responding to “dozens or hundreds of non-ticketed” incidents a month. But a major issue is that when so many alerts end up being false alarms, it’s easy to either become desensitized and check out or spin your wheels trying to figure out what’s “real.”
In fact, International Data Corporation (IDC) reported that among companies with 500-1,499 employees, a whopping 27% of alerts are ignored or simply not investigated. Not only is this dangerous for the company, but it’s also dangerous for our people. The State of Burnout in Tech found that of the 30,000 IT professionals surveyed across 33 countries, 56% of men and 69% of women can’t relax once their workday ends.
Alert fatigue has become the unfortunate side effect of our always-on, interconnected world. Modern alerting tools that should help solve these problems have devolved into nothing more than costly pagers. Engineers are left grappling with unintelligent filtering and its aftermath: an endless barrage of pages, false positives, and unclear alerts. The status quo is no longer sustainable, and it's time for a paradigm shift in on-call management.
A call for change
In this ever-evolving digital landscape, on-call engineers deserve better. They need tools that understand the nuances of incident response and prioritize critical issues over noise. The future of on-call management lies in intelligent alerting systems that use conditional rules to adapt and refine thresholds, reducing alert fatigue and allowing engineers to focus on what truly matters.
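To make "conditional rules" a little more concrete, here is a minimal sketch of what such a rule might look like. Everything in it is an assumption for illustration: the `Alert` fields, the `should_page` and `refine_threshold` names, and the default numbers are hypothetical and do not describe any particular tool's API.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    service: str
    severity: str        # "critical", "warning", or "info" (assumed labels)
    error_rate: float    # fraction of failing requests, 0.0 to 1.0
    minutes_firing: int  # how long the condition has held


def should_page(alert: Alert, page_threshold: float = 0.05, min_minutes: int = 5) -> bool:
    """Page a human only when the alert looks urgent, not on every notification."""
    if alert.severity == "critical":
        return True
    if alert.severity == "warning":
        # Warnings page only when the problem is both large and sustained.
        return alert.error_rate >= page_threshold and alert.minutes_firing >= min_minutes
    return False


def refine_threshold(current: float, false_positive_rate: float,
                     step: float = 0.01, ceiling: float = 0.5) -> float:
    """Raise the paging threshold a little when most recent pages were false alarms."""
    if false_positive_rate > 0.5:
        return min(current + step, ceiling)
    return current
```

With these hypothetical defaults, a warning at a 10% error rate that has been firing for ten minutes pages someone, while the same warning one minute in does not.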
Compatibility with life
Shit happens. We get sick, our kids need us, our cars have to go to the shop, the internet guy is coming, you just need a day off — that’s life. And on-call should be compatible with it.
Getting coverage when you need it is too hard in current alerting tools, which only compounds the stress. Even the word override is stressful! In what other industry do we say, “Can I get someone to override my shift?” We need to normalize asking for coverage when life gets in the way, and our tools should help us.
We should be able to easily schedule and manage on-call rotations right from within the tools we use daily, like Slack. The future of incident alerting means no more hiccups when someone needs to step away from their desk or switch shifts.
Clean data, stronger insights
We also need a clear separation between alerts and incidents. This will allow us to accurately measure metrics like alert-to-noise ratio and mean time to detect, giving us a deeper understanding of our systems. It’ll also enable engineers to work on the projects that matter to them, rather than spending valuable time chasing false alarms.
Having clear-as-day analytics can empower teams to have data-backed discussions about which alerts they need and which they can drop entirely. That means happier on-call teams and faster assembly time.
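Once alerts and incidents are stored as separate records, metrics like these become simple arithmetic. The sketch below is one possible reading, not a standard definition: it treats the alert-to-noise ratio as the share of alerts that were linked to an incident, and mean time to detect as the average gap between an incident starting and being detected. The record types and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class AlertRecord:
    fired_at: datetime
    incident_id: Optional[str]  # None means the alert never became an incident


@dataclass
class IncidentRecord:
    started_at: datetime
    detected_at: datetime


def actionable_ratio(alerts: List[AlertRecord]) -> float:
    """Share of alerts that were linked to an incident (assumed definition)."""
    if not alerts:
        return 0.0
    actionable = sum(1 for a in alerts if a.incident_id is not None)
    return actionable / len(alerts)


def mean_time_to_detect(incidents: List[IncidentRecord]) -> timedelta:
    """Average delay between an incident starting and being detected."""
    if not incidents:
        return timedelta(0)
    total = sum((i.detected_at - i.started_at for i in incidents), timedelta(0))
    return total / len(incidents)
```

A team could run something like this per service or per alert rule, then use the lowest-ratio rules as the starting list for the "which alerts can we drop" conversation.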
In today's economic climate, where CFOs are closely monitoring every penny spent, it’s only a matter of time before the price of alerting tools comes under scrutiny. Of course, many of us have scratched our heads for years, wondering why these tools come with such a hefty price tag. It's not just a matter of budgeting; it's about value for money. And when we're talking about the deficiencies above, it's just not there.
We need alerting tools that not only deliver exceptional performance but are also cost-efficient. The future of incident alerting should prioritize affordability without compromising quality. The way forward is active bucket pricing, so you only pay for users that are paged. Alerting is a must-have for most businesses building software in 2024, but they shouldn’t have to pay for hundreds (or even thousands!) of seats that are never used.
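The difference between paying per seat and paying only for paged users is easy to see with a little arithmetic. This is a simplified sketch: it ignores how buckets or tiers might actually be structured, and the function names, prices, and user counts are all hypothetical.

```python
from typing import Iterable


def seat_cost(total_users: int, price_per_seat: float) -> float:
    """Per-seat pricing: every provisioned user is billed, paged or not."""
    return total_users * price_per_seat


def active_cost(paged_user_ids: Iterable[str], price_per_active_user: float) -> float:
    """Active pricing: bill each user at most once, and only if they were paged."""
    return len(set(paged_user_ids)) * price_per_active_user


# Hypothetical month: 500 provisioned seats, but only three distinct people paged.
pages_this_month = ["ana", "ben", "ana", "cy"]
per_seat_bill = seat_cost(500, 20.0)
active_bill = active_cost(pages_this_month, 20.0)
```

Under these made-up numbers, the per-seat bill covers 500 users while the active bill covers only the three people who were actually paged.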
Embrace a new world
The International SOS Risk Outlook Report 2024 reported that 80% of surveyed risk professionals predict burnout will significantly impact businesses in the next year — but only 41% of them feel their companies are prepared to deal with it. It’s time to change that.
It's time for a new world of on-call management that prioritizes engineers' well-being and efficiency. Let's bid farewell to this era of alerting tools and welcome a future where engineers are empowered to excel in their roles.
To all the on-call heroes out there, it's time for change. It's time for a tool that understands your needs, respects your time, and empowers you to shine. The revolution starts now.