FireHydrant Tasks provide turn-by-turn navigation during an incident
Centralize your incident management playbooks and enforce consistent practices every time. Give every incident responder turn-by-turn navigation for working through incidents — all without leaving Slack.
By Dylan Nielsen on 7/21/2022
An incident has been declared and your runbook has fired. Everyone is gathered in your Slack channel, the tickets are opened, and roles are assigned. Now what? This is when most teams manually update status pages and kick off investigation streams using a patchwork of tribal knowledge and supporting playbook documents. Navigating between documents and system dashboards while manually updating the team on progress requires costly mental aerobics amid the pressure of an incident, and it adds up to lost time and confidence.
Today we’re introducing FireHydrant Tasks, bringing the same automation and consistency you expect from Runbooks to the human-centered discovery and mitigation milestones of an incident. FireHydrant Tasks allow you to bring your playbooks for incident discovery and mitigation directly into FireHydrant as pre-defined sets of tasks that incident commanders can easily create, assign, and follow up on during the course of an incident. They can even be assigned as post-incident action items.
FireHydrant Tasks allow teams to confidently move through an incident knowing exactly what to do next. This not only lowers cognitive load and reduces mistakes, but also improves the consistency of the incident management process across teams. The more of your workflow that can be automated and directly managed alongside your team, the faster your incident can be diagnosed and mitigated.
How FireHydrant Tasks Work
The magic of FireHydrant Tasks is that they flex to serve the nature of an incident: simultaneously predictable and unpredictable. You can pre-configure task lists that get auto-assigned during an incident based on conditions you set or you can add them ad-hoc as you learn more. Let’s walk through an example incident to illustrate how FireHydrant Tasks can work for your team.
An on-call engineer is paged for availability issues on your mobile app. After they kick off the incident, they get automatically assigned as the incident commander and they get the Commander Tasks task list assigned to them through their runbook.
Over the course of triaging this incident, an engineer looking at observability data for the API used by the mobile app noticed a spike in latency that occurred on a regular basis, corresponding to an analytics cronjob that runs every 20 minutes. To restore service, the team decided to disable that non-critical functionality until a fix could be explored in the morning. To fully resolve the incident and return to normal operations, that cronjob would need to be re-enabled, so the incident commander created an ad hoc task and assigned it to themself.
Later in the triage, the commander found an anomaly in the CDN latency: a regularly occurring spike that seemed to be a contributing factor. By running /fh add task-list, the commander was able to find a task list for dealing with this specific issue.
After adding this task list to the incident, the incident commander was able to follow the steps provided and remediate the issue.
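For readers who think in code, the walkthrough above can be sketched as a small data model. This is an illustrative sketch only, not FireHydrant's API or implementation: the class names, the apply_rules helper, the "cdn" tag, and the individual task titles are all hypothetical, chosen to mirror the three ways tasks reached the incident (a pre-configured list assigned by a condition, an ad hoc task, and a task list added mid-incident).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Task:
    title: str
    assignee: Optional[str] = None
    done: bool = False


@dataclass
class TaskList:
    """A reusable playbook: a named, ordered set of task titles."""
    name: str
    task_titles: list


@dataclass
class Incident:
    name: str
    commander: str
    tags: set = field(default_factory=set)
    tasks: list = field(default_factory=list)

    def add_task(self, title: str, assignee: Optional[str] = None) -> Task:
        """Add a single ad hoc task to the incident."""
        task = Task(title, assignee)
        self.tasks.append(task)
        return task

    def add_task_list(self, task_list: TaskList, assignee: Optional[str] = None) -> list:
        """Expand a task list into individual tasks on the incident."""
        return [self.add_task(title, assignee) for title in task_list.task_titles]


def apply_rules(incident: Incident, rules: list) -> None:
    """Assign every task list whose condition matches the incident.

    Each rule is a (condition, task_list) pair, where condition is a
    callable taking the incident and returning True or False.
    """
    for condition, task_list in rules:
        if condition(incident):
            incident.add_task_list(task_list, assignee=incident.commander)


# Hypothetical task lists; the task titles are invented for illustration.
commander_tasks = TaskList("Commander Tasks", ["Update status page", "Confirm roles"])
cdn_tasks = TaskList("CDN latency", ["Check CDN dashboard", "Review cache hit rate"])

rules = [
    (lambda inc: True, commander_tasks),        # assigned on every incident
    (lambda inc: "cdn" in inc.tags, cdn_tasks),  # assigned only when tagged
]

incident = Incident("Mobile app availability", commander="on-call engineer")

# 1. Pre-configured: the commander list is assigned when the incident starts.
apply_rules(incident, rules)

# 2. Ad hoc: a follow-up task created as the team learns more.
incident.add_task("Re-enable analytics cronjob", assignee=incident.commander)

# 3. Mid-incident: the commander pulls in a task list for the CDN issue.
incident.add_task_list(cdn_tasks, assignee=incident.commander)

for task in incident.tasks:
    print(f"[{'x' if task.done else ' '}] {task.title} ({task.assignee})")
```

Modeling conditions as plain callables keeps the sketch short; a real product would more likely express them as stored configuration.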
Get started with FireHydrant Tasks
Taking advantage of FireHydrant Tasks’ automation and process enforcement can be as simple as building just one task list. We recommend starting with a playbook that you know will be run in every single incident: the incident commander task list. Head to your dashboard and explore the new Tasks Lists functionality under Incident response.
See FireHydrant in action
See how service catalog, incident management, and incident communications come together in a live demo.