Incident Command Center Overview
Every incident on FireHydrant has a home, which we call the Command Center. This page collects all information and activity about the incident, and users can run an incident from it if Slack is down or otherwise unavailable.
How to get to the Command Center
After you've started an incident, the link to the Command Center is available from multiple locations:
As a bookmark in the incident's Slack channel, if the channel has been created
As a link on a notification message in Slack, if a notification has been posted
On the Incidents page under Active Incidents
The Command Center is also available from a variety of other locations, including Service Catalog, the Dashboard, and more, so it is easy to reach from anywhere in the app.
The Command Center is split into two sections: the details panel on the right side, and the main section which takes up the majority of the page.
The Details panel shows the high-level details of the incident:
- Milestone is the current status of the incident.
- Severity determines how major the incident is.
- Description is a general description for the incident. Response teams generally use it to give a brief overview of the incident for themselves, but it can be used for other purposes via Liquid templating.
- Links shows any external links, such as Slack, Jira tickets, status pages, and more.
- External Links are arbitrary links which users can attach to any incident at will.
- Response Team shows which team members were involved in the incident and their assigned roles, if relevant.
- Impact denotes which Catalog Items are impacted during the incident.
- Customer Support Issues shows any linked support tickets.
- Tags and Labels allow you to track and organize custom data about your incidents.
The main section of the page is split into multiple tabs, each serving a different purpose.
The Incident Timeline is a running timeline of all events that have occurred throughout the incident. Things we track include:
- Runbook steps executing
- Users performing actions like posting notes, updating task completion, etc.
- Any messages or attachments posted to the Slack channel
The timeline can be filtered to specific types of events too, if you're looking for something in particular.
Most importantly, you can Star events from both Slack and the app UI. Starring an event marks it as "important" and allows you to comment on it. Starred events also become primary highlights during the Retrospective phase of an incident.
The Tasks tab shows all of the Tasks and Follow-Ups that have been added to the incident as well as who they've been assigned to.
You can directly manage Tasks and Follow-Ups from this page as well as from Slack.
The Status Pages tab shows all attached (active) status pages for the incident. By default, FireHydrant won't post automatically to a Statuspage, but you can automate this via Runbooks.
FireHydrant also lets you post directly to your status page(s) from Slack.
The Runbooks tab shows all attached Runbooks, their steps, and the statuses of each step.
This allows you to see which Runbooks are running on this particular incident and whether each step errored or executed successfully. This is useful both for keeping tabs on each incident's automation and for debugging.
The Linked Alerts tab shows any alerts linked to this incident from your alerting provider. If you use Alert Routing to create incidents on FireHydrant, then the corresponding alert will automatically be attached to the incident.
The final tab, Change Events, showcases any recent changes to your system that FireHydrant automatically associated with the incident because of the impacted Catalog items.
Alongside out-of-the-box integrations for GitHub and Kubernetes, FireHydrant has both a robust API and a CLI tool that allow you to automate logging changes to your systems from various other sources.
Examples include logging a change from a Continuous Integration workflow, or from a serverless function webhook that fires when it detects an infrastructure change.
Associating change events with incidents can potentially help your team identify contributing factors for the incident faster.
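As a rough illustration of logging a change from a CI workflow, the sketch below builds a change event payload and posts it over HTTP. Everything specific here is an assumption made for illustration and should be checked against the FireHydrant API reference: the endpoint URL, the payload field names (`summary`, `starts_at`, `ends_at`, `labels`), the `Bearer` authorization scheme, and the helper names `build_change_event` and `send_change_event`.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumed endpoint; confirm the real path in the FireHydrant API reference.
CHANGE_EVENTS_URL = "https://api.firehydrant.io/v1/changes/events"


def build_change_event(summary, service_names, started_at, ended_at=None):
    """Build a JSON-serializable change event payload.

    The field names are hypothetical and only illustrate the shape of
    the data a CI job might send.
    """
    payload = {
        "summary": summary,
        # Timestamps are normalized to UTC and sent as ISO 8601 strings.
        "starts_at": started_at.astimezone(timezone.utc).isoformat(),
        # Hypothetical label used to tie the change to Catalog items.
        "labels": {"services": ",".join(service_names)},
    }
    if ended_at is not None:
        payload["ends_at"] = ended_at.astimezone(timezone.utc).isoformat()
    return payload


def send_change_event(payload, api_token, url=CHANGE_EVENTS_URL):
    """POST the payload and return the HTTP status code.

    The Bearer scheme is an assumption; use whatever scheme your API key
    documentation specifies.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A CI job might call `build_change_event("Deploy api v1.2.3", ["api"], datetime.now(timezone.utc))` right after a deploy finishes and pass the result to `send_change_event` along with a token read from the job's secrets. Keeping payload construction separate from the network call makes the payload easy to test without contacting the API.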
Now that you've gotten an overview of the Command Center, you can learn more about FireHydrant by reading in greater detail about various aspects of incidents: