Getting Started

SRE101, Getting Started

What is SRE?

Site Reliability Engineering (SRE) is a practice for managing the reliability of systems. Google originally developed SRE in the early 2000s, when Ben Treynor Sloss started the first SRE team, coined the name, and set the tone for the industry.

Culture, Team Building, How To, Getting Started

How To Create a Culture of Accountability in an Engineering Organization

A Culture of Accountability is one where the whole team understands that they’re working towards a common goal to help the organization succeed, proactively works to deliver value on behalf of the organization, and pivots to help fix mistakes as they occur.

How To, Getting Started, Incident Response, SRE

How To Categorize the Impact of an Incident

Everything is going well until you get an alert that there has been a system outage. In this article, we define an incident and share how to categorize an incident's impact.