What is SRE (site reliability engineer)?

by: Alexandre Couedelo
May 17, 2022

Summary

This article explains the role of a Site Reliability Engineer (SRE) and how it relates to DevOps. I learned about an SRE's responsibilities and daily routine, and about the importance of SLOs (service level objectives) in delivering reliable software. The article also gave insight into how development and operations teams collaborate in the SRE model and how that collaboration helps maintain a high-quality production environment. The SRE model was created at Google in 2003. The one and only “official” responsibility of an SRE is to ensure that the system is reliable, meaning that the company meets the objectives set out in its SLAs (service level agreements).

Key concepts

SREs' primary responsibility is to ensure system reliability by meeting the objectives defined in SLAs.
SREs use SLOs and SLIs (service level indicators) to measure system performance.
SREs spend time improving tooling and monitoring systems.
SREs act as gatekeepers for production environments and participate early in product development.
The SRE model differs from DevOps by separating development and operations while fostering collaboration.
SREs translate technical data into business insights and manage the software supply chain.
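
To make the relationship between SLIs and SLOs more concrete, here is a minimal Python sketch of how an availability SLI could be checked against an SLO. It is an illustration only and does not come from the article: the function name `evaluate_availability_slo`, the `SloReport` fields, and the request counts are hypothetical, and the "error budget" (the share of failures the SLO still tolerates) is a common SRE notion added here as an assumption.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # measured fraction of successful requests
    slo_target: float        # objective, e.g. 0.999 for "99.9% of requests succeed"
    budget_remaining: float  # share of the error budget still unspent (negative if overspent)
    slo_met: bool


def evaluate_availability_slo(good_requests: int, total_requests: int,
                              slo_target: float) -> SloReport:
    """Compare a measured availability SLI against an SLO target.

    Hypothetical helper for illustration; real systems usually compute
    this over a rolling time window from monitoring data.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= good_requests <= total_requests:
        raise ValueError("good_requests must be between 0 and total_requests")
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")

    # SLI: the indicator actually measured.
    sli = good_requests / total_requests

    # Error budget: how many failures the SLO tolerates for this traffic volume.
    allowed_failures = (1 - slo_target) * total_requests
    actual_failures = total_requests - good_requests
    budget_remaining = 1 - actual_failures / allowed_failures

    return SloReport(sli, slo_target, budget_remaining, sli >= slo_target)


if __name__ == "__main__":
    # Made-up numbers: 1,000,000 requests, 500 of which failed, against a 99.9% SLO.
    report = evaluate_availability_slo(999_500, 1_000_000, 0.999)
    print(f"SLI={report.sli:.4%}  SLO met={report.slo_met}  "
          f"budget remaining={report.budget_remaining:.0%}")
```

In this sketch the SLI is the measurement, the SLO is the target it is compared with, and the remaining budget is one way an SRE might turn technical data into a figure that the business side can act on.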