
Site Reliability Engineering
How Google Runs Production Systems
Narrated by: Liz Porter
Newly adapted for audiobook listeners.
The overwhelming majority of a software system’s lifespan is spent in use, not in design or implementation. So why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems?
In this collection of essays and articles, key members of Google’s Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You’ll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient - lessons directly applicable to your organization.
This book is divided into four sections:
- Introduction - Learn what site reliability engineering is and why it differs from conventional IT industry practices
- Principles - Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)
- Practices - Understand the theory and practice of an SRE’s day-to-day work: building and operating large distributed computing systems
- Management - Explore Google's best practices for training, communication, and meetings that your organization can use
Excellent perspective and methodology overview for operating complex technical environments
A great resource and a decent book.
Treasure trove of knowledge but way too long
Great book, bad narrator
Google has a lot of special Google-sauce to make their mono-repo work for them, and the book sorta assumes everyone has the special Google-sauce.
Therefore, I don't consider most of the organizational advice applicable without modification.
An SRE is really just an Ops person that can program and is encouraged to solve their problems with code and automation.
They first make on-call seem daunting, then say it's a privilege new hires have to earn.
They spend a whole chapter on cron jobs and make it seem like something magic that Google invented.
The narrator is really robotic, maybe the voice of Google Translate? It also sounds weirdly nervous at times.
Google propaganda