Kindle Price: $31.34

Save $19.65 (39%)

eBook features:
  • Highlight, take notes, and search in the book

Site Reliability Engineering: How Google Runs Production Systems 1st Edition, Kindle Edition

4.7 out of 5 stars, 1,106 ratings

The overwhelming majority of a software system’s lifespan is spent in use, not in design or implementation. So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems?

In this collection of essays and articles, key members of Google’s Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You’ll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient: lessons directly applicable to your organization.

This book is divided into four sections:

  • Introduction: Learn what site reliability engineering is and why it differs from conventional IT industry practices
  • Principles: Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)
  • Practices: Understand the theory and practice of an SRE’s day-to-day work: building and operating large distributed computing systems
  • Management: Explore Google's best practices for training, communication, and meetings that your organization can use

About the Author

Niall Murphy leads the Ads Site Reliability Engineering team at Google Ireland. He has been involved in the Internet industry for about 20 years, and is currently chairperson of INEX, Ireland’s peering hub. He is the author or coauthor of a number of technical papers and/or books, including "IPv6 Network Administration" for O’Reilly, and a number of RFCs. He is currently cowriting a history of the Internet in Ireland, and is the holder of degrees in Computer Science, Mathematics, and Poetry Studies, which is surely some kind of mistake. He lives in Dublin with his wife and two sons.

Betsy Beyer is a Technical Writer for Google Site Reliability Engineering in NYC. She has previously written documentation for Google Datacenters and Hardware Operations teams. Before moving to New York, Betsy was a lecturer on technical writing at Stanford University.

Chris Jones is a Site Reliability Engineer for Google App Engine, a cloud platform-as-a-service product serving over 28 billion requests per day. Based in San Francisco, he has previously been responsible for the care and feeding of Google’s advertising statistics, data warehousing, and customer support systems. In other lives, Chris has worked in academic IT, analyzed data for political campaigns, and engaged in some light BSD kernel hacking, picking up degrees in Computer Engineering, Economics, and Technology Policy along the way. He’s also a licensed professional engineer.

Jennifer Petoff is a Program Manager for Google’s Site Reliability Engineering team and based in Dublin, Ireland. She has managed large global projects across wide-ranging domains including scientific research, engineering, human resources, and advertising operations. Jennifer joined Google after spending eight years in the chemical industry. She holds a PhD in Chemistry from Stanford University and a BS in Chemistry and a BA in Psychology from the University of Rochester.

--This text refers to the paperback edition.
Due to its large file size, this book may take longer to download

From the Publisher


How to Read This Book

This book is a series of essays written by members and alumni of Google’s Site Reliability Engineering organization. It’s much more like conference proceedings than it is like a standard book by an author or a small number of authors. Each chapter is intended to be read as a part of a coherent whole, but a good deal can be gained by reading on whatever subject particularly interests you. (If there are other articles that support or inform the text, we reference them so you can follow up accordingly.)

You don’t need to read in any particular order, though we’d suggest at least starting with Chapters 2 and 3, which describe Google’s production environment and outline how SRE approaches risk, respectively. (Risk is, in many ways, the key quality of our profession.) Reading cover-to-cover is, of course, also useful and possible; our chapters are grouped thematically, into Principles (Part II), Practices (Part III), and Management (Part IV). Each has a small introduction that highlights what the individual pieces are about, and references other articles published by Google SREs, covering specific topics in more detail. Additionally, there’s a companion website mentioned in the book that has a number of helpful resources.

We hope this will be at least as useful and interesting to you as putting it together was for us.

— The Editors.

Explore the book & companion workbook:
  • Site Reliability Engineering: How Google Runs Production Systems
  • The Site Reliability Workbook: Practical Ways to Implement SRE

Product details

  • ASIN: B01DCPXKZ6
  • Publisher: O'Reilly Media; 1st edition (March 23, 2016)
  • Publication date: March 23, 2016
  • Language: English
  • File size: 12186 KB
  • Simultaneous device usage: Unlimited
  • Text-to-Speech: Enabled
  • Screen Reader: Supported
  • Enhanced typesetting: Enabled
  • X-Ray: Not Enabled
  • Word Wise: Enabled
  • Sticky notes: Not Enabled
  • Print length: 866 pages
  • Customer Reviews: 4.7 out of 5 stars, 1,106 ratings

Customer reviews

4.7 out of 5 stars
1,106 global ratings
Kindle edition is horribly formatted
1 Star
The Kindle edition is horribly formatted. Headings, subheadings and call-outs are not visible as such, but appear as normal body text. Unacceptable for such an expensive publication. Additional information: this is on Windows 10, with the Kindle reader app from the Windows Store. Attached screenshot shows how the author and editor credits, as well as a quote, appear as normal body text at the start of a chapter: this problem persists through the entire publication, and is especially annoying for subheadings. All my other Kindle books look fine using the same reader app.

Top reviews from other countries

Marcos Ribeiro Pereira Barretto
5.0 out of 5 stars Best of breed
Reviewed in Brazil on March 5, 2022
V
5.0 out of 5 stars The quality is amazing
Reviewed in India on December 12, 2023
Petronald
5.0 out of 5 stars great print quality
Reviewed in the United Kingdom on July 31, 2022
Paul G.
5.0 out of 5 stars Five Stars
Reviewed in Canada on January 3, 2018
devnull
5.0 out of 5 stars A must read book
Reviewed in France on August 20, 2018