John Allspaw is currently Operations Engineering Manager at Flickr, the popular photo site. He has had extensive experience working with growing web sites since 1999. These include online news magazines (Salon.com, InfoWorld.com, Macworld.com) and social networking sites that experienced extreme growth (Friendster and Flickr). During his time at Friendster, traffic increased 5X. He was responsible for their transition from a couple dozen servers in a failing data center to over 400 machines across two data centers, and the complete redesign of the backing infrastructure. When he joined Flickr, they had 10 servers in a tiny data center in Vancouver; they are now located in multiple data centers across the US. Prior to his web experience, Allspaw worked in modeling and simulation as a mechanical engineer doing car crash simulations for the NHTSA.
John has worked in systems operations for over fourteen years in biotech, government and online media. He started out tuning parallel clusters running vehicle crash simulations for the U.S. government, and then moved on to the Internet in 1997. He built the backing infrastructures at Salon.com, InfoWorld.com, Friendster, and Flickr. He is now VP of Tech Operations at Etsy, and is the author of "The Art of Capacity Planning" and "Web Operations" published by O'Reilly. He speaks from time to time at conferences on topics related to web operations, operations and development culture, infrastructure, and capacity planning.
John Allspaw has done something that very few of his peers would have been able to do. He has taken a black art, Capacity Planning, and turned it into a series of steps that anyone can follow.
The book is filled with common case studies on how to plan capacity for things like web server farms, database clusters, and caching layers. The real value is in watching how the author applies the same formula in each case, giving Systems Administrators and Executives the tools they need to do a better job of capacity planning in their own unique infrastructures.
As the earlier review says, it's a short book. In my opinion, that's a good thing: its goal is to teach you how to perform capacity planning in any environment. If it were longer, it would have been full of more examples, which would likely only serve to lead the reader away from the core principles. You need to learn *how* to capacity plan an infrastructure, not get pat (and often incorrect) advice on how to measure your web farm.
The discussion on curve fitting and trend prediction is worth it alone - I'm aware of no other book on the topic that shows so clearly how to examine your data in service of capacity planning.
It's the process I'll follow from now on.
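To make the curve-fitting idea concrete, here is a minimal sketch of trend prediction with an ordinary least-squares line. It is not taken from the book: the function names, the sample disk-usage numbers, and the 200 GB ceiling are all assumptions chosen for illustration, and real data rarely grows this cleanly.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_ceiling(xs, ys, ceiling):
    """Days after the last sample until the fitted line reaches `ceiling`.

    Returns None when the trend is flat or falling, since the line
    would never reach the ceiling.
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    crossing_day = (ceiling - intercept) / slope
    return max(0.0, crossing_day - xs[-1])


# Hypothetical measurements: day number and disk space used (GB).
days = [0, 1, 2, 3, 4]
disk_used_gb = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(days, disk_used_gb)
remaining = days_until_ceiling(days, disk_used_gb, ceiling=200.0)
print(f"growth: {slope:.1f} GB/day, days until full: {remaining:.1f}")
```

A straight line is the simplest possible model; if the measured growth visibly curves, a different fit would be the more honest choice.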
Right out of the gate, John covers a topic near and dear to my heart: metrics. His advice? "Measure, measure, measure." John reinforces this by including an incredible number of charts throughout the book. He goes on to say that our measurement tools need to provide an easy way to:
* Record and store data over time
* Build custom metrics
* Compare metrics from various sources
* Import and export metrics
As I read the book, I found myself nodding and thinking, "yes, yes, this is exactly what I learned!" Although it's been more than five years since I was buildmaster for My Yahoo!, I really resonated with the advice John provides, like this one: "Homogenize hardware to halt headaches". (You have to love the alliteration, too.)
In a thin book that's easy to read, John covers a large number of topics. He talks about load testing, with pointers to tools like Httperf and Siege. There are several sections that talk about caching architectures and the use of Squid. He provides guidelines when it comes to deployment, such as making all changes happen in one place, the importance of defining roles and services, and ensuring new servers start working automatically. At the end he even manages to cover virtualization and cloud computing, and how they come into play during capacity planning.
The Art of Capacity Planning is full of sage advice from a seasoned veteran, like this one: "The moral of this little story? When faced with the question of capacity, try to ignore those urges to make existing gear faster, and focus instead on the topic at hand: finding out what you need, and when." When I read a technical book, I'm really looking for takeaways. That's why I loved The Art of Capacity Planning, and I think you will, too.
The Art of Capacity Planning is a good introduction to Capacity Planning for Web Operations that touches on the following topics:
* Why do you need capacity planning?
* What information should you gather for capacity planning and how?
* How to predict trends for your web applications?
* How and when to procure new hardware?
* How to create a sustainable capacity planning process?
As the author mentions in the preface, the book has a lot of common sense material. Most experienced enterprise web operations architects should be familiar with this material. But it is refreshing to see this urban wisdom captured and printed in a book format. The book is unique in that it is not meticulously organized and illustrated like a textbook or a reference guide. It provides a smattering of anecdotes, examples, gotchas, and tools from the author's experience in a rapidly growing startup environment at Flickr.
I am looking forward to a second edition of the book where the author can delve deeper into some missing aspects that are critical to capacity planning, like log analysis and performance improvements. Enterprise web operations folks who are familiar with commercial tools like Sitescope, OpenView, Opsware, Gomez, etc. rather than free/open source tools, and who manage a large number of diverse applications, may face a learning curve in relating the examples in the book to their environment.
This is the first book on capacity planning I have read, so I have nothing substantial to compare it to at this time. John's descriptions and real-world examples are great.
While I was reading, I felt John's analogies were very similar to the way the character Charlie from TV's "Numb3rs" explains something very complicated with real-world examples. I especially liked the examples of the Bacon Delivery truck and the supermarket checkout for visualizing what was going on in the servers.
One huge takeaway was the importance of tying application metrics and server metrics back to financial costs. SLAs don't really matter if adding another 9 to a 99.999%-style target costs more than your client is paying you for the whole contract. In essence, don't promise 99.9% over 99.0% if the 0.9-point improvement will cost $10,000 in additional hardware and the contract is only worth $10,000. Many would argue, "But it is only 9/10ths of a percent improvement; how big a deal can it be?" Remember that the first 1% of keeping a server up is not the same as the last 1%.
The chapter on regression and line fitting was mostly a refresher. The chapters on cloud computing were excellent, as real-world examples are always useful for me. I also liked the fact that he referred to Flickr a lot, so there was a sense of walking the path vs. knowing the path.
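The arithmetic behind that point can be sketched in a few lines. This is only an illustration, not something from the book: the function names are hypothetical, the year is assumed to be 365 days, and the dollar figures are the ones used in the paragraph above.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def allowed_downtime_hours(availability_pct):
    """Hours of downtime per year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100.0)


def upgrade_pays_off(contract_value, upgrade_cost):
    """True only if the contract is worth more than the extra hardware."""
    return contract_value > upgrade_cost


for target in (99.0, 99.9):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability allows about {hours:.2f} hours down per year")

# Figures from the review: $10,000 of hardware against a $10,000 contract.
print("worth promising 99.9%?", upgrade_pays_off(10_000, 10_000))
```

Going from 99.0% to 99.9% shrinks the yearly downtime allowance by a factor of ten (roughly 87.6 hours down to about 8.8), which is why a 0.9-point improvement can cost far more than the number suggests.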
Some co-workers did joke that they must not know what they are doing because the seats are all empty on the cover. I'd be curious to see if the same book sold better with the same cover and seats filled. Other comments criticize the book for being only 150 pages but I would rather have 150 good pages than 300 bad pages any day of the week. Also the author explains the smallish size in the preface.
All in all, a great quick read that cuts to the details and made me feel more confident that I could bridge the gap between business and IT in a short amount of time.