John Allspaw is currently Operations Engineering Manager at Flickr, the popular photo site. He has had extensive experience working with growing web sites since 1999. These include online news magazines (Salon.com, InfoWorld.com, Macworld.com) and social networking sites that experienced extreme growth (Friendster and Flickr). During his time at Friendster, traffic increased 5X. He was responsible for their transition from a couple dozen servers in a failing data center to over 400 machines across two data centers, and the complete redesign of the backing infrastructure. When he joined Flickr, they had 10 servers in a tiny data center in Vancouver; they are now located in multiple data centers across the US. Prior to his web experience, Allspaw worked in modeling and simulation as a mechanical engineer doing car crash simulations for the NHTSA.
John has worked in systems operations for over fourteen years in biotech, government and online media. He started out tuning parallel clusters running vehicle crash simulations for the U.S. government, and then moved on to the Internet in 1997. He built the backing infrastructures at Salon.com, InfoWorld.com, Friendster, and Flickr. He is now VP of Tech Operations at Etsy, and is the author of "The Art of Capacity Planning" and "Web Operations" published by O'Reilly. He speaks from time to time at conferences on topics related to web operations, operations and development culture, infrastructure, and capacity planning.
John Allspaw has done something that very few of his peers would have been able to do. He has taken a black art, Capacity Planning, and turned it into a series of steps that anyone can follow.
The book is filled with common case studies on how to plan capacity for things like web server farms, database clusters, and caching layers. The real value is in watching how the author applies the same formula in each case, giving Systems Administrators and Executives the tools they need to do a better job of capacity planning in their own unique infrastructures.
As the earlier review says, it's a short book. In my opinion, that's a good thing: its goal is to teach you how to perform capacity planning in any environment. If it were longer, it would have been full of more examples, which would likely only serve to lead the reader away from the core principles. You need to learn *how* to capacity plan an infrastructure, not get pat (and often incorrect) advice on how to measure your web farm.
The discussion on curve fitting and trend prediction is worth it alone - I'm aware of no other book on the topic that shows so clearly how to examine your data in service of capacity planning.
It's the process I'll follow from now on.
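For readers unfamiliar with the idea, here is a minimal sketch of what curve fitting for capacity planning can look like. It is not taken from the book: the function names (`fit_line`, `weeks_until_ceiling`), the straight-line model, and the weekly disk-usage figures are all assumptions made up for illustration. It fits a least-squares line to past measurements and estimates how long until a known ceiling is reached.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def weeks_until_ceiling(xs, ys, ceiling):
    """Estimate how many weeks after the last sample the trend hits the ceiling.

    Returns None when usage is flat or shrinking, since the fitted
    line then never reaches the ceiling.
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    crossing = (ceiling - intercept) / slope  # x value where the line meets the ceiling
    return max(0.0, crossing - xs[-1])


# Hypothetical data: weekly peak disk usage in GB (invented for this example).
weeks = [0, 1, 2, 3, 4, 5]
usage_gb = [100, 110, 120, 130, 140, 150]

print(weeks_until_ceiling(weeks, usage_gb, ceiling=200))  # prints 5.0
```

A straight line is only the simplest choice; real usage data often calls for a different curve, and any forecast like this is worth rechecking as new measurements arrive.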
Right out of the gate, John covers a topic near and dear to my heart: metrics. His advice? "Measure, measure, measure." John reinforces this by including an incredible number of charts throughout the book. He goes on to say that our measurement tools need to provide an easy way to:
* Record and store data over time
* Build custom metrics
* Compare metrics from various sources
* Import and export metrics
As I read the book, I found myself nodding and thinking, "yes, yes, this is exactly what I learned!" Although it's been more than five years since I was buildmaster for My Yahoo!, the advice John provides really resonated with me, such as: "Homogenize hardware to halt headaches". (You have to love the alliteration, too.)
In a thin book that's easy to read, John covers a large number of topics. He talks about load testing, with pointers to tools like Httperf and Siege. There are several sections that talk about caching architectures and the use of Squid. He provides guidelines when it comes to deployment, such as making all changes happen in one place, the importance of defining roles and services, and ensuring new servers start working automatically. At the end he even manages to cover virtualization and cloud computing, and how they come into play during capacity planning.
The Art of Capacity Planning is full of sage advice from a seasoned veteran, like this one: "The moral of this little story? When faced with the question of capacity, try to ignore those urges to make existing gear faster, and focus instead on the topic at hand: finding out what you need, and when." When I read a technical book, I'm really looking for takeaways. That's why I loved The Art of Capacity Planning, and I think you will, too.
The Art of Capacity Planning is a good introduction to Capacity Planning for Web Operations that touches on the following topics:
* Why do you need capacity planning?
* What information should you gather for capacity planning and how?
* How to predict trends for your web applications?
* How and when to procure new hardware?
* How to create a sustainable capacity planning process?
As the author mentions in the preface, the book has a lot of common sense material. Most experienced enterprise web operations architects should be familiar with this material. But it is refreshing to see this urban wisdom captured and printed in a book format. The book is unique in that it is not meticulously organized and illustrated like a textbook or a reference guide. It provides a smattering of anecdotes, examples, gotchas, and tools from the author's experience in a rapidly growing startup environment at Flickr.
I am looking forward to a second edition of the book where the author can delve deeper into some missing aspects that are critical to capacity planning, like log analysis and performance improvements. Enterprise web operations folks who are familiar with commercial tools like Sitescope, OpenView, Opsware, Gomez, etc. rather than free/open source tools, and who manage a large number of diverse applications, may face a learning curve in relating the examples in the book to their environment.
The "Art" is an approachable treatment of a complex field of operations: capacity planning for high-traffic websites. Allspaw leverages his Flickr experience to give us a window into web operations as done by the pros.
The book keeps the high-level perspective necessary to give useful advice in a messy field, without getting lost in minutiae that would be specific to a given site. The author goes over the hows and whys of planning your capacity and the process needed to maintain it as traffic grows, with interesting insights such as designing for measurement (i.e., not mixing separate components of the architecture on the same machine in ways that hinder measurement of actual capacity), how to put a procurement process in place, and the ever-present point of presenting your data convincingly to the business owners who write the checks.
Allspaw puts the emphasis in the right places, and does so in a concise manner: at fewer than 150 pages, this book packs a lot of meat for its pages, and as a fan of brevity, the point was not lost on me. This is one of the best titles to come out of O'Reilly in the last few months, a must-have for your technical library if you work in the field.