PostgreSQL 9 High Availability Cookbook Paperback – July 17, 2014
About the Author
Shaun M. Thomas
Shaun M. Thomas has been working with PostgreSQL since late 2000. He is a frequent contributor to the PostgreSQL Performance and General mailing lists, assisting other DBAs with the knowledge he's gained over the years. In 2011 and 2012, he gave presentations at the Postgres Open conference on topics such as handling extreme throughput, high availability, server redundancy, and failover techniques. Most recently, he has contributed the Shard Manager extension and the walctl WAL management suite. Currently, he serves as the database architect at OptionsHouse, an online options brokerage with a PostgreSQL cluster that handles almost 2 billion queries per day. Many of the techniques used in this book were developed specifically for this extreme environment. He believes that PostgreSQL has a stupendous future ahead, and he can't wait to see the advancements subsequent versions will bring.
Top customer reviews
All in all, this is a very well-rounded guidebook that highlights best practices currently in use, and the guidance it contains will be relevant for many years to come. I own both the print edition and the e-book; it is THAT valuable.
In the beginning, the author admits that he does not cover cloud-specific Postgres high availability methods. That leaves an opportunity for somebody else to write a book dedicated to Postgres in the cloud. Also, the subject of high availability is huge and cannot be fully covered in the limited format of a cookbook. Still, the majority of the book's material is relevant in a cloud environment, too.
The whole first chapter, "Hardware Planning", may have some value for users new to the subject, but only as a source of basic ideas. Some recipes in this chapter are obvious, very basic, or oversimplified. To give just one example, “Having enough IOPS” (p. 11) is oversimplified because it relies on arbitrary assumptions. It is not clear why the author assumes that 3.5" hard drives produce 500 IOPS and 2.5” drives 350 IOPS (p. 12). And don’t believe that you can get 500 IOPS from a 15K RPM drive. Even storage vendors usually claim no more than 180 IOPS. We are talking about random IOPS here, right? It does not make sense to plan a system for perfectly sequential I/O. I could continue complaining about the first chapter. Towards the end of it I was seriously thinking about putting the book away.
My patience was fully rewarded in the subsequent chapters.
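As a rough sanity check on the random IOPS figure discussed above, here is a minimal Python sketch. It assumes the common rule of thumb that a spinning disk's random service time is the average seek time plus half a rotation. The function name and the seek times are illustrative assumptions, not figures from the book or from any vendor datasheet.

```python
def random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Estimate random IOPS for a single spinning disk.

    Rule-of-thumb model: each random I/O costs one average seek
    plus, on average, half a platter revolution.
    """
    # One revolution takes 60,000 ms / rpm; the average wait is half of that.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    service_time_ms = avg_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms


if __name__ == "__main__":
    # (rpm, assumed average seek time in ms) -- hypothetical drive classes
    for rpm, seek_ms in [(15_000, 3.5), (10_000, 4.5), (7_200, 8.5)]:
        estimate = random_iops(rpm, seek_ms)
        print(f"{rpm} RPM, {seek_ms} ms seek: ~{estimate:.0f} random IOPS")
```

With an assumed 3.5 ms average seek, a 15K RPM drive comes out at roughly 180 random IOPS under this model, which is in line with the vendor figure mentioned above and well short of 500.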
As a cookbook should, it provides a bunch of handy queries. An example of a recipe with handy queries is “Identifying important tables” on page 53. There are also many very useful techniques, such as “Defusing cache poisoning” (how to avoid database slowness caused by empty caches after a crash), “Exploring the magic of virtual IPs” (how to switch to a standby server without using additional software), and “Terminating rogue connections” (how to kill a connection that does not want to die). These are just a few examples out of many.
The author recommends and explains multiple handy PostgreSQL extensions and Linux tools throughout the book. dstat, iotop, and iostat are just a few of many. Hopefully, readers already know how to use iostat or sar, but some of the other recommended tools are less well known. Honestly, some tricks were new to me. In the end, I felt satisfied with the book. It may teach new users many useful techniques, tools, and queries. It not only provides recipes but in most cases also gives good insight into how things work. Experienced users may simply use it as a source of readily available queries and commands and save the time of producing their own. I would totally recommend this book to anybody who needs to maintain PostgreSQL databases.
One of the strongest aspects of the book is the author’s principled and well-structured engineering approach to building a highly available PostgreSQL system. Instead of jumping to recipes to be memorized, the book teaches you basic but very important principles of capacity planning. More importantly, this planning of servers and networking is not only given as a good template: the author also explains the logic behind it, the reasons for the heuristics he uses, and why some magic numbers are taken as good estimates when more case-specific information is lacking. This style is applied very consistently throughout the book. Each recipe is explained so that you know why you do something in addition to how you do it.
After the first chapter on basic planning, the author jumps to a set of miscellaneous topics in Chapter 2 and details some important tricks such as defusing cache poisoning, concurrent indexes, and Linux kernel tweaks. This chapter starts to reveal another valuable aspect of the book: information about an open source RDBMS such as PostgreSQL is freely available on the Internet, but depending on your needs, a particular set of information can very well be scattered over a lot of mailing list messages, forum posts, wiki pages, etc., and it takes a disciplined mind with a lot of field experience to put all of that scattered information into a single, consistent, logical, and easy-to-follow form.
Starting from Chapter 3, each chapter explores a single topic in a lot of practical detail, beginning with connection pooling. This chapter, like almost all of the remaining ones, has a nice feature: the author always tries to explain alternative solutions, describes their advantages and disadvantages, and where possible shows how to combine some alternatives to get the best of each.
Chapters 4 and 5, namely Troubleshooting and Monitoring, can be thought of as a single chapter, because it is difficult to think about these fundamental concepts separately. These chapters are valuable not only for PostgreSQL DBAs but for any DBA or any GNU/Linux system administrator in general. Troubleshooting and monitoring a highly available database requires a book by itself, but since this book’s scope is clearly defined, the author provides enough background and practical starting points in about 70 pages.
I can easily say that Chapter 6: Replication, together with Chapter 7: Replication Management Tools, starts to form the ‘meat’ of the book. Without successfully implementing and practically managing the replication of your critical database servers, it is impossible to think about building a highly available system. In other words, you need at least one replica of your database system, so that if your primary system goes down, you can very easily switch to your replica (or offload some of your less critical applications to your replica and relieve the stress on your primary system). These two chapters present solid and practical information to achieve that goal. As in the previous chapters, the author shows and explains many useful and practical tools. He also does not refrain from presenting an open source tool, walctl, that he developed as a “PostgreSQL WAL management system that pushes or pulls WAL files from a remote central storage server”. I consider this another positive point for the book because it clearly indicates the author’s serious time investment in PostgreSQL and its high availability configuration.
Chapter 8: Advanced Stack is aptly named, because this chapter, together with Chapter 9: Cluster Control, forms the most advanced and complex part of the book. The author’s warnings regarding the information density and the related real-life complexity of the topics explained in these two chapters should not be taken lightly. Indeed, there are many combinations of events that can lead to subtle and hard-to-debug errors in clusters set up to take over from failing nodes. Creating such a highly available system with Linux-based tools such as LVM, XFS, DRBD, Pacemaker, and Corosync requires careful planning, probably experimenting in a safe virtual environment, and then disciplined execution, as well as monitoring. Again, these chapters alone include topics that could fill a volume and a detailed training course by themselves, and I think the author kept a good balance between depth and breadth.
The final chapter, Data Distribution, can be considered a bonus chapter that briefly shows setting up a PostgreSQL server, dealing with foreign tables, managing shards, creating a scalable nextval replacement, and relevant tips and tricks.
There are not many negative sides to this very dense PostgreSQL book. A few minor points that deserve mention are its focus on the most popular Linux distributions such as Red Hat, Debian, and their derivatives (FreeBSD and other BSD admins will need slightly more effort), some obsolete networking command usage such as ifconfig instead of ip (but then again, this might be helpful for FreeBSD admins), and inconsistent presentation of command output (sometimes no output is shown, whereas for other commands screenshots or textual output are used). One might also argue for a slight reordering of chapters for pedagogical reasons, but then again this is highly open to debate and depends on one’s particular preferences when it comes to system and database administration.
I can recommend PostgreSQL 9 High Availability Cookbook without hesitation to PostgreSQL DBAs who want to push their skills to the next level and learn the fundamentals of building highly available PostgreSQL-based database clusters. It certainly will not be as easy as reading a book, but it is good to know such a book exists as a very good guide.
For example, there are recipes for calculating storage size, IOPS, CPU, memory, etc.
There are also very specific guides on how to use the different replication methods, along with their monitoring, management, and HA solutions.