Customer Reviews


2 Reviews
5 star: (0)
4 star: (2)
3 star: (0)
2 star: (0)
1 star: (0)

6 of 6 people found the following review helpful:
4.0 out of 5 stars Academic textbook for a course in fault tolerance, March 25, 2007
By Dmitry Dvoinikov (Ekaterinburg, Russia)
This review is from: Fault Tolerance in Distributed Systems (Paperback)
It's a pity that I got my hands on this book so late. It would have been much better if I had got it soon after it was published in 1994, when I was at university, because the book is largely academic. To quote:

-- QUOTE

This book is an attempt to organize the body of knowledge in the area of software fault tolerance. ... [It] can be used as a textbook for a graduate/senior level course on fault tolerance ... or for a professional course in fault tolerance. It can also be used as a reference by researchers/practitioners ...

-- END QUOTE

The book has a nice systematic approach in that it attempts to clearly define what a system is, what a failure is, and so on.

It takes the route of explaining that a distributed system is built around a set of communicating processes running on different nodes, and how redundancy (anything that is not strictly necessary and exists purely to tolerate faults) is added.

But the big problem is that the book focuses on the joints, not on the bones, so to speak. It tells you about processes running on different computers, how they talk to each other, and what can be done to ensure those conversations have certain properties. It speaks about the joints: the (network) protocols required for the processes to become a distributed system. In that respect the book is a lot like Tanenbaum and van Steen's "Distributed Systems: Principles and Paradigms".

But where it concerns the bones, the processes, all it says is "the process saves its state to persistent storage" or "the process recovers to the most recently established checkpoint". Uh-huh, sounds great, thanks. It takes a hell of a lot of work to build a process in such a way that its state as a whole can be saved to and restored from stable storage. There are other problems too, for sure.
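To illustrate the gap, here is a minimal sketch of what "saves its state to persistent storage" can look like in the easiest possible case. It is not taken from the book: the names `save_checkpoint` and `restore_checkpoint` are hypothetical, and it assumes the entire process state fits in one JSON-serializable dictionary and that a local file stands in for stable storage.

```python
import json
import os
import tempfile


def save_checkpoint(state: dict, path: str) -> None:
    """Write state to path atomically (hypothetical helper, not from the book).

    The state is written to a temporary file in the same directory and then
    renamed over the target, so a crash leaves either the old checkpoint or
    the new one, never a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        # Clean up the temporary file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def restore_checkpoint(path: str, default: dict) -> dict:
    """Return the most recently saved state, or a copy of default if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return dict(default)
```

Even this toy covers only plain data. Open sockets, running threads, in-flight messages and anything else that does not serialize are left out, and those are exactly the parts that make saving the state "as a whole" so much work.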

Eight of the book's nine chapters keep going around like that, telling you about all sorts of protocols for negotiation, clock synchronization, such-and-such broadcast, distributed snapshots, distributed transactions, voting and duplicating running processes. To be fair, there are quite a few interesting protocols that are nice to be familiar with, just in case.

The ninth and last chapter, in 40 pages, touches on writing fault-tolerant processes. It explains how redundancy can be added to the code and suggests a specific approach of cutting up all your code and wrapping it in special small boxes to ensure certain behaviour. Although it's difficult to argue with that, it's definitely not the only possible way.

Throughout the book, nearly everything of use quickly gets so complicated that it is not practical for every use, and the author frequently admits that. I can see it working in calculations, numerical algorithms, even in the ever so proud-sounding aircraft control; in other words, wherever there is a single simple input, a single simple output, and simple, totally deterministic logic, with no concurrency or shared state.

-- QUOTE

The schemes discussed above [...] require each process to be deterministic, i.e. given the same inputs, the process performs the same actions. Both of these assumptions do not hold, for example, in languages like CSP and Ada ...

-- END QUOTE

I'd say it's not just CSP and Ada that lack total determinism, but a lot of real systems too, no matter the language.

The book has next to no practical examples. You hardly ever get the name of a system that implements this or that, and maybe a couple of times you get a brief description of a specific implementation. Most of the time such referencing is done in a scientific way, like "Aristotle has shown this in [Ars/1378BC]". Good if you have access to the sources and/or time to look them up.

The book indeed makes a nice textbook for a course, but less so a practical reference. Although the matters discussed in it are unlikely to become obsolete, there probably are a lot of newer books on the subject. Will go look for them.


4 of 7 people found the following review helpful:
4.0 out of 5 stars Somewhat outdated but comprehensive, October 22, 2000
By A Customer
This review is from: Fault Tolerance in Distributed Systems (Paperback)
This book is already somewhat outdated (six years) relative to the cutting edge of fault tolerance research, but it's a good and comprehensive introduction to the subject. It is great for programmers looking for some understanding of fault tolerance, as commercial tools still have a long way to go before catching up with what is in this book. This field is getting more and more important as business systems are being moved to the internet and need to remain operational 24/7.


Fault Tolerance in Distributed Systems by P. Jalote (Paperback - April 16, 1994)