- Paperback: 528 pages
- Publisher: Morgan Kaufmann; 1 edition (March 14, 2008)
- Language: English
- ISBN-10: 7111247353
- ISBN-13: 978-0123705914
- ASIN: 0123705916
- Product Dimensions: 7.5 x 1.1 x 9.2 inches
- Shipping Weight: 1.7 pounds (View shipping rates and policies)
- Average Customer Review: 30 customer reviews
- Amazon Best Sellers Rank: #850,185 in Books (See Top 100 in Books)
Enter your mobile number or email address below and we'll send you a link to download the free Kindle App. Then you can start reading Kindle books on your smartphone, tablet, or computer - no Kindle device required.
To get the free app, enter your mobile phone number.
The Art of Multiprocessor Programming 1st Edition
Use the Amazon App to scan ISBNs and compare prices.
Fulfillment by Amazon (FBA) is a service we offer sellers that lets them store their products in Amazon's fulfillment centers, and we directly pack, ship, and provide customer service for these products. Something we hope you'll especially enjoy: FBA items qualify for FREE Shipping and Amazon Prime.
If you're a seller, Fulfillment by Amazon can help you increase your sales. We invite you to learn more about Fulfillment by Amazon .
There is a newer edition of this item:
"Children of Blood and Bone"
Tomi Adeyemi conjures a stunning world of dark magic and danger in her West African-inspired fantasy debut. Learn more
Frequently bought together
Customers who bought this item also bought
Customers who viewed this item also viewed
About the Author
Maurice Herlihy received an A.B. in Mathematics from Harvard University, and a Ph.D. in Computer Science from M.I.T. He has served on the faculty of Carnegie Mellon University, on the staff of DEC Cambridge Research Lab, and is currently a Professor in the Computer Science Department at Brown University. Maurice Herlihy is an ACM Fellow, and is the recipient of the 2003 Dijkstra Prize in Distributed Computing. He shared the 2004 Gödel Prize with Nir Shavit, the highest award in theoretical computer science. In 2012 he shared the Edsger W. Dijkstra Prize In Distributed Computing with Nir Shavit.
Nir Shavit received a B.A. and M.Sc. from the Technion and a Ph.D. from the Hebrew University, all in Computer Science. From 1999 to 2011 he served as a member of technical staff at Sun Labs and Oracle Labs. He shared the 2004 Gödel Prize with Maurice Herlihy, the highest award in theoretical computer science. He is a Professor in the Electrical Engineering and Computer Science Department at M.I.T. and the Computer Science Department at Tel-Aviv University. In 2012 he shared the Edsger W. Dijkstra Prize In Distributed Computing with Maurice Herlihy.
Author interviews, book reviews, editors picks, and more. Read it now
Top customer reviews
There was a problem filtering reviews right now. Please try again later.
I can't understany why a professor of computer science would write code, not even try the code, and put it in a book for millions to see.
I tried to find an errata for the book, but I could not find one. I attempted to contact the author through the publishers website, was unable.
It is barely useful for a Java/C# programmer (as long as you ignore all the broken code presented), and take the rest of it with a grain of salt, they have the nerve to publish multithreaded algorithm implementations that they didn't even try on a computer.
I would not recommend the book. Most of the algorithms, once YOU have debugged them, require garbage collection anyway, it is of very little value to a C++ programmer.
Practitioners that are already well versed in parallel programming can jump directly to Chapter 7, however, I would suggest at least skimming Chapters 2, 3 and 4. Even those programmers who understand shared memory and locking may be shocked at how relaxed memory models or compiler optimizations can reorder operations causing innocent looking code to break.
Chapter 1 - Introduction
Why is this book called "The Art of Multiprocessor Programming" and not "The Art of Parallel Programming?" It is not by accident. There is a directed effort to explain parallel programming concepts as they relate to multi-core (or many-core) architectures. In particular, shared-memory multiprocessors have specific implementation details, such as cache coherence policies, that directly affect parallel software run on such architectures. The introduction gives a brief overview of the direction of the text: principles and practice.
Part 1 - Principles
Chapter 2 - Mutual Exclusion
Mutual exclusion is a key concept to multi-threaded programming, and this chapter is rightly placed at the beginning of the text. This chapter presents some of the foundational concepts in parallel computing, such as, understanding time related to operation interleavings, pessimistic critical sections, forward progress, deadlocks and fairness. In addition, some of the classic algorithms are presented here, such as Lamport's Ticket Locking and Peterson's 2-Threaded Lock.
Chapter 3 - Concurrent Objects
This chapter starts off simple, but gets complex fast. While experts will understand and acknowledge the importance of this chapter, less experienced programmers will find it very challenging to understand and may be turned off: don't give up!
My suggestion to non-experts is to focus on understanding two concepts of this chapter: hardware sequential consistency (3.4) and software linearizibility (3.5). Once you understand both concepts, skim all other sections except section 3.8.
Java programmers may want to pay special attention to the Java Memory Model section (3.8) and ill-formed unsynchronized code. General programmers will also be interested in this section as it is important to understand how the hardware's memory consistency model, the programming language's memory model and the compiler's operation reordering optimizations may interfere with what "looks like" correct code.
Do not be discouraged by the difficult of this chapter. It is one of the most difficult chapter in the text. Get through it and keep reading.
Chapter 4 - Foundations of Shared Memory
This chapter concentrates on understanding shared memory, the cornerstone of all multi-threaded applications. It explains how to implement shared memory that behaves "correctly" without using mutual exclusion. The different types of memory that are discussed are single-reader single-writer (SRSW), multiple-reader single-writer (MRSW) and multiple-reader multiple-writer (MRMW). This is an important chapter for non-experts to think about, as it explains how operation interleavings are not as discrete as we pretend they are and how shared memory should behave in all possible cases.
Chapter 5 - The Relative Power of Primitive Synchronization Operations
This chapter explains the varying strength of different wait-free synchronization primitives. Consensus numbers will undoubtedly confuse novice parallel programmers. In short, the higher the consensus number the better. A high consensus number, say N, for a synchronization primitive means that synchronization primitive can "correctly" solve the consensus problem for N concurrently executing threads. For example, critical sections have an infinite consensus number (e.g. support an infinite number of concurrent threads). Atomic registers have a consensus number of 1, they support only 1 thread's execution that is guaranteed to consistently and validly solve the consensus problem.
The most important point of this chapter (in my opinion) is that compare-and-swap (CAS), or compare-and-set, has an infinite consensus number (section 5.8). This is why modern instruction set architectures (ISAs) all provide CAS: it is critical to supporting an unlimited number of concurrently executing threads. Realizing the importance of CAS is vital for advanced parallel programmers who want to implement nonblocking algorithms.
Chapter 6 - Universality of Consensus
This chapter explains how to build universal consensus for your own concurrent objects. While it will be an interesting chapter for experts, novices may want to skip it.
Part II - Practice
Chapter 7 - Spinlocks and Contention
This chapter explains the important differences between different types of locking. It explains how to implement locks using assembly level operations (test-and-set and test-and-test-and-set), how to reduce bus contention using backoff, how to reduce cache pressure by having threads spin on their local cache memory and how to manage an unknown number of threads using locks.
After reading this chapter, most readers should have an appreciation for the hardware complexity of implementing something as simple as a lock. Some programmers may argue that they should not need to know how hardware behaves. While I would like to agree, the unfortunate state of multi-core programming currently requires a basic understanding of memory consistency models and cache behaviors. Herlihy and Shavit note this and make an effort to address it in a "just what you need to know" fashion, as done in this chapter.
Chapter 8 - Monitors and Blocking Synchronization
This chapter explains monitors, conditions, the differences between readers and writers, and reentrant locks. Java programmers will be especially interested in understanding monitors, while all OO programmers should have an appreciation of synchronizing an entire class. Moreover, the section on reentrant locks is simple but important to preventing deadlocks.
Unofficially, Chapters 9 - 11 focus on achieving parallelism in algorithms that have sequential bottlenecks and are therefore inherently sequential.
Chapter 9 - Linked Lists: The Role of Locking
The chapter explains how to implement ordered linked lists (e.g., IntSets or just sets) in a variety of different ways. The chapter starts out with the most basic implementation and then begins to increase performance by relaxing the strictness of the required thread synchronization.
Chapter 10 - Concurrent Queues and the ABA Problem
The chapter explains how to implement pools, a collection of unordered or ordered items. The chapter then explains the ways to implement pools as different types of queues, containers with first-in-first-out behavior. The chapter also explains a classic parallel problem known as ABA, where thread1 observes x == A, thread2 does x=B and then x=A and thread1 then observes x == A. The ABA problem is a subtle, but important problem.
Chapter 11 - Concurrent Stacks and Elimination
This chapter starts where the last chapter left off; it explains how to implement concurrent stacks, containers with first-in-last-out behavior. The chapter also explains a neat cancellation problem called elimination. Elimination is useful for avoiding overflows, underflows and other types of bounded problems.
Chapter 12 - Counting, Sorting and Distributed Coordination
This chapter explains how to take problems that seem to be inherently sequential and make them parallel. The chapter also explains both combining and diffracting trees, both of which are very interesting ways to make sequential problems parallel. This chapter is one of the more complex chapters of the text. Some readers may want to skim it.
Unofficially, Chapters 13 - 15 focus on achieving parallelism in algorithms that are inherently parallel. Readers will enjoy seeing how easy it is to extract parallelism in these naturally parallel algorithms.
Chapter 13 - Concurrent Hashing and Natural Parallelism
This chapter explains how to build parallel hash tables, with both open and closed addressing. First, the authors explain how to implement hash tables using coarse-grained locks, then with fine-grained locks, then with no locks at all. The chapter also explains how to deal with open-addressed hash tables which are particularly challenging.
Chapter 14 - Skiplists and Balanced Search
Most non-experts that make it this far in the text will be greatly rewarded by this chapter. Chapter 14 explains skiplists, an intriguing way to implement a container that has logarithmic search time and that is inherently parallel. Unlike red-black trees or balanced binary trees that yield logarithmic search time complexity, skiplists do not need to be rebalanced. Skiplists do not need to be rebalanced due to their unique algorithmic layering, making them inherently parallel. As such, skiplists have a notable benefit over their inherently sequential logarithmic search time counterparts.
This is a critically important chapter for practitioners hoping to exploit high parallelism while retaining logarithmic search time.
Chapter 15 - Priority Queues
This chapter explains how to implement priority queues, containers that are queues where each element has an identifiable level of importance. The authors demonstrate how to build priority queues with arrays, trees, heaps and skiplists.
Chapter 16 - Futures, Schedules and Work Distribution
This chapter presents some important aspects in understanding parallelism. In particular, the authors explain how to keep threads busy with work without causing the threads to become busy with "looking for work". The chapter also explains important ideas about the overhead of threads, stealing work, and sharing work. This chapter may cause some confusion to non-experts, but readers should try to understand at least the basic principles conveyed here as they are important to most general parallel programming problems.
Chapter 17 - Barriers
This chapter explains how to use barriers, a synchronization primitive that ensures threads move together through "gates" or "phases". Barriers are important in preventing threads from getting to far ahead or too far behind one another.
Chapter 18 - Transactional Memory
This chapter briefly describes a new parallel programming concept called transactional memory (TM). TM uses optimistic concurrency and greatly simplifies parallel programming. Herlihy and Shavit are responsible for HTM and STM, respectively.
The following ideas are touched on: HTM + cache coherence, composition, contention managers and transactional serialization. TM is currently receiving a lot of research attention and many researchers believe TM will soon become the new way to do parallel programming. Because of this, readers should pay particular attention to this chapter.
Herlihy and Shavit, begin by covering classical mutual exclusion algorithms that work by reading and writing shared memory. In addition, the authors examine various ways of specifying correctness and progress. They then describe the foundations of concurrent shared memory computing. Next, the authors show you a simple technique for proving statements of the form, where there is no wait-free implementation of X by Y. The authors continue by describing how to use consensus objects to build a universal construction that implements any concurrent object. In addition, they then try to make you understand how architecture affects performance, and how to exploit this knowledge to write efficient concurrent programs. The authors then show you how monitors are a structured way of combining synchronization and data. Next, the authors introduce several useful techniques that go beyond course-grained locking to allow multiple threads to access a single object at the same time. They continue by considering a kind of pool that provides first-in-first-out fairness. In addition, the authors show you how to implement concurrent stacks. They then show you how some important problems that seem inherently sequential can be made highly parallel by spreading out coordination tasks among multiple parties. Next, the authors look at concurrent hashing - a problem that seems to be naturally parallelizable or, using a more technical term: disjoint-access-parallel; meaning that concurrent method calls are likely to access disjoint locations, implying that there is little need for synchronization. They continue by looking at concurrent search structures with logarithmic depth. In addition, the authors describe how a priority queue typically provides an add() method to add an item to the set, and a removeMin() method to remove and return the item of minimal score. The authors then show you how to decompose certain kinds of problems into components that can be executed in parallel. Next, they discuss how a barrier is a way of forcing asynchronous threads to act almost as if they were synchronous. Finally, the authors review and analyze the strengths and weaknesses of the standard synchronization primitives, and describe some emerging alternatives that are likely to extend, and perhaps even to displace many of today's standard primitives.
This most excellent book focuses on computability: figuring out what can be computed in an asynchronous concurrent environment. Perhaps more importantly, this book deals with the practice of multiprocessor programming, and focuses on performance.
Most recent customer reviews
In the Java world this book deserves to stay right on the top together with Java...Read more