The Art of Computer Programming: Volume 3: Sorting and Searching (2nd Edition)

4.8 out of 5 stars 16 customer reviews
ISBN-13: 978-0201896855
ISBN-10: 0201896850


Editorial Reviews

From the Inside Flap

Cookery is become an art,
a noble science;
cooks are gentlemen.
TITUS LIVIUS, Ab Urbe Condita XXXIX.vi
(Robert Burton, Anatomy of Melancholy)

This book forms a natural sequel to the material on information structures in Chapter 2 of Volume 1, because it adds the concept of linearly ordered data to the other basic structural ideas.

The title "Sorting and Searching" may sound as if this book is only for those systems programmers who are concerned with the preparation of general-purpose sorting routines or applications to information retrieval. But in fact the area of sorting and searching provides an ideal framework for discussing a wide variety of important general issues:

  • How are good algorithms discovered?
  • How can given algorithms and programs be improved?
  • How can the efficiency of algorithms be analyzed mathematically?
  • How can a person choose rationally between different algorithms for the same task?
  • In what senses can algorithms be proved "best possible"?
  • How does the theory of computing interact with practical considerations?
  • How can external memories like tapes, drums, or disks be used efficiently with large databases?

Indeed, I believe that virtually every important aspect of programming arises somewhere in the context of sorting or searching!

This volume comprises Chapters 5 and 6 of the complete series. Chapter 5 is concerned with sorting into order; this is a large subject that has been divided chiefly into two parts, internal sorting and external sorting. There also are supplementary sections, which develop auxiliary theories about permutations (Section 5.1) and about optimum techniques for sorting (Section 5.3). Chapter 6 deals with the problem of searching for specified items in tables or files; this is subdivided into methods that search sequentially, or by comparison of keys, or by digital properties, or by hashing, and then the more difficult problem of secondary key retrieval is considered. There is a surprising amount of interplay between both chapters, with strong analogies tying the topics together. Two important varieties of information structures are also discussed, in addition to those considered in Chapter 2, namely priority queues (Section 5.2.3) and linear lists represented as balanced trees (Section 6.2.3).

Like Volumes 1 and 2, this book includes a lot of material that does not appear in other publications. Many people have kindly written to me about their ideas, or spoken to me about them, and I hope that I have not distorted the material too badly when I have presented it in my own words.

I have not had time to search the patent literature systematically; indeed, I decry the current tendency to seek patents on algorithms (see Section 5.4.5). If somebody sends me a copy of a relevant patent not presently cited in this book, I will dutifully refer to it in future editions. However, I want to encourage people to continue the centuries-old mathematical tradition of putting newly discovered algorithms into the public domain. There are better ways to earn a living than to prevent other people from making use of one's contributions to computer science.

Before I retired from teaching, I used this book as a text for a student's second course in data structures, at the junior-to-graduate level, omitting most of the mathematical material. I also used the mathematical portions of this book as the basis for graduate-level courses in the analysis of algorithms, emphasizing especially Sections 5.1, 5.2.2, 6.3, and 6.4. A graduate-level course on concrete computational complexity could also be based on Sections 5.3 and 5.4.4, together with Sections 4.3.3, 4.6.3, and 4.6.4 of Volume 2.

For the most part this book is self-contained, except for occasional discussions relating to the MIX computer explained in Volume 1. Appendix B contains a summary of the mathematical notations used, some of which are a little different from those found in traditional mathematics books.

Preface to the Second Edition

This new edition matches the third editions of Volumes 1 and 2, in which I have been able to celebrate the completion of TeX and METAFONT by applying those systems to the publications they were designed for.

The conversion to electronic format has given me the opportunity to go over every word of the text and every punctuation mark. I've tried to retain the youthful exuberance of my original sentences while perhaps adding some more mature judgment. Dozens of new exercises have been added; dozens of old exercises have been given new and improved answers. Changes appear everywhere, but most significantly in Sections 5.1.4 (about permutations and tableaux), 5.3 (about optimum sorting), 5.4.9 (about disk sorting), 6.2.2 (about entropy), 6.4 (about universal hashing), and 6.5 (about multidimensional trees and tries).

The Art of Computer Programming is, however, still a work in progress. Research on sorting and searching continues to grow at a phenomenal rate. Therefore some parts of this book are headed by an "under construction" icon, to apologize for the fact that the material is not up-to-date. For example, if I were teaching an undergraduate class on data structures today, I would surely discuss randomized structures such as treaps at some length; but at present, I am only able to cite the principal papers on the subject, and to announce plans for a future Section 6.2.5 (see page 6.2.5). My files are bursting with important material that I plan to include in the final, glorious, third edition of Volume 3, perhaps 17 years from now. But I must finish Volumes 4 and 5 first, and I do not want to delay their publication any more than absolutely necessary.

I am enormously grateful to the many hundreds of people who have helped me to gather and refine this material during the past 35 years. Most of the hard work of preparing the new edition was accomplished by Phyllis Winkler (who put the text of the first edition into TeX form), by Silvio Levy (who edited it extensively and helped to prepare several dozen illustrations), and by Jeffrey Oldham (who converted more than 250 of the original illustrations to METAPOST format). The production staff at Addison-Wesley has also been extremely helpful, as usual.

D. E. K.
Stanford, California
February 1998

There are certain common Privileges of a Writer,
the Benefit whereof, I hope, there will be no Reason to doubt;
Particularly, that where I am not understood, it shall be concluded,
that something very useful and profound is coucht underneath.
JONATHAN SWIFT, Tale of a Tub, Preface (1704)


From the Back Cover

The bible of all fundamental algorithms and the work that taught many of today's software developers most of what they know about computer programming.

Byte, September 1995

I can't begin to tell you how many pleasurable hours of study and recreation they have afforded me! I have pored over them in cars, restaurants, at work, at home... and even at a Little League game when my son wasn't in the line-up.

—Charles Long

If you think you're a really good programmer... read [Knuth's] Art of Computer Programming... You should definitely send me a resume if you can read the whole thing.

—Bill Gates

It's always a pleasure when a problem is hard enough that you have to get the Knuths off the shelf. I find that merely opening one has a very useful terrorizing effect on computers.

—Jonathan Laventhol

The first revision of this third volume is the most comprehensive survey of classical computer techniques for sorting and searching. It extends the treatment of data structures in Volume 1 to consider both large and small databases and internal and external memories. The book contains a selection of carefully checked computer methods, with a quantitative analysis of their efficiency. Outstanding features of the second edition include a revised section on optimum sorting and new discussions of the theory of permutations and of universal hashing.

Product Details

  • Hardcover: 800 pages
  • Publisher: Addison-Wesley Professional; 2 edition (May 4, 1998)
  • Language: English
  • ISBN-10: 0201896850
  • ISBN-13: 978-0201896855
  • Product Dimensions: 6.7 x 2.1 x 9.4 inches
  • Shipping Weight: 3 pounds
  • Average Customer Review: 4.8 out of 5 stars (16 customer reviews)
  • Amazon Best Sellers Rank: #259,623 in Books

Customer Reviews

Top Customer Reviews

By wiredweird on November 4, 2006
Format: Hardcover Verified Purchase
First the basics: it's great, it provides wide-ranging and deep analysis, it shows many views and variants of each problem, and its bibliography is helpful, though not exhaustive. The historical notes, including sorts for drum storage, may seem quaint to modern readers. And sorting has been done, right? You just run a shell program or call a function, and tap into the best technology. Does it need to be done again?

Yes, if you're on the edge of technology, it does need to be done again, and again, and again. That's because technology keeps expanding, and violating old assumptions as it does. Memories got big enough that the million-record sort is now a yawn, where it used to be a journal article. But, at the same time, processor clocks got 100-1000x ahead of memory speeds. All of a sudden, those drum-based algorithms are worth another look, because yesteryear's drum:memory ratios are a lot like today's memory:cache ratios of size and speed - and who doesn't want a 100x speedup? Parallel processing is moving from the supercomputing elite into laptops, causing more tremors in the ground rules. GPU and reconfigurable computing also open whole new realms of pitfalls as well as opportunities.

Knuth points out that the analyses have beauty in themselves, for people with eyes to see it. His analyses also demonstrate techniques applicable way beyond the immediate discussion, too. Today, though, I have nasty problems in technologies that no one really knows how to handle very well. I have to go back and check all the assumptions again, since so many of them changed. If that's the kind of problem you have, too, then this is the place to start.

Format: Hardcover
This book is a keystone work of computer science. Now and then one needs a "binary search" or a related algorithm, and Knuth's book has it. Such algorithms, although basic, are notoriously easy to get wrong. The style of writing requires the reader to have some mathematics and programming background. Otherwise a reader will need to study the writing style and algorithm description.

Computer Scientists are waiting for this skilled practitioner to finish his life's work, namely Vols. 4-7. Let us hope the author has the patience and time to accomplish it.

Format: Kindle Edition Verified Purchase
I can not make the claim that I have fully worked the exercises. Indeed, I have sadly barely touched them. However, I do feel like I learned something from this book every time I picked it up.

The writing style remains much more approachable than you probably think it is. Specifically, even the heavy math sections are fun to read through as an interested programmer. Sure, it is intimidating in that I don't think I fully followed the thought process on many sections on my first read through. However, coming back and trying multiple times usually left me feeling like I at least understood what was being discussed. Even if I'm not quite sure, yet, that I could have hit on some of these ideas myself.

The section at the end on searching on secondary keys is a true delight. Just plain fun to consider the different tricks that can be done with data.

Format: Hardcover
I just bought the book I needed out of the set. I needed to build a database that did not use any commercial package (this gives full access and no royalties). This book saved my bacon. I almost did not buy it when all I saw in it was math. But I was desperate and it paid off. Turns out you could not explain it any other way. This book goes way beyond binary and bubble sorts. I use it primarily for balanced trees. I may try something more exotic later. I cannot tell you about the other volumes, but this one will definitely pay for itself.

Art of Computer Programming, The, Volumes 1-3 Boxed Set (2nd Edition) (The Art of Computer Programming Series) (Vol 1-3)

Format: Hardcover
This book is the bible of computer programming.
It contains the most detailed explanation of searching and sorting methods I have ever found in a book. It contains all internal and external sorting and searching algorithms.
The only drawback of the book is that all algorithms are written in MIX, some kind of assembler, and because of that they are hard to read.

Format: Hardcover Verified Purchase
This book is getting seriously old now. Its logic will last for ever but its routines are in the style of 1960s Assembler, whereas modern computer languages offer quicker and much more efficient coding, together with far greater readability.

Numerical Recipes went for C++, which looks like a wrong decision, now that Python and C# both offer more and have fewer architectural defects than C++. C++ will of course continue to be used, because great chunks of our everyday life depend on it, but new initiatives are bound to go elsewhere if objective comparisons are made.

Who knows, will another language come along (perhaps based on functional programming) that in turn renders Python and C# obsolete? Perhaps we need a well-thought-out pseudo code which can be extended and will last essentially for ever?

Another point is that in the book Donald worries about (for example) whether Heap sort is faster than Quick sort. In real life, computers are so fast that it is no longer relevant whether a routine sorts a million integers in, say, 24ms or in 18ms, for that is where we have gotten to.

If this book is ever brought up to date, which I doubt, robustness and ease of debugging will dominate, where in Donald's day it had to be speed of execution.

One approach would be to issue brief updates as the state of the art evolves, to go to all who have purchased copies. Frequency? When there are enough significant advances to make it worthwhile. For example, if a new compiler includes fast and efficient binary tree routines, which would obsolete all current implementations of Heap sort.

Thank you, Donald, for you have laboured like Hercules to produce these three books. Without your skills the books could never have been written.
