In programming, as in everything else, to be in error is to be reborn.
-- Alan J. Perlis
I wish I could start this preface by writing that the book you are holding in your hands is the result of a carefully planned premeditated publishing effort that started with the title Code Reading: The Open Source Perspective and is now being completed with Code Quality. Writing so would, however, be twisting the true facts, adjusting reality to the orderly world we engineers like to see around us. The truth is that Code Quality is mostly the result of a series of fortuitous accidents.
When I signed the contract to publish Code Reading, I had in my hands the outline and a couple of completed chapters. I naively calculated the book's length and the completion schedule, based on the length and effort of the chapters I had already written. Now, if you are writing software for a living, you can probably guess that at the time the manuscript was supposed to have been finished, I had covered just slightly more than half the chapters in the outline and had already used up all the allotted pages. Looking for a respectable exit strategy, I suggested to my editor publishing the material I had completed (minus a chapter on portability) as the first volume of Code Reading and continuing the rest of the work in a second volume. We agreed, and Code Reading got published, received a number of favorable reviews, appeared in the list of the 2004 Software Development Magazine Productivity Awards, and got translated into six other languages.
In Code Reading, by using real-life examples taken out of working, open source projects, I tried to cover most code-related concepts that are likely to appear before a software developer's eyes, including programming constructs, data types, data structures, control flow, project organization, coding standards, documentation, and architectures. My plan for the second volume was to cover interfacing and application-oriented code, including the issues of internationalization and portability, the elements of commonly used libraries and operating systems, low-level code, domain-specific and declarative languages, scripting languages, and mixed-language systems. However, with Code Reading in the hands of programmers, I now had the benefit of readership opinions. The feedback I received indicated that many were eagerly waiting for the follow-up volume, but a detailed dissection of a device driver (one of the chapters I had left for a subsequent volume) was not the material they had in mind for it. In July 2003, my then editor, Mike Hendrickson, suggested working on a book titled Secure Code Reading. Although security is an area that interests me as a scientist, I was loath to jump onto the security book bandwagon and wrote a corresponding chapter instead. With one chapter on portability and one on security, I could suddenly see the book's theme and title before my eyes. Code Quality would cover how to read and write software code, focusing on its quality attributes, those also often described as nonfunctional properties.
The nonfunctional properties we can discern from reading a software system's code are associated with the product's nonfunctional requirements: the requirements that are not directly concerned with specific functions delivered by the system but that deal with broader emergent system properties. Some common nonfunctional properties are the various -ilities of a system: reliability, portability, usability, interoperability, adaptability, dependability, and maintainability. Two other significant nonfunctional properties concern the system's efficiency: its performance related to time constraints and its space requirements.
The skill of reading code to discern its nonfunctional properties is crucial for two important reasons. First of all, a failure to satisfy a nonfunctional requirement can be critical, even catastrophic. A system that gets some functional requirements wrong (most software products contain such errors) may well be able to operate in a degraded mode; users can be instructed to avoid using some part of the functionality. On the other hand, errors in nonfunctional properties are often showstoppers: an insecure web server or an unreliable antilock brake system (ABS) are worse than useless. In addition, nonfunctional requirements are sometimes difficult to verify. We cannot write a test case to verify a system's reliability or the absence of security vulnerabilities. Therefore, both the critical nature of nonfunctional properties and the difficulty in verifying them suggest that when dealing with nonfunctional requirements and the corresponding software properties, we need to muster all the help we can get. The ability to associate code with nonfunctional properties can be a powerful weapon in a software engineer's arsenal.
Apart from the different perspective, Code Quality follows the successful recipe of Code Reading: focus on the reading of existing code, deal exclusively with real-world examples taken out of existing open source systems, reference all examples to their source, dissect code with annotated listings, provide meaningful exercises to strengthen the reader's critical ability and skills, identify coding idioms and traps in the text's margin, summarize each chapter's advice in the form of maxims, tie practice with theory in the Further Reading section, and use the Unified Modeling Language (UML) for all diagrams. From that recipe, the trickiest ingredient was my self-imposed rule to avoid toy examples, drawing all code samples from existing open source projects. By following the rule, I often found myself spending hours to find an appropriate example: one that would illustrate the concept I was presenting, while also being understandable and short enough to include in the book. I found this exercise both intellectually stimulating and a great way to impose discipline on my writing. Often, while searching for a particular weakness, I encountered other interesting elements worthy of discussion. At other times, my search for an example of a theoretical concept proved fruitless: In those cases, I could then credibly reason that the concept was not important enough in practice to include in the text.
The rationale and motivation behind Code Quality are also the same as those that started Code Reading: The reading of code is likely to be one of the most common activities of a computing professional, yet it is seldom taught as a subject or formally used as a method for learning how to design and program. The popularity of open source software has provided us with a large body of code that we can all freely read and learn from. A primer and reader, based on open source software, can be a valuable tool for improving one's programming abilities. I therefore hope that the existence of the two books will spur interest to include code-reading courses, activities, and exercises in the computing education curriculum so that in a few years, our students will learn from existing open source systems, just as their peers studying a language learn from the great literature.
Content and Supplementary Material
I decided to base the source code examples for Code Quality on the same systems and distributions as those I used in Code Reading. I reasoned that it was important to provide continuity between the two volumes, allowing the reader to see how the same source code can be read to discern the functional, architectural, and design characteristics covered in Code Reading and the nonfunctional characteristics covered in Code Quality.
The code used in this book comes from code snapshots that are now mostly only of historic value. This has, however, provided me with the opportunity to show real security vulnerabilities, synchronization problems, portability issues, misused API calls, and other bugs that were identified and fixed in more recent versions. The code base's age makes it likely that its authors by now either have advanced to management positions where reading books such as this one is frowned upon or have an eyesight unable to deal with this book's fonts. These changes conveniently provide me with a free license to criticize code without fear of nasty retributions. Nevertheless, I understand that I can be accused of disparaging code that was contributed by its authors in good faith to further the open source movement and to be improved upon rather than be merely criticized. I sincerely apologize in advance if my comments cause any offense to a source code author. In defense, I argue that in most cases, the comments do not target the particular code excerpt but rather use it to illustrate a practice that should be avoided. Often the code I am using as a counterexample is a sitting duck, as it was written at a time when technological and other restrictions justified the particular coding practice, or the particular practice is criticized out of context. In any case, I hope that the comments will be received good-humoredly and openly admit that my own code contains similar, and probably worse, misdeeds.
I chose all the systems used in the book's examples for practical reasons having to do with the suitability of the code as an instructional vehicle. Things I looked for were code quality, structure, design, utility, popularity, and a license that would not make my publisher nervous. I strived to balance the selection of languages, actively looking for suitable Java and C++ code. However, where similar concepts could be demonstrated using different languages, I chose to use C as the least common denominator. Thus, 61% of the code references in the book are to C code; these include examples related to programming in the small (applicable to any language) and systems programming (which is done mostly in C). Another 19% of the examples refer to Java code. I chose to use Java code to demonstrate elements associated with object-oriented concepts and the corresponding APIs. Most of these concepts also apply...