Optimizing the performance of computer systems has always been an art relegated to a few individuals who happen to have the "right skills." UNIX systems have not escaped this syndrome. It is rare to find anyone who knows how to instrument the system, let alone tune it. This is by no means a fault of the general user community. The problem turns out to be rather complex, requiring good knowledge of computer architecture, UNIX design, and performance-monitoring tools.
Due to a lack of standards in the system performance management area, vendors often take liberties with substituting, enhancing, or altogether removing system-monitoring tools. Even when a familiar command does exist on a system, it may have subtle differences that can easily mislead you. One such example is the unit for some of the fields. In a typical manual page, you see frequent references to units of "blocks" or "pages." Yet there rarely is an indication of how big these things are. As you will see later in this book, a page can be anywhere from 512 bytes to 8 kilobytes, making it very hard to interpret such data correctly.
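To make the ambiguity concrete, here is a minimal Python sketch showing how the same page count translates into very different amounts of memory depending on the assumed page size. The helper names `page_size_bytes` and `pages_to_kbytes` and the sample count of 1000 pages are hypothetical, chosen only for illustration. The page size the kernel reports through `sysconf` is also not necessarily the unit a given monitoring tool uses for its "pages" or "blocks" fields.

```python
import os


def page_size_bytes():
    """Return the memory page size reported by the running system, in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


def pages_to_kbytes(pages, page_size):
    """Convert a page count to Kbytes, taking 1 Kbyte as 1024 bytes."""
    return pages * page_size / 1024


if __name__ == "__main__":
    print(f"This system reports a page size of {page_size_bytes()} bytes")

    # Hypothetical reading: a tool reports 1000 "pages" with no unit given.
    reported_pages = 1000
    for assumed_size in (512, 4096, 8192):
        kbytes = pages_to_kbytes(reported_pages, assumed_size)
        print(f"{reported_pages} pages of {assumed_size} bytes = {kbytes:.0f} Kbytes")
```

Between the two extremes, the same reading differs by a factor of 16, which is why the unit must be confirmed before any conclusion is drawn from such a field.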
Beyond the tools, there are also a number of limitations in the UNIX architecture itself. Without knowing about these deficiencies, you could easily chase the wrong problem. A classic example is when people blame the hardware instead of UNIX and vice versa. In the end, we hope that you do not misinterpret our criticisms of one of the best operating systems around. Perhaps our only excuse for pointing out these deficiencies stems from a wise saying that states:
If you cannot criticize something, you do not understand it well enough!
I.1 General Style
In this book, we take a system approach to performance optimization by covering everything from user applications all the way down to the hardware. At the same time, we try not to assume that you have a strong background in hardware architecture or UNIX internals or, for that matter, extensive experience with UNIX itself. In case you have already worked seriously in any of these areas, we explain each topic in a separate chapter, making it easy to skip the ones you know.
You will probably also notice that we have dedicated considerably more space to analysis than to simple cookbook procedures. While cookbook procedures do have their place (and we have included a fair number in this text), they are of little use unless you know when to apply them. Armed with an in-depth knowledge of what is going on inside your system, you will be better able to identify the true nature of its performance bottlenecks. As a bonus, you will be in a position to solve a wider set of problems than those covered here.
In a departure from other texts on this topic, we have taken a very pragmatic view by emphasizing modern techniques for tuning UNIX systems. Had this book been written in the early 1980s, we would have focused heavily on how to modify operating system parameters to either squeeze the last byte out of the system or save a few CPU cycles. The advice would have been sound at the time because the average machine ran at well under 5 MIPS and had around 8 megabytes of memory. Any amount of savings would have seemed significant. Current CPUs are orders of magnitude faster and come with tens or even hundreds of megabytes of memory. The result is that the benefits of many of these optimization techniques are simply "lost in the noise." So, rather than relying on obsolete advice, we focus on higher-level approaches to system optimization. These approaches include optimization of the system hardware, general techniques for resource utilization, and better use of the system and network.
Alas, old habits die hard, and users have a fondness for "poking" values into their system. For this reason, we also cover those parameters and tuning methods that have at least some noticeable impact on system performance. But we recommend again that you stay away from them, if for no other reason than portability. Higher-level techniques work across different UNIX implementations and, for that matter, other operating systems. With their larger impact on system throughput and response time, they are also more rewarding to implement.
We start this book by covering the basic principles behind performance monitoring and optimization. They are helpful in forming a strategy for attacking performance problems and steering you clear of potential pitfalls. Although the information presented in this chapter may seem simple in nature, its impact is significant.
Chapter 2 is aimed at giving you a pragmatic overview of the hardware architecture. We are not too worried about the theoretical aspects of this field, about which there are many excellent texts. Instead, we cover the major components in a high-performance computer system and show how design decisions made by the system and chip vendors affect the performance of your system. The information should help you determine when a performance problem is a result of the inherent design of the hardware and not UNIX.
Chapter 3 is dedicated to an architectural overview of modern implementations of UNIX as it relates to system performance. Our focus is not to teach you the entire operating system (which would occupy a book larger than this one) but to point out those aspects that have an impact on monitoring and optimization of the system. As a result, the topics are presented in fairly terse form, which may be hard to understand. We have made sure, however, that all the necessary facts are highlighted so that complete understanding of the material is not necessary.
Armed with basic knowledge of the hardware and UNIX, you are now ready to start instrumenting your system and to look for performance bottlenecks. We have opted to divide the material into two chapters, each dedicated to one of the traditional implementations of UNIX today, namely System V and BSD. Alas, vendors routinely mix and match BSD and System V tools, so it may be necessary to read both chapters. To make it easier, we have listed the tools available in the most common versions of UNIX in Table I.1 along with the relevant chapter in this book.
Because there are still a large number of users connected to UNIX systems through serial ports and ASCII terminals, we have dedicated Chapter 6 to UNIX terminal support. Because the same code deals with modems and sometimes networking, the information should also be useful to those who use workstations and other UNIX systems. Also included is coverage of tools that let you instrument the terminal subsystem.
Once you find the system bottlenecks by using the monitoring tools, it is time to eliminate or reduce their impact on system performance. Chapter 7 covers the best techniques for dealing with typical shortages such as memory, disk bandwidth, and CPU resources. Again, we cover both high-level techniques for reconfiguration of the system and detailed fine-tuning of each subsystem.
Because it is rare to find UNIX systems that run stand-alone these days, Chapter 8 focuses on basic UNIX networking. This includes the complete suite of TCP/IP along with coverage of various networks and topologies. Because the networking implementation in UNIX is very monolithic with very little room for fine-tuning, we have focused the material on the best ways to configure the network and system to avoid performance problems at the start.
Given the widespread usage of NFS, we have dedicated Chapter 9 to its operation and optimization techniques. We point out some major deficiencies in the NFS design and ways to sidestep them.
The X Window System is covered in Chapter 10, starting with an in-depth overview of its architecture. True to our form, we point out its deficiencies as implemented on top of UNIX. Even though X is not generally tunable, we have nevertheless uncovered a few techniques for optimizing it.
Computer marketing is full of buzzwords describing the speeds and feeds of various components of the system. Invariably, these terms are derived from some set of benchmarks. To prepare you for your next computer purchase, Chapter 11 covers the most popular industry-standard benchmarks. We describe not only what the benchmarks purport to measure but also what the results actually reflect. Because benchmarks are based on pieces of code that are bound to be different from your application, we also attempt to correlate the results to real-life applications.
Chapter 12 is dedicated to the ins and outs of selecting systems and hardware components for best performance. We cover a broad range of systems from PCs to high-end RISC systems. With the information in this chapter, you should be able to select the best hardware for your application so that performance problems do not surface later.
Chapter 13, which covers optimization of UNIX programs, may seem out of place in such a text. However, these techniques give you an additional and powerful tool for getting the most performance out of your system and applications. We cover the standard UNIX profiling and timing tools, which help you identify what parts of an application can benefit from optimization. This discussion is followed by some common techniques for speeding up typical code sequences and algorithms. The coverage remains brief in this area because a full treatment would fill the entire text. References are provided, however, for those interested in more detailed information.
Throughout this book, we use the System V and SysV designations in reference to all variants of System V from Release 3.2 to 4.X. Even though these releases share many components, we make sure to point out if a feature is specific to a particular version of System V. A case in point is System V Release 4.X (commonly abbreviated to SVR4), which is quite a departure from older releases of System V. Unless we state otherwise, the SVR4 designation applies only to versions of UNIX that are "pure" implementations of System V Release 4.X.
As of this writing, Sun (with Solaris), SONY, TANDEM, NEC, Pyramid Technology, and Novell (with UnixWare) are some of the vendors that fall in this category. Others, such as SGI, have operating systems that are compatible with SVR4 from a user point of view, but their kernel does not necessarily match the SVR4 sources. Although there is nothing wrong with their approach, the algorithms in these operating systems may not match those used in SVR4.
Being fairly picky about precision of units, we use the designations Kbytes, Mbytes, Gbytes, and Tbytes to refer to kilobytes, megabytes, gigabytes, and terabytes, respectively. Likewise, Kbits, Mbits, Gbits, and Tbits refer to kilobits, megabits, gigabits, and terabits. We stay away from terms such as MB and Mb, which are easily confused with each other.
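The cost of mixing up the two families of units is a factor of eight, as the following minimal Python sketch illustrates. The function names and the 10 Mbits-per-second link rate are hypothetical, chosen only for illustration, and the conversion ignores any protocol overhead.

```python
BITS_PER_BYTE = 8


def mbits_to_mbytes(mbits):
    """Convert a quantity in Mbits to the equivalent quantity in Mbytes."""
    return mbits / BITS_PER_BYTE


def mbytes_to_mbits(mbytes):
    """Convert a quantity in Mbytes to the equivalent quantity in Mbits."""
    return mbytes * BITS_PER_BYTE


if __name__ == "__main__":
    # Hypothetical link rated at 10 Mbits per second.
    link_mbits = 10
    print(f"{link_mbits} Mbits/s = {mbits_to_mbytes(link_mbits)} Mbytes/s")
```

Reading the rating of such a link as Mbytes instead of Mbits would overstate its raw capacity eightfold.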