The Apache Web Server (commonly known as "Apache") is, by most measures, the leading server on the Web today. For ten years it has been the unrivaled and unchallenged market leader, with approximately 70 percent of all websites running Apache. It is backed by a vibrant and active development community that operates under the umbrella of the Apache Software Foundation (ASF), and it is supported by a wide range of people and organizations, ranging from giants such as IBM down to individual consultants.
The key characteristics of Apache are its openness and diversity. The source code is completely open: Not only the current version, but also past versions and experimental development versions can be downloaded by anyone from apache.org. The development process is also open, with the exception of a few matters dealing with project management. Apache's diversity is a reflection of its user and developer communities: It is equally at home in an ultra-high-volume site that receives tens of thousands of hits per second, a complex and highly dynamic web application, a bridge to a separate application server, or a simple homepage host. The inclusion of developers from such diverse roles helps ensure that Apache continues to serve all of these widely differing environments successfully.
Yet that doesn't mean Apache follows a one-size-fits-all approach. Its highly modular architecture is built on a small core, which enables every user to tailor it to meet his or her own specific needs. Apache serves equally well as a stand-alone webserver or a component in some other system. Most importantly, it is a highly flexible and extensible applications platform.
Audience and Readership
This book is intended for software developers who are working with the Apache Web Server. It is the first such book published since March 1999, and the first and (to date) only developer book that is relevant to Apache 2.
The book's primary purpose is to serve as an in-depth textbook for module developers working with Apache. The narrative and examples deal with development in C, and a working knowledge of C is assumed. However, the Apache architecture and API are shared by major scripting environments such as mod_perl and mod_python, as well as C. With the exception of Chapter 3 (on the Apache Portable Runtime), much of this book should also be relevant to developers working with scripting languages at any level more advanced than standard CGI. The current Apache release—version 2.2—is the primary focus of this book. Version 2.2.0 was released in December 2005 and, given Apache's development cycle, is likely to remain current for some time (the previous stable version 2.0 was released in April 2002). This book is also very relevant to developers who are still working with version 2.0 (the architecture and API are substantially the same across all 2.x versions), and is expected to remain valid for the foreseeable future.
Organization and Scope
This book comprises twelve chapters and three appendixes.
The first chapter is a nontechnical overview that sets the scene and introduces the social, cultural, and legal background of Apache. It is followed by an extended technical introduction and overview that is spread over the next three chapters. Chapter 2 is a technical overview of the Apache architecture and API. Chapter 3 introduces the Apache Portable Runtime (APR), a semi-autonomous library that is used throughout Apache and relieves the programmer of many of the traditional burdens of C programming. Chapter 4 discusses general programming techniques appropriate to working with Apache, to ensure that your modules work well across different platforms and environments, remain secure, and don't present difficulties to systems administrators.
The central part of the book moves from the general to the specific. Chapters 5-8 present detailed discussions of various aspects of the core function of a webserver-- namely, processing HTTP requests. A number of real-life modules are developed in these chapters. Chapter 5 starts with a "Hello World" example and takes you to the point where you can duplicate the function of a CGI or PHP script as a module. Chapter 6 describes the request processing cycle and working with HTTP metadata. Chapter 7 goes into more detail about identifying users and handling access control. Chapter 8 presents the filter chain and techniques for transforming incoming and outgoing data; it includes a thorough theoretical exposition and several examples. Chapter 9 completes the core topics by describing how to work with configuration data.
Chapters 10 and 11 present more advanced topics that are nevertheless essential reading for serious application developers. Chapter 10 looks at the mechanics of how the API works, and describes how a module can extend it or introduce an entirely new API or service for other modules. Chapter 11 presents the DBD framework for SQL database applications. Chapter 12 briefly discusses troubleshooting and debugging techniques.
The appendixes include Apache legal documents reproduced from the Web. They are extremely relevant to the book but were not written by the author. Appendix A is the Apache License. Appendix B includes the Contributor License Agreements, which cover issues related to intellectual property. Finally, the authoritative Hypertext Transfer Protocol (HTTP/1.1) standard (RFC 2616) is reproduced in full in Appendix C as reference documentation for developers of web applications.
What the Book Does Not Cover
This book is firmly focused on applications development, so it has very little to say about systems programming for or with Apache. In particular, if your goal is to port Apache to a hitherto-unsupported platform, the book offers no more than a pointer to the areas of code you'll need to work on.
Apart from that, there is one important omission: The book limits itself to considering Apache as a server for HTTP (and HTTPS), the protocol of the Web. Although the server can be used to support other protocols, and implementations already exist for FTP, SMTP, and echo, this book has nothing to say on the subject. Nevertheless, if you are looking to implement or work with another protocol, the overview and the discussion of HTTP protocol handling should help you get oriented.
Some of the modules used as examples are written especially for this book or similar instructional materials:
- Chapter 5: mod_helloworld
- Chapter 6: mod_choices (derived from a non-open-source module)
- Chapter 7: mod_authnz_day
- Chapter 8: mod_txt (written originally for http://www.apachetutor.org)
These modules can be downloaded from http://www.apachetutor.org.
All of the more substantial modules are taken from real-life sources. Except where otherwise indicated and referenced by URL, all modules are taken from either the Apache standard distribution (http://httpd.apache.org) or the author's company's site (http://apache.webthing.com). Please note that the use of any source code in this book does not imply a license to copy it other than for purely personal use. Please refer to the license terms in the original sources of each module.