When you move beyond the browser to accessing the Internet from the command line, a whole world of unexplored possibilities opens up. Michael Schrenk, the author of Webbots, Spiders, and Screen Scrapers, starts with the basics and works through progressively more difficult examples as he details his craft. The Internet browser was a great invention, yet much more can be done on the Internet when one takes an automated approach using PHP scripts.
Part 1 steps the reader through downloading web pages and basic parsing techniques, and gives an overview of regular expressions. The book's companion web site provides a library of PHP scripts that can be used to experiment with the sample projects outlined in Part 2. It also hosts files that give the developer an opportunity to learn webbot design in a controlled environment.
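To give a flavor of the basic parsing and regular expressions that Part 1 covers, here is a minimal sketch of my own. It is not taken from the book or its PHP library: it is written in Python for brevity, the helper names text_between and extract_links are hypothetical, and the sample page is a made-up string standing in for a downloaded web page.

```python
import re
from typing import List, Optional

# Hypothetical sample page; a real webbot would download this text instead.
SAMPLE_PAGE = """<html><head><title>Example Store</title></head>
<body><a href="/item/1">Widget</a> <a href="/item/2">Gadget</a></body></html>"""


def text_between(text: str, start: str, stop: str) -> Optional[str]:
    """Return the text between the first `start` and the next `stop`, or None."""
    # Escape the delimiters so they are matched literally, not as regex syntax.
    pattern = re.escape(start) + r"(.*?)" + re.escape(stop)
    match = re.search(pattern, text, flags=re.DOTALL | re.IGNORECASE)
    return match.group(1) if match else None


def extract_links(text: str) -> List[str]:
    """Return the href value of every double-quoted anchor tag in `text`."""
    return re.findall(r'<a\s[^>]*href="([^"]*)"', text, flags=re.IGNORECASE)


title = text_between(SAMPLE_PAGE, "<title>", "</title>")
links = extract_links(SAMPLE_PAGE)
print(title)  # Example Store
print(links)  # ['/item/1', '/item/2']
```

Regular expressions like these are fine for quick experiments on pages whose layout you know, but they break easily on irregular markup, so treat this as an illustration rather than a robust parser.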
I liked the fact that the author addresses real-world business concerns regarding webbots and their legal usage. Chapter 26 explores the design considerations for a stealthy webbot and how stealth relates to maintaining a competitive business advantage. A chapter is also devoted to being a good web citizen when accessing someone's web site with a webbot. A poorly designed webbot could consume excessive bandwidth or disrupt someone's livelihood. A webbot developer must also consider the intellectual property of others. The author cautions developers to be respectful of the resources their webbots consume.
This book is a great resource for those looking to move beyond the Internet browser with automated solutions for collecting and using data. It should stimulate your imagination about what can be done. We are also reminded that there is very little, from a technological standpoint, that distinguishes a harmful webbot from a beneficial one. It is up to the developer to create scripts that do no harm. I would recommend this book.
Disclosure: I received a free e-book copy for review purposes.