Enter your mobile number below and we'll send you a link to download the free Kindle App. Then you can start reading Kindle books on your smartphone, tablet, or computer - no Kindle device required.
Getting the download link through email is temporarily not available. Please check back later.

  • Apple
  • Android
  • Windows Phone
  • Android

To get the free app, enter your mobile phone number.

Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL 1st Edition

4.6 out of 5 stars 20 customer reviews
ISBN-13: 978-1593271206
ISBN-10: 1593271204
Why is ISBN important?
This bar-code number lets you verify that you're getting exactly the right version or edition of a book. The 13-digit and 10-digit formats both work.
Scan an ISBN with your phone
Use the Amazon App to scan ISBNs and compare prices.
Have one to sell? Sell on Amazon
Buy used
Condition: Used: Very Good
Comment: Unbeatable customer service, and we usually ship the same or next day. Over one million satisfied customers!
Access codes and supplements are not guaranteed with used items.
42 Used from $0.47
FREE Shipping on orders over $25.
More Buying Choices
20 New from $9.32 42 Used from $0.47

Windows 10 For Dummies Video Training
Get up to speed with Windows 10 with this video training course from For Dummies. Learn more.
click to open popover

Editorial Reviews

About the Author

Michael Schrenk uses webbots and data-driven web applications to create competitive advantages for businesses. He has written for Computerworld and Web Techniques magazines and has taught courses on Web usability and Internet marketing. He has also given presentations on intelligent Web agents and online corporate intelligence at the DEFCON hacker's convention.


New York Times best sellers
Browse the New York Times best sellers in popular categories like Fiction, Nonfiction, Picture Books and more. See more

Product Details

  • Paperback: 306 pages
  • Publisher: No Starch Press; 1 edition (March 30, 2007)
  • Language: English
  • ISBN-10: 1593271204
  • ISBN-13: 978-1593271206
  • Product Dimensions: 7 x 1 x 9.2 inches
  • Shipping Weight: 1.4 pounds
  • Average Customer Review: 4.5 out of 5 stars  See all reviews (20 customer reviews)
  • Amazon Best Sellers Rank: #1,512,527 in Books (See Top 100 in Books)

Customer Reviews

Top Customer Reviews

Format: Paperback Verified Purchase
"Webbots, Spiders, adn Screen Scrapers" is a solid book for building basic scripts to do web scraping. Michael Schrenk goes covers the "should you do this" aspect very well, and devotes much of the book to these kinds of topics. On that reason alone I give him major kudos, "just because you CAN do a thing, doesn't mean you SHOULD."

Technically the book and examples are very basic and beginner level. All code is procedural and has absolutely no references to object oriented programming at all. This is great for a simple project, but building anything larger than a targetted webbot or two is beyond the scope of this book.

I was very dismayed at Mr. Schrenk's opinion of regular expressions:
"The use of regular expressions is a parsing language in itself, and most modern programming languages support aspects of regular expressions. In the right hands, regular expressions are also useful for parsing and substituting text; however, they are famous for thier sharp learning curve and cryptic syntax. I avoid regular expressions whenever possible."

This disregard for regular expressions effectively wipes out a powerful toolset for budding developers. Regular expressions are no harder to learn than PHP. The reasons for his disdain for them is also flawed:

"The regular expression engine used by PHP is not as efficient as engines used in other languages, and is certainly less efficient than PHP's built-in functions for parsing HTML."

PHP uses the same regular expression engine used (very effectively) in PERL with the use of the preg_* functions. There has been many studies that show preg_* style expressions outperform basic text matching in PHP. In this assesment the author is terribly wrong.
Read more ›
5 Comments 38 people found this helpful. Was this review helpful to you? Yes No Sending feedback...
Thank you for your feedback.
Sorry, we failed to record your vote. Please try again
Report abuse
Format: Paperback
I picked up this book full of enthusiasm, spiders are just plain cool, they go out and start downloading data for you, reading webpages, and even understanding them a little. My enthusiasm was dashed a little however on page four: “You may use any of the scripts in this book for your own personal use, as long as you agree not to redistribute them... and agree not to sell or create derivative products under any circumstances.”. I develop in PHP professionally, and a lot of the code I write ends up getting used somewhere with some sort of a for-profit basis, which pretty effectively prevents me from using any code between the covers (at its strictest reading, I’m not sure I can even change the code).

The book does a great job of introducing different sorts of web agents that you can create programatically (more than just spiders) and introduces all sorts of interesting projects along those lines. Throughout the book a series of libraries written by the author are leveraged to make the retrieval and parsing of the various pages much easier. While newer developers will enjoy being able to concentrate on the “big picture” I found myself itching for more information on the nitty gritty.

Some of the projects explored include: price monitoring, image capturing (want to be your own google image search? :) ), link verification, spiders, and snipers. Each of the different projects received it’s own chapter, and effectively covered a lot of the topics covered within.

Overall, I would recommend this book to beginner to intermediate PHP developers looking to tackle the world of web agents, it’s a good primer on the related topics, and at the very least will give you some ideas on the complexities involved.
Read more ›
1 Comment 24 people found this helpful. Was this review helpful to you? Yes No Sending feedback...
Thank you for your feedback.
Sorry, we failed to record your vote. Please try again
Report abuse
Format: Paperback Verified Purchase
This book covers every aspect I could ever hope a book on web bots would cover. It goes into great detail and provides lots of background information about things such as why you should use web bots, security issues, how to authenticate a bot with password protected sites, writing search engine crawlers, parsing HTML, how to handle cookies, HTTP headers, dealing with forms and a lot more.

I was very pleased with how this book covered concepts. The book uses PHP and the cURL library as a teaching tool instead of trying to give a lesson in how to use PHP as a crawler language. The way the code is explained makes it very easy to translate into whatever language you are most comfortable coding in. The book uses fundamental functional programming concepts which make it easy to pick up the general idea without actually knowing PHP.

My boss bought this book to help my group us with a project we were working on, and even my co-workers who had no background with PHP were able to use this book to write a web bot in C# (using the cURL library) very easily. The concepts from this book easily transfered over to object-oriented concepts.
Comment 10 people found this helpful. Was this review helpful to you? Yes No Sending feedback...
Thank you for your feedback.
Sorry, we failed to record your vote. Please try again
Report abuse
Format: Paperback Verified Purchase
I waited months for this book to come out and the wait was worth it. This is a great introduction to webbots , spiders and scrapers. The writing is easy and never boring. Lots of code examples and resources to tap into. I couldn't put it down. When was the last time you got a computer book that made you run to the keyboard to try something out?

I'm sure there will be some a#$h@#e that will say it's too rudimentory. It's an intro and it takes you up to intermediate and explains stuff about PHP that I didn't even know existed. Definitely worth the money. I can't wait for the sequel.
Comment 18 people found this helpful. Was this review helpful to you? Yes No Sending feedback...
Thank you for your feedback.
Sorry, we failed to record your vote. Please try again
Report abuse

Most Recent Customer Reviews