Customer Reviews


17 Reviews
5 star: (13)
4 star: (1)
3 star: (2)
2 star: (1)
1 star: (0)

The most helpful favorable review

9 of 9 people found the following review helpful:
5.0 out of 5 stars Great Book with Lots of Information
This book covers every aspect I could ever hope a book on web bots would cover. It goes into great detail and provides lots of background information about things such as why you should use web bots, security issues, how to authenticate a bot with password-protected sites, writing search engine crawlers, parsing HTML, how to handle cookies, HTTP headers, dealing with...
Published on August 25, 2007 by D. Herbert

versus

The most helpful critical review

32 of 35 people found the following review helpful:
2.0 out of 5 stars Does the basics.
"Webbots, Spiders, and Screen Scrapers" is a solid book for building basic scripts to do web scraping. Michael Schrenk covers the "should you do this" aspect very well and devotes much of the book to these kinds of topics. For that reason alone I give him major kudos: "just because you CAN do a thing, doesn't mean you SHOULD."

Technically the book and...
Published on December 5, 2007 by Brian



32 of 35 people found the following review helpful:
2.0 out of 5 stars Does the basics., December 5, 2007
By Brian "eateroftheham" (Crown Point, IN United States)
This review is from: Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL (Paperback)
"Webbots, Spiders, and Screen Scrapers" is a solid book for building basic scripts to do web scraping. Michael Schrenk covers the "should you do this" aspect very well and devotes much of the book to these kinds of topics. For that reason alone I give him major kudos: "just because you CAN do a thing, doesn't mean you SHOULD."

Technically, the book and examples are very basic and beginner level. All code is procedural and makes no reference to object-oriented programming at all. This is fine for a simple project, but building anything larger than a targeted webbot or two is beyond the scope of this book.

I was very dismayed at Mr. Schrenk's opinion of regular expressions:
"The use of regular expressions is a parsing language in itself, and most modern programming languages support aspects of regular expressions. In the right hands, regular expressions are also useful for parsing and substituting text; however, they are famous for their sharp learning curve and cryptic syntax. I avoid regular expressions whenever possible."

This disregard for regular expressions effectively wipes out a powerful toolset for budding developers. Regular expressions are no harder to learn than PHP. The reasons for his disdain are also flawed:

"The regular expression engine used by PHP is not as efficient as engines used in other languages, and is certainly less efficient than PHP's built-in functions for parsing HTML."

PHP's preg_* functions use PCRE, a Perl-compatible regular expression engine modeled on the one used (very effectively) in Perl. There have been many studies showing that preg_* style expressions outperform basic text matching in PHP. In this assessment the author is terribly wrong.

The book does a great job of explaining how to make single-use scripts for scraping, but never how to create a larger infrastructure. There is no focus on creating multi-process engines with pcntl_fork() or proc_open(), which are critical for scaling web scraping applications. A single script scraping a few hundred websites on a single thread would take ages compared with a multi-threaded engine.

If you are looking to break into web scraping and are not sure where to start, this is likely the best (and possibly only) book on the market. If you are intermediate or advanced, you will quickly question the author's logic and see that scaling will become the number one issue you have to overcome.


24 of 26 people found the following review helpful:
3.0 out of 5 stars Solid introduction to webbots, with a catch., April 27, 2007
By Paul M. Reinheimer "Author" (Montréal, Quebec, Canada) (REAL NAME)
I picked up this book full of enthusiasm. Spiders are just plain cool: they go out and start downloading data for you, reading webpages, and even understanding them a little. My enthusiasm was dashed a little, however, on page four: "You may use any of the scripts in this book for your own personal use, as long as you agree not to redistribute them... and agree not to sell or create derivative products under any circumstances." I develop in PHP professionally, and a lot of the code I write ends up being used somewhere on some sort of for-profit basis, which pretty effectively prevents me from using any code between the covers (at its strictest reading, I'm not sure I can even change the code).

The book does a great job of introducing different sorts of web agents that you can create programmatically (more than just spiders) and introduces all sorts of interesting projects along those lines. Throughout the book, a series of libraries written by the author are leveraged to make the retrieval and parsing of the various pages much easier. While newer developers will enjoy being able to concentrate on the big picture, I found myself itching for more information on the nitty-gritty.

Some of the projects explored include: price monitoring, image capturing (want to be your own Google image search? :) ), link verification, spiders, and snipers. Each project received its own chapter, which effectively covered the topics involved.

Overall, I would recommend this book to beginner to intermediate PHP developers looking to tackle the world of web agents. It's a good primer on the related topics, and at the very least it will give you some ideas of the complexities involved. As their skill grows, they will probably find themselves either moving past the libraries included with the book or modifying them greatly. My biggest complaint is the lack of coverage of the robots.txt file. Some talk is given to it in terms of blocking robots from your own site, but I didn't see any code that actually dealt with parsing it for your own robot.


9 of 9 people found the following review helpful:
5.0 out of 5 stars Great Book with Lots of Information, August 25, 2007
This book covers every aspect I could ever hope a book on web bots would cover. It goes into great detail and provides lots of background information about things such as why you should use web bots, security issues, how to authenticate a bot with password-protected sites, writing search engine crawlers, parsing HTML, how to handle cookies, HTTP headers, dealing with forms and a lot more.

I was very pleased with how this book covered concepts. The book uses PHP and the cURL library as a teaching tool instead of trying to give a lesson in how to use PHP as a crawler language. The way the code is explained makes it very easy to translate into whatever language you are most comfortable coding in. The book uses fundamental functional programming concepts which make it easy to pick up the general idea without actually knowing PHP.

My boss bought this book to help my group with a project we were working on, and even my co-workers who had no background with PHP were able to use this book to write a web bot in C# (using the cURL library) very easily. The concepts from this book easily transferred over to object-oriented concepts.


18 of 21 people found the following review helpful:
5.0 out of 5 stars WOW, WOW, WOW! I'll say it again...WOW!, April 13, 2007
Amazon Verified Purchase
I waited months for this book to come out and the wait was worth it. This is a great introduction to webbots, spiders and scrapers. The writing is easy and never boring. Lots of code examples and resources to tap into. I couldn't put it down. When was the last time you got a computer book that made you run to the keyboard to try something out?

I'm sure there will be some a#$h@#e who will say it's too rudimentary. It's an intro, and it takes you up to intermediate and explains stuff about PHP that I didn't even know existed. Definitely worth the money. I can't wait for the sequel.


2 of 2 people found the following review helpful:
5.0 out of 5 stars barry naice!, January 14, 2008
Amazon Verified Purchase
This book is simply awesome. You will need to come armed with at least a basic knowledge of PHP, but everything is pretty straightforward. The projects are well explained and applicable to a wide range of projects that you might be getting yourself into.


1 of 1 people found the following review helpful:
4.0 out of 5 stars Great book!, August 31, 2010
Amazon Verified Purchase
If you want to 'automate' your browsing then this is a great book, with examples for every conceivable application. My only grumble is that, for me at least, it needs a chapter giving the step-by-step installation process for PHP/CURL so as to get up and running quickly.


1 of 1 people found the following review helpful:
5.0 out of 5 stars Best for this subject, February 12, 2010
The power of this book is not so much in its code examples but rather in its ability to change your perspective. We are all aware that the Internet is a client-server topology, but what does that really mean? Reading the first few chapters gave me a whole new viewpoint of the Internet and what I could do with it, or to it. In the year since I first read it, I have stopped developing websites and now code web agents exclusively. It's amazing the number of uses they fulfill.

The code in the book is basic and not fit for production (the author tells you this), but it is invaluable for teaching the theory and fundamentals of CURL. If you use the code and the provided website to practice with, you will soon be able to develop your own code library. Scale is also left to you to figure out. The obvious first step is a database and a NAS. Start small and use this book for the invaluable reference it is.

I really have to rate this book as one of my most influential reads of the last few years.


1 of 1 people found the following review helpful:
5.0 out of 5 stars Great introduction, October 8, 2009
Amazon Verified Purchase
This is a great introduction on the subject. The supplied PHP library does all the work.


1 of 1 people found the following review helpful:
5.0 out of 5 stars This book is useful, January 25, 2009
This book is not very algorithmic, but you can learn the basics of writing webbots and some of the techniques involved. cURL is good for starters, but it is the ideas rather than the code that help us understand the concepts. What you need to do is not copy the code, but study what it does and why things are implemented the way they are.

Good book. 5/5


1 of 1 people found the following review helpful:
5.0 out of 5 stars Great Basic Book, December 2, 2008
By Joe Todd (Ridgefield, WA United States) (REAL NAME)
Need to learn how to browse the web with your own software instead of manually browsing? This is the best book on the subject. Written for people new to writing webbots, the example code is straightforward. A basic understanding of PHP is sufficient for understanding the examples.

Michael Schrenk takes you directly to the point of the book with fully explained examples. They are specific-use scripts, which makes them easy to learn from. With an understanding of the basics, you can combine and extend the sample projects to build larger multi-purpose webbots on your own. The example scripts can be tested against the author's website to ensure consistent results.

Most of the material naturally deals with browser emulation. In addition, there are chapters on POP3 mail server interfaces, FTP webbots, and NNTP newsgroup interfaces.

This is a great basic book that will take you from curiosity to a working knowledge of webbot authoring in a short time period.

