Programming Spiders, Bots, and Aggregators in Java (Paperback)

by Jeff Heaton (Author)
"The Internet is built of many related protocols, and more complex protocols are layered on top of system level protocols..."
Key Phrases: prime recognizer, bot programmer, url parameter specifies, Jeff Heaton, Internet Explorer, United States
3.7 out of 5 stars (11 customer reviews)


Available from these sellers.


6 new from $51.23; 17 used from $14.28

Customers Who Bought This Item Also Bought

Webbots, Spiders, and Screen Scrapers: A Guide to Developing Internet Agents with PHP/CURL
by Michael Schrenk

HTTP Programming Recipes for Java Bots
by Jeff Heaton
4.0 out of 5 stars (1)  $31.49

Spidering Hacks
by Kevin Hemenway
4.4 out of 5 stars (14)  $16.47

Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications)
by Bing Liu
4.3 out of 5 stars (3)  $42.69

Web Content Mining with Java
by Tony Loton
5.0 out of 5 stars (3)  $64.00

Editorial Reviews

Product Description

Spiders, bots, and aggregators are all so-called intelligent agents, which execute tasks on the Web without the intervention of a human being. Spiders go out on the Web, identify multiple sites with information on a chosen topic, and retrieve the information. Bots find information within one site by cataloging and retrieving it. Aggregators gather data from multiple sites, such as credit card, bank account, and investment account data, and consolidate it on one page. This book offers a complete toolkit for the Java programmer who wants to build bots, spiders, and aggregators. It teaches the basic low-level HTTP/network programming Java programmers need to get going and then dives into how to create useful intelligent agent applications. It is aimed not just at Java programmers but at JSP programmers as well. The CD-ROM includes all the source code for the author's intelligent agent platform, which readers can use to build their own spiders, bots, and aggregators.
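
To give a feel for the "low-level HTTP" layer mentioned above, here is a minimal sketch of what a bot deals with at the wire level: composing a GET request and reading the status line and headers of a response. It is not taken from the book or its CD-ROM; the class name RawHttp, its methods, and the demo-bot/0.1 user agent are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the book's library.
class RawHttp {

    // Compose the text of an HTTP/1.1 GET request; lines end with CRLF
    // and a blank line terminates the header section.
    static String buildGet(String host, String path, String userAgent) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "User-Agent: " + userAgent + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Extract the numeric status code from a status line such as "HTTP/1.1 200 OK".
    static int statusCode(String responseHead) {
        String statusLine = responseHead.split("\r\n", 2)[0];
        return Integer.parseInt(statusLine.split(" ", 3)[1]);
    }

    // Parse "Name: value" header lines into a map keyed by lower-cased name.
    // Simplification: a repeated header (e.g. several Set-Cookie lines) keeps only the last value.
    static Map<String, String> headers(String responseHead) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] lines = responseHead.split("\r\n");
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                break; // blank line: end of headers
            }
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                result.put(name, lines[i].substring(colon + 1).trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.print(buildGet("example.com", "/index.html", "demo-bot/0.1"));
        String head = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n";
        System.out.println(statusCode(head) + " " + headers(head));
    }
}
```

In practice the request text would be written to a socket and the response read back from it; the sketch leaves the network out so that it can be run and inspected offline.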


From the Back Cover

The content and services available on the web continue to be accessed mostly through direct human control. But this is changing. Increasingly, users rely on automated agents that save them time and effort by programmatically retrieving content, performing complex interactions, and aggregating data from diverse sources. Programming Spiders, Bots, and Aggregators in Java teaches you how to build and deploy a wide variety of these agents, from single-purpose bots to exploratory spiders to aggregators that present a unified view of information from multiple user accounts.

You will build on your basic knowledge of Java to quickly master the techniques that are essential to this specialized world of programming, including parsing HTML, interpreting data, working with cookies, reading and writing XML, and managing high-volume workloads. You'll also learn about the ethical issues associated with bot use and the limitations imposed by some websites.
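
As an illustration of the "parsing HTML" technique listed above, the following is a minimal sketch that pulls the link targets out of a page and resolves them against the page's URL. It is an assumption-laden example, not the book's code: the class LinkExtractor is hypothetical, and it uses a simple regular expression where a production bot would want a real HTML parser.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the book's library.
class LinkExtractor {

    // Matches href="..." or href='...' inside an <a> tag. Deliberately simple:
    // it ignores unquoted attribute values, comments, and scripts.
    private static final Pattern HREF = Pattern.compile(
            "<a\\s[^>]*?href\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Return the absolute URLs of all links found in html, in document order.
    static List<String> extractLinks(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(2).trim();
            // Skip targets a spider cannot follow.
            if (href.isEmpty() || href.startsWith("#")
                    || href.startsWith("javascript:") || href.startsWith("mailto:")) {
                continue;
            }
            try {
                links.add(base.resolve(href).toString());
            } catch (IllegalArgumentException e) {
                // Malformed link text: ignore it rather than abort the page.
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about\">About</a> <a href='page2.html'>Next</a>";
        System.out.println(extractLinks("http://example.com/docs/index.html", html));
    }
}
```

Resolving each link against the page URL matters because most links on real pages are relative; a spider that queued them unresolved would have nothing it could fetch.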

This book offers two levels of instruction, both of which are focused on the library of routines provided on the companion CD. If your main concern is adding ready-made functionality to an application, you'll achieve your goals quickly thanks to step-by-step instructions and sample programs that illustrate effective implementations. If you're interested in the technologies underlying these routines, you'll find in-depth explanations of how they work and the techniques required for customization.


Product Details

  • Paperback: 512 pages
  • Publisher: Sybex (February 2002)
  • Language: English
  • ISBN-10: 0782140408
  • ISBN-13: 978-0782140408
  • Product Dimensions: 8.6 x 7.9 x 1.2 inches
  • Shipping Weight: 2.1 pounds
  • Average Customer Review: 3.7 out of 5 stars (11 customer reviews)
  • Amazon.com Sales Rank: #978,764 in Books

More About the Author

Jeff Heaton

Customer Reviews

11 Reviews
5 star: (5)
4 star: (2)
3 star: (0)
2 star: (4)
1 star: (0)

Average Customer Review: 3.7 out of 5 stars (11 customer reviews)

Most Helpful Customer Reviews

 
20 of 21 people found the following review helpful:
2.0 out of 5 stars not for serious programmers, June 19, 2003
By A Customer
The code presented in this book is painful to look at. For one thing, the author is not familiar with basic Java coding conventions and continues to use C conventions instead.

In addition to not knowing proper coding conventions, this guy has no clue about writing Java UIs - the code listed in this book actually has Visual Cafe tags all over the place!

As far as info regarding spiders/bots/aggregators - there is decent high level overview info in this book, but nothing for a real programmer. You will not learn how to build these things on your own, and the book relies on the helper libraries included on the cd-rom to accomplish anything. If you are hoping to build anything useful after purchasing this book, understand that you will only succeed if you include the com.heaton.* libraries included on the cd.

9 of 9 people found the following review helpful:
4.0 out of 5 stars Lots of working code but not much of a tutorial, July 16, 2006
By calvinnme "Texan refugee" (Fredericksburg, Va)
(TOP 10 REVIEWER)
Bots are the simplest form of Internet-aware programs in that they simply carry out a repetitive task once unleashed on the web. A spider travels the web in a complex fashion, moving from one part of the World Wide Web to another collecting information from one site and then jumping to another based on that information. An aggregator is a bot that is designed to log into several user accounts and retrieve similar information.

If you need a complete bot, spider, or aggregator written in Java, complete with source code and a detailed manual about that source code so that you can customize it to suit your needs, this is a five star book. However, if you are looking for a book about information storage and retrieval and network programming that focuses on the theory of operation of such software with application code written in Java, you will be sorely disappointed.

The author did such a fine job of documenting his work, with excellent diagrams, comments, and a book that reads like a user's manual, that I easily took his Web spider code and modified it to perform many additional tasks that his basic package does not do. All of the hooks are available in his code for you to modify it to collect or examine just about any kind of data accessible via the web.

I highly recommend this book if you are taking an information storage and retrieval class and you would like to read and study something applied on spiders, bots, and aggregators versus the theory you get in most textbooks. Just understand you are getting code plus a user's manual, not a tutorial. You are definitely going to need other resources on Java network programming if you want to study, understand, or modify the included source code. I suggest the latest edition of "Java Network Programming" by Elliotte Rusty Harold for help with the network programming part of bots, spiders, and aggregators. I also suggest you look at "Spidering Hacks", which has many good ideas for features you can add to your web spider.
8 of 8 people found the following review helpful:
5.0 out of 5 stars Create an Object-Oriented Bot Package Step by Step, April 25, 2004
By A Customer
I use this book as a supplement to a class that I teach, as it gives the students the necessary skills to programmatically spider, and generally access, information on the Net.

As some of the other reviewers point out, this book does center around the creation of a "bot package". However, I see this as one of the book's greatest strengths. The author explains step by step how to take basic concepts, continually build upon them, progressing onward to more complex spiders and bots. Specifically:

1. Create an advanced HTTP object that overcomes many of the shortcomings of the one built into Java (namely cookie support, referrer support, HTTP authentication, and more).
2. Add forms/page processing on top of the HTTP object. You are shown step by step how to process the data you collect from step 1.
3. Create a bot that wields the page/form processing created in step 2.
4. Create a spider that, using steps 1-3, can access pages across an entire site.
5. Expand the spider to support thread pooling and a JDBC database.
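
The spider in step 4 can be sketched roughly as follows. This is not the book's com.heaton.bot code: the class SiteSpider, its constructor, and the idea of passing in a fetcher function (URL in, HTML out, null on failure) are assumptions made so the example can run without a network. It is single-threaded and keeps its state in memory, so it leaves out the thread pooling and JDBC storage of step 5.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical breadth-first spider that stays on the starting host.
class SiteSpider {
    // Simplified link pattern: double-quoted href values only.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    private final Function<String, String> fetcher; // URL -> HTML, or null on failure
    private final int maxPages;                     // safety limit on pages attempted

    SiteSpider(Function<String, String> fetcher, int maxPages) {
        this.fetcher = fetcher;
        this.maxPages = maxPages;
    }

    // Returns every URL the spider attempted, in the order it visited them.
    Set<String> crawl(String startUrl) {
        String host = URI.create(startUrl).getHost();
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startUrl);
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.poll();
            if (!visited.add(url)) {
                continue; // already handled
            }
            String html = fetcher.apply(url);
            if (html == null) {
                continue; // fetch failed; nothing to parse
            }
            for (String link : links(url, html)) {
                boolean sameHost = host != null
                        && host.equalsIgnoreCase(URI.create(link).getHost());
                if (sameHost && !visited.contains(link)) {
                    queue.add(link);
                }
            }
        }
        return visited;
    }

    // Extract links, drop #fragments, and resolve them against the page URL.
    private static List<String> links(String pageUrl, String html) {
        List<String> found = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String href = m.group(1).trim();
            int hash = href.indexOf('#');
            if (hash >= 0) {
                href = href.substring(0, hash);
            }
            if (href.isEmpty()) {
                continue;
            }
            try {
                found.add(URI.create(pageUrl).resolve(href).toString());
            } catch (IllegalArgumentException e) {
                // Skip links that are not valid URIs.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // A fake two-page site stands in for real HTTP fetching.
        Map<String, String> pages = Map.of(
                "http://site.test/", "<a href=\"/a\">a</a>",
                "http://site.test/a", "<a href=\"/\">home</a>");
        System.out.println(new SiteSpider(pages::get, 10).crawl("http://site.test/"));
    }
}
```

Passing the fetcher in as a function keeps the crawl logic separate from the HTTP layer of steps 1-3, which is also what makes it easy to swap in a pooled, multi-threaded fetcher later.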

Rather than providing a bunch of disjoint code samples, as many books do, the author guides you step by step through the above path, revealing the techniques at every step. Readers who do not care about the intricate nature of bot programming (sadly, some of my students) can skip to the API documentation and get right to creating their own bots. You can also download updated versions of the "bot package" from the author's site. I actually did this before buying the book.

The downside of the book is the example programs' use of GUIs. I would rather every example had been straight console; for a book targeting bot programming, the GUI only gets in the way. The author also, very annoyingly, puts an underscore in front of every class instance variable, which gives some of the code something of a C++ look, I suppose.

If you are already programming bots and spiders of your own, I don't think you will get much more from this book than you are likely already doing.

But for someone who wants to get started in this exciting area, there is nothing else like it, and I highly recommend it.

Most Recent Customer Reviews

2.0 out of 5 stars Not much information for such a long book
The essence of this book could probably have been compressed into a few chapters. I read the whole thing in about a day, skimming over many sections (e.g. ...
Published on June 24, 2004 by Edward J Garrett

2.0 out of 5 stars Misleading Title
As another reviewer commented, this book should be called "Using the com.heaton.bot Package API Reference".
Published on December 22, 2003

5.0 out of 5 stars happy
Visual Cafe produces the Swing so one can view the examples from the book. So what?

When beginning to program with HTTP protocols, it's easy to enter incorrect methods and...
Published on November 6, 2003

2.0 out of 5 stars very limited usefulness
This book is primarily a user's guide for the libraries provided on the cd-rom. If you are looking for the information necessary to write spiders, bots and aggregators by hand,...
Published on June 25, 2003

5.0 out of 5 stars A great example of how to present highly technical material
I've read MANY technical books over my 23 year career in IT and would say that this book is one of my top 3 favorites.
Published on February 3, 2003 by James Kilthau

5.0 out of 5 stars Incredible Book!
This book is simply one of the best computer books I have ever read! The author doesn't just cover programming like a lot of programming books do, he also explains how technology...
Published on May 16, 2002

4.0 out of 5 stars Good work - quite comprehensive!
This is quite a complete book for anyone wishing to write spiders in any language (if he knows Java well enough to understand the examples).
Published on March 20, 2002 by sankalp

5.0 out of 5 stars Great Book!
This just came out March 2002. I'm very impressed. It's a complete intro to Bots in Java. The CD has a bot.jar package that will really come in handy.
Published on March 7, 2002
