Programming Spiders, Bots, and Aggregators in Java [Paperback]

Jeff Heaton (Author)
3.7 out of 5 stars (11 customer reviews)




Book Description

ISBN-10: 0782140408 | ISBN-13: 978-0782140408 | February 2002
The content and services available on the web continue to be accessed mostly through direct human control. But this is changing. Increasingly, users rely on automated agents that save them time and effort by programmatically retrieving content, performing complex interactions, and aggregating data from diverse sources. Programming Spiders, Bots, and Aggregators in Java teaches you how to build and deploy a wide variety of these agents, from single-purpose bots to exploratory spiders to aggregators that present a unified view of information from multiple user accounts.

You will build on your basic knowledge of Java to quickly master the techniques that are essential to this specialized world of programming, including parsing HTML, interpreting data, working with cookies, reading and writing XML, and managing high-volume workloads. You'll also learn about the ethical issues associated with bot use and the limitations imposed by some websites.

This book offers two levels of instruction, both of which are focused on the library of routines provided on the companion CD. If your main concern is adding ready-made functionality to an application, you'll achieve your goals quickly thanks to step-by-step instructions and sample programs that illustrate effective implementations. If you're interested in the technologies underlying these routines, you'll find in-depth explanations of how they work and the techniques required for customization.


Editorial Reviews


About the Author

Jeff Heaton is an author, college instructor, programmer, and Internet entrepreneur. He has worked with many languages, including C++, Java, and Visual Basic. He coauthored SAMS' Teach Yourself Visual C++ 6.0 Professional Reference Edition and has written for Java Developer's Journal, Windows Developer's Journal, and C++ Users Journal. He teaches Java programming at St. Louis Community College and has served as a consultant programmer for Anheuser-Busch, MasterCard, and Boeing, among others.

Product Details

  • Paperback: 512 pages
  • Publisher: Sybex (February 2002)
  • Language: English
  • ISBN-10: 0782140408
  • ISBN-13: 978-0782140408
  • Product Dimensions: 8.6 x 7.9 x 1.2 inches
  • Shipping Weight: 2.1 pounds
  • Average Customer Review: 3.7 out of 5 stars (11 customer reviews)
  • Amazon Best Sellers Rank: #1,058,704 in Books

More About the Author

Jeff Heaton is an author, consultant, artificial intelligence (AI) researcher, and former college instructor. Heaton has penned more than a dozen books on topics including AI, virtual worlds, spiders, and bots. Heaton leads the Encog project, an open source initiative to provide an advanced neural network and bot framework for Java and C#. A Sun Certified Java Programmer and a Senior Member of the IEEE, he holds a master's degree in Information Management from Washington University in St. Louis. Heaton lives in St. Louis, Missouri.

 

Customer Reviews

11 Reviews
5 star: (5)
4 star: (2)
3 star: (0)
2 star: (4)
1 star: (0)

Most Helpful Customer Reviews

25 of 26 people found the following review helpful:
2.0 out of 5 stars not for serious programmers, June 19, 2003
By A Customer
The code presented in this book is painful to look at. For one thing, the author is not familiar with basic Java coding conventions and continues to use C conventions instead.

In addition to not knowing proper coding conventions, this guy has no clue about writing Java UIs - the code listed in this book actually has Visual Cafe tags all over the place!

As far as info regarding spiders/bots/aggregators - there is decent high-level overview info in this book, but nothing for a real programmer. You will not learn how to build these things on your own, and the book relies on the helper libraries included on the CD-ROM to accomplish anything. If you are hoping to build anything useful after purchasing this book, understand that you will only succeed if you include the com.heaton.* libraries from the CD.



12 of 13 people found the following review helpful:
4.0 out of 5 stars Lots of working code but not much of a tutorial, July 16, 2006
Bots are the simplest form of Internet-aware programs in that they simply carry out a repetitive task once unleashed on the web. A spider travels the web in a complex fashion, moving from one part of the World Wide Web to another collecting information from one site and then jumping to another based on that information. An aggregator is a bot that is designed to log into several user accounts and retrieve similar information.

If you need a complete bot, spider, or aggregator written in Java, complete with source code and a detailed manual about that source code so that you can customize it to suit your needs, this is a five star book. However, if you are looking for a book about information storage and retrieval and network programming that focuses on the theory of operation of such software with application code written in Java, you will be sorely disappointed.

The author did such a fine job of documenting his work, with excellent diagrams, comments, and a book that reads like a user's manual, that I easily took his Web spider code and modified it to perform many additional tasks that his basic package does not do. All of the hooks are available in his code for you to modify it to collect or examine just about any kind of data accessible via the web.

I highly recommend this book if you are taking an information storage and retrieval class and you would like to read and study something applied on spiders, bots, and aggregators versus the theory you get in most textbooks. Just understand you are getting code plus a user's manual, not a tutorial. You are definitely going to need other resources on Java network programming if you want to study, understand, or modify the included source code. I suggest the latest edition of "Java Network Programming" by Elliotte Rusty Harold for help with the network programming part of bots, spiders, and aggregators. I also suggest you look at "Spidering Hacks", which has many good ideas for features you can add to your web spider.


12 of 13 people found the following review helpful:
2.0 out of 5 stars Misleading Title, December 22, 2003
By A Customer
As another reviewer commented, this book should be called "Using the com.heaton.bot Package: API Reference." All you learn is how to use this package of Java classes, not how to actually create spiders, bots, or aggregators from the ground up. I feel the title is misleading for such an expensive book. The only way I will learn what I want is to read the author's source code, which, by the way, is very ugly, however functional.




Inside This Book
First Sentence:
The Internet is built of many related protocols, and more complex protocols are layered on top of system level protocols.
Key Phrases - Statistically Improbable Phrases (SIPs):
prime recognizer, bot programmer, url parameter specifies, user name null, hot programmer, qif file, parse cookies, callback class, bot classes, htpasswd file, http object, workload store, same host address, exclusion file, pos parameter, cookie processing, spider object, parsing classes, spider manager, synchronized public void, programming spiders, callback object, int pos, maximum body size, static public void
Key Phrases - Capitalized Phrases (CAPs):
Jeff Heaton, Internet Explorer, United States, Under the Hood, Microsoft Access, Text Document, Examining the Hypertext Transfer Protocol, Programming Spiders, Fresno Street, Java Socket Programming, New York, Simple Mail Transfer Protocol, Hello World, Interpreting Data, Keep-Alive Cookie, Kimmswick Information, Los Angeles, New Orleans, Posting Forms, Babel Fish, Compile Examples Under Windows, Computer Store, Computer Stuff, Cookie Name, Customize Links



Books on Related Topics
 
Enterprise Java 2, J2EE 1.3 Complete by Greg Jarboe, Hollis Thomases, Mari Smith, Chris Treadaway, Dave Evans
Java 2 by Greg Jarboe, Hollis Thomases, Mari Smith, Chris Treadaway, Dave Evans
 
