Mining the Web: Discovering Knowledge from Hypertext Data [Hardcover]

Soumen Chakrabarti (Author)
4.8 out of 5 stars (9 customer reviews)

List Price: $85.95
Price: $51.20 (ships for FREE with Super Saver Shipping)
You Save: $34.75 (40%)
In Stock.
Ships from and sold by Amazon.com. Gift-wrap available.
Only 9 left in stock--order soon (more on the way).

Formats

Format          Amazon Price  Rent from
Kindle Edition  $45.62        $18.64
Hardcover       $51.20
Sell Back Your Copy for $3.55
Whether you buy it used on Amazon for $19.87 or somewhere else, you can sell it back through our Book Trade-In Program at the current price of $3.55.
Used Price: $19.87
Trade-in Price: $3.55
Price after Trade-in: $16.32

Book Description

ISBN-10: 1558607544 | ISBN-13: 978-1558607545 | Publication Date: October 23, 2002 | Edition: 1
Mining the Web: Discovering Knowledge from Hypertext Data is the first book devoted entirely to techniques for producing knowledge from the vast body of unstructured Web data. Building on an initial survey of infrastructural issues, including Web crawling and indexing, Chakrabarti examines low-level machine learning techniques as they relate specifically to the challenges of Web mining. He then devotes the final part of the book to applications that unite infrastructure and analysis to bring machine learning to bear on systematically acquired and stored data. Here the focus is on results: the strengths and weaknesses of these applications, along with their potential as foundations for further progress. From Chakrabarti's work, which is painstaking, critical, and forward-looking, readers will gain the theoretical and practical understanding they need to contribute to the Web mining effort.

* A comprehensive, critical exploration of statistics-based attempts to make sense of Web Mining.
* Details the special challenges associated with analyzing unstructured and semi-structured data.
* Looks at how classical Information Retrieval techniques have been modified for use with Web data.
* Focuses on today's dominant learning methods: clustering and classification, hyperlink analysis, and supervised and semi-supervised learning.
* Analyzes current applications for resource discovery and social network analysis.
* An excellent way to introduce students to especially vital applications of data mining and machine learning technology.


Frequently Bought Together

Customers buy this book with Spidering Hacks ($16.35)

  • This item: Mining the Web: Discovering Knowledge from Hypertext Data
  • Spidering Hacks


Editorial Reviews

Review

"...solid and beneficial to readers interested in Web data mining, especially those interested in the details of algorithmic implementation." - Bernard J. Jansen, Information Processing & Management

"The treatment is systematic, comprehensive and in-depth, yet very lucid and accessible to a wide range of Web technology developers. The author's insights and depth of knowledge as one of the pioneering researchers on hypertext information mining and retrieval are also evident in the extensive and useful bibliographic notes provided at the end of each chapter..." - Professor Joydeep Ghosh, University of Texas, Austin

"The author has done the community a great service by synthesizing all the important work in this field into an excellent book, which introduces fairly sophisticated material in an easy-to-read manner. This book, for the first time, makes it possible to offer Web Mining as a real course." - Professor Jaideep Srivastava, University of Minnesota

"Mining the Web: Discovering Knowledge from Hypertext Data, by Soumen Chakrabarti, focuses extensively on building a better search engine crawler...Chakrabarti's book begins with a discussion of search engine crawlers in a chapter titled 'Crawling the Web.' The discussion in this chapter is technical and detailed. Readers learn about features such as the robots.txt file that can be written in a certain way to stop crawlers from visiting a page...The most interesting part of the book is perhaps Chapter 7, 'Social Network Analysis.' In this chapter, the author presents the most famous search engine algorithms (e.g., PageRank, HITS, SALSA)." - Sandeep Krishnamurthy, Journal of Marketing Research

"All in all, this is an excellent book. I enjoyed the book and highly recommend it as a textbook for web data mining classes at graduate or senior undergraduate levels. Chakrabarti has a rich vocabulary and is a gifted writer. I bet he will write new, good books in the future, and he should. I look forward to them." - Fazli Can, Miami University

Book Description

The definitive book on mining the Web from the preeminent authority.

Product Details

  • Hardcover: 344 pages
  • Publisher: Morgan Kaufmann; 1 edition (October 23, 2002)
  • Language: English
  • ISBN-10: 1558607544
  • ISBN-13: 978-1558607545
  • Product Dimensions: 9.5 x 7.5 x 1 inches
  • Shipping Weight: 1.4 pounds
  • Average Customer Review: 4.8 out of 5 stars (9 customer reviews)
  • Amazon Best Sellers Rank: #259,291 in Books


Customer Reviews

9 Reviews
5 star: (7)
4 star: (2)
3 star: (0)
2 star: (0)
1 star: (0)
 
 
 
 
 
Most Helpful Customer Reviews

46 of 47 people found the following review helpful:
5.0 out of 5 stars Excellent, comprehensive, readable book on mining the Web, August 28, 2003
By Dave P (New York, NY United States)
Executive summary: This is a fabulous book, written with care and
precision, easy to read yet covering in detail a wide variety of
the most beautiful and promising developments in data mining and
machine learning as it relates to the World Wide Web, including a
prescient vision of where the field is headed in the future.

More detail: There are science authors who are clear experts in
their field, yet have trouble communicating their knowledge. Then
there are science authors who write with clarity, but achieve it
by dumbing down technical details to cater to a broad readership.
Finally, there are authors who are experts and leaders in their
field, who are actively contributing to the forefront of research,
who are excellent writers, and who can communicate complex
concepts to a diverse audience with acumen, without glossing over
important details. Soumen Chakrabarti is one such author. "Mining
the Web" is a stunning achievement. It is an excellent summary of
the past decade or so of research in the area, covering nearly all
of the important bases, including the machinery of Web crawling,
Web information retrieval (i.e., search engines), clustering,
automated classification, semi-supervised approaches, social
network analysis, and focused crawling. Though Chakrabarti himself
has contributed prominently to the field, this book is not at all
the vehicle for self-promotion that other specialist texts
sometimes feel like. The book should be valuable to newcomers,
students, and experts alike, and could certainly serve as an
excellent course textbook. High-level concepts can be grasped with
little mathematical background, yet more technically sophisticated
readers will not be disappointed: most topics do include rigorous
coverage. The text is well organized, well written, and well
conceived. Its design, including generous and illuminating
figures and illustrations, possesses an artist's touch, perhaps
not surprising given that Chakrabarti designs his own font
libraries in his (apparently scant) spare time. It's hard to
imagine where Chakrabarti found the time to write such a
comprehensive and thoughtful book, but I'm not asking any
questions: I'm thrilled with the outcome. The book is a must-have
reference for anyone working in -- or aspiring to work in -- the
crossroads of Web algorithmics, data mining, and machine learning.

David M. Pennock
Senior Research Scientist, Overture Services, Inc.


11 of 12 people found the following review helpful:
5.0 out of 5 stars A wonderful textbook for machine learning over the web, September 8, 2004
This book is one of the best computer science textbooks I have ever seen. Apart from the wealth of information and discussion on Web crawling and data mining specifically (chapters 2, 3, 7, 8), chapters 4, 5 and 6 constitute a wonderful summary of machine learning in general.

The book's discussion of unsupervised learning (the EM algorithm, advanced algorithms in which the number of clusters is not known in advance), supervised learning (Bayesian networks, entropian methods, SVMs), semisupervised learning, co-training and rule induction is extraordinary in that it is short and intuitive, does not sacrifice mathematical rigor, and is accompanied by examples (all taken from information retrieval over the Web).


7 of 9 people found the following review helpful:
4.0 out of 5 stars Great coverage, but quite a few errors, June 3, 2005
The book is an absolute must for those working in information retrieval, and in particular web information retrieval and web mining. These areas are quite hot (again) in both academia and industry. I personally enjoyed the fact that there is no discussion of semantic web research directions (Jena, OWL etc.), but others might not... The material is quite tightly brought together and very comprehensibly written. However, especially in chapters 4 and 5, there are many pages containing mathematical errors (either in the formulas or in the algorithms described). For this reason, I rate an otherwise excellent textbook with 4 stars.

Inside This Book
First Sentence:
The World Wide Web is the largest and most widely known repository of hypertext.
Key Phrases - Statistically Improbable Phrases (SIPs):
vicinity graph, bipartite cores, mixed hubs, commercial crawlers, maximum entropy classifiers, clique attack, interdocument similarity, hub scores, hypertext classification, focused crawler, supervised learner, unlabeled documents, topic distillation, topic directories, topic taxonomies, focused crawling, random surfer, expanded graph, hypertext data, text classifier, topic directory, broad queries, authority scores, inverted index, power iterations
Key Phrases - Capitalized Phrases (CAPs):
Open Directory, Bibliographic Notes, World Wide Web, Link Parser, Discriminative Classification, Stanford University, Discovering Communities, Search Engine Watch, Vannevar Bush