- Hardcover: 344 pages
- Publisher: Morgan Kaufmann; 1 edition (October 23, 2002)
- Language: English
- ISBN-10: 1558607544
- ISBN-13: 978-1558607545
- Product Dimensions: 7.5 x 1 x 9.5 inches
- Shipping Weight: 1.4 pounds
- Average Customer Review: 10 customer reviews
- Amazon Best Sellers Rank: #1,184,916 in Books
Mining the Web: Discovering Knowledge from Hypertext Data 1st Edition
"...solid and beneficial to readers interested in Web data mining, especially those interested in the details of algorithmic implementation." - Bernard J. Jansen, Information Processing & Management
"The treatment is systematic, comprehensive and in-depth, yet very lucid and accessible to a wide range of Web technology developers. The author's insights and depth of knowledge as one of the pioneering researchers on hypertext information mining and retrieval are also evident in the extensive and useful bibliographic notes provided at the end of each chapter..." - Professor Joydeep Ghosh, University of Texas, Austin
"The author has done the community a great service by synthesizing all the important work in this field into an excellent book, which introduces fairly sophisticated material in an easy-to-read manner. This book, for the first time, makes it possible to offer Web Mining as a real course." - Professor Jaideep Srivastava, University of Minnesota
"Mining the Web: Discovering Knowledge from Hypertext Data, by Soumen Chakrabarti, focuses extensively on building a better search engine crawler...Chakrabarti's book begins with a discussion of search engine crawlers in a chapter titled "Crawling the Web." The discussion in this chapter is technical and detailed. Readers learn about features such as the robots.txt file that can be written in a certain way to stop crawlers from visiting a page...The most interesting part of the book is perhaps Chapter 7, "Social Network Analysis." In this chapter, the author presents the most famous search engine algorithms (e.g., PageRank, HITS, SALSA)." - Sandeep Krishnamurthy, Journal of Marketing Research
"All in all this is an excellent book. I enjoyed the book and highly recommend it as a textbook for web data mining classes at graduate or senior undergraduate levels. Chakrabarti has a rich vocabulary and is a gifted writer. I bet he will write new, good books in the future, and he should. I look forward to them." - Fazli Can, Miami University
The definitive book on mining the Web from the preeminent authority.
Top customer reviews
precision, easy to read yet covering in detail a wide variety of
the most beautiful and promising developments in data mining and
machine learning as it relates to the World Wide Web, including a
prescient vision of where the field is headed in the future.
More detail: There are science authors who are clear experts in
their field, yet have trouble communicating their knowledge. Then
there are science authors who write with clarity, but achieve it
by dumbing down technical details to cater to a broad readership.
Finally, there are authors who are experts and leaders in their
field, who are actively contributing to the forefront of research,
who are excellent writers, and who can communicate complex
concepts to a diverse audience with acumen, without glossing over
important details. Soumen Chakrabarti is one such author. "Mining
the Web" is a stunning achievement. It is an excellent summary of
the past decade or so of research in the area, covering nearly all
of the important bases, including the machinery of Web crawling,
Web information retrieval (i.e., search engines), clustering,
automated classification, semi-supervised approaches, social
network analysis, and focused crawling. Though Chakrabarti himself
has contributed prominently to the field, this book is not at all
the vehicle for self-promotion that other specialist texts
sometimes feel like. The book should be valuable to newcomers,
students, and experts alike, and could certainly serve as an
excellent course textbook. High-level concepts can be grasped with
little mathematical background, yet more technically sophisticated
readers will not be disappointed: most topics do include rigorous
coverage. The text is well organized, well written, and well
conceived. Its design, including generous and illuminating
figures and illustrations, possesses an artist's touch, perhaps
not surprising given that Chakrabarti designs his own font
libraries in his (apparently scant) spare time. It's hard to
imagine where Chakrabarti found the time to write such a
comprehensive and thoughtful book, but I'm not asking any
questions: I'm thrilled with the outcome. The book is a must-have
reference for anyone working in -- or aspiring to work in -- the
crossroads of Web algorithmics, data mining, and machine learning.
David M. Pennock
Senior Research Scientist, Overture Services, Inc.
The book's discussion of unsupervised learning (the EM algorithm, advanced algorithms in which the number of clusters is not known in advance), supervised learning (Bayesian networks, entropy-based methods, SVMs), semi-supervised learning, co-training, and rule induction is extraordinary in that it is short and intuitive, does not sacrifice mathematical rigor, and is accompanied by examples (all taken from information retrieval over the Web).
The first part of the book deals with interesting practical and theoretical issues related to designing large-scale Web crawlers and search engines. Chapters 4 and 5 are a good introduction to various unsupervised and supervised learning methods. However, a proper understanding of advanced methods like LSI is possible only with an adequate foundation in linear algebra; the book gives only a flavor of the technique. Part III of the book is my personal favorite. It has a detailed description of various social network analysis methods, some of which have been applied by modern search engines like Google. Focused crawling, an area that the author has personally shaped, is also explained well. The book ends with a brief peek into the future of Web mining.
The comprehensive yet easy-to-read nature of the book makes it a valuable addition to my shelf. It is hard to find a comparable book in the area of Web mining.