Most Helpful Customer Reviews
82 of 85 people found the following review helpful:
5.0 out of 5 stars
A good textbook on the technical aspects of data mining, September 7, 2000
There are a number of books on data mining. The vast majority of them are non-technical in the sense that they talk a great deal about how data mining is a glorious area, without ever getting into the nitty gritty of how data mining algorithms actually work. There are also a couple of technical textbooks on data mining that are nothing more than mistitled books on machine learning (yes, I know, the ML arena does contribute a lot towards data mining). This is the first true textbook on data mining algorithms and techniques. It covers a vast array of topics and does ample justice to the vast majority of them. In fact, it even covers semi-automated (OLAP) technologies for data mining. The book consistently uses data from a single (fictitious) organization to illustrate most concepts. This gives a strong sense of cohesion to what can actually be very different techniques. One key aspect of the book is its question-and-answer format. The main arguments in favor of such a format are (1) it is a clean way to introduce a new topic or concept, and (2) students love it when things are laid out for them. On the other hand, such an approach seems inappropriate for a graduate level text. This book is certain to become "the standard" data mining textbook.
Update (Dec 25, 2004): My opinion about this book has changed over time. I've left the 5-star rating in place, although my current rating for the book is 4 (or even 3.5) stars. The main reason is that I had to supplement most of the chapters in the book with the original research papers to give my students a more complete picture of data mining (in other words, the material can be a bit shallow).
19 of 19 people found the following review helpful:
4.0 out of 5 stars
Good high-level review with little mathematics., December 8, 2006
This is a great textbook for an undergraduate or a layperson new to the information sciences, but specialists may find it lacking in depth. It is very good at identifying practices and principles that would guide a high-level planner toward a sound research program. That said, this book covers the breadth of the modern field exhaustively, at the expense of the formulas, algorithms, and source code that would have been valuable to an engineer or scientist who plans to implement the techniques.
* Buy this book if you require a high-level understanding of the concepts and techniques used in the field.
* Don't buy this book if you are planning to specialize in data mining, or if you plan to implement the techniques yourself.
24 of 26 people found the following review helpful:
5.0 out of 5 stars
Best introduction I know, November 14, 2004
It is very easy to collect huge volumes of data - social statistics, bank records, biological data, and more - but very hard to pull useful facts out of the heap. This book is about processing large volumes of data in ways that let simple descriptions emerge.
This is an introductory level book, aimed at someone with reasonably good programming skills. A little facility with statistics might help, but certainly isn't necessary. The book starts gently, with some very basic questions: what is data mining exactly, when there seem to be so many definitions for the term? What is a data warehouse, and how does it differ from a database? Next, the authors address the data itself in terms of quality, usability, and organization for efficient access. The central chapters, 4 through 8, address various kinds of query specification, kinds of relationships to extract, correlations, clustering, and classification. None of the discussions is especially deep. All, however, are presented in pseudocode or simple math that can easily be translated into working code. The careful reader learns a few basic principles that work well in many contexts: entropy maximization, Bayesian analysis, and simple stats. It may be surprising to see how little of normal statistical analysis is used. I suspect the authors assume that stats-savvy readers will already know how to apply significance testing, and that stats-naive readers don't need the distraction. The last chapters discuss complex data, where the best structure for the data and the questions to be asked of it are not at all obvious, and tools and applications used in data mining.
The book is nicely laid out as a textbook, with an orderly summary, problem set, and bibliography at the end of each chapter. The bibliography is more than just a list of names and authors - it actually helps the reader decide which references will give the best description of each of the chapter's topics.
This is a clear, usable introduction to data mining: the data it uses, the questions it answers, and the techniques for connecting them. It gives codable detail for lots of techniques, and prepares the reader for more advanced discussions. I recommend it very highly.
//wiredweird
Most Recent Customer Reviews