Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems) (Paperback)

by Ian H. Witten (Author), Eibe Frank (Author)
"Human in vitro fertilization involves collecting several eggs from a woman's ovaries, which, after fertilization with partner or donor sperm, produce several embryos..."
Key Phrases: informational loss function, category utility formula, generic object editor, Naïve Bayes, Name Function, Peter Peggy
4.0 out of 5 stars (30 customer reviews)

List Price: $65.95
Price: $44.51 (ships for free with Super Saver Shipping)
You Save: $21.44 (33%)
39 new from $39.94; 32 used from $35.85
Also available in: Kindle Edition (Kindle Book), $40.06
There is a newer edition of this item:
Data Mining, Third Edition: Practical Machine Learning Tools and Techniques (The Morgan Kaufmann Series in Data Management Systems)

Frequently Bought Together

Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems) + The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics) + Pattern Recognition and Machine Learning (Information Science and Statistics)

Customers Who Bought This Item Also Bought

Data Mining: Concepts and Techniques, Second Edition (The Morgan Kaufmann Series in Data Management Systems)
by Micheline Kamber and Jiawei Han
3.7 out of 5 stars (29)  $55.16

Machine Learning (Mcgraw-Hill International Edit)
by Thomas Mitchell
4.3 out of 5 stars (38)  $74.16

Introduction to Machine Learning (Adaptive Computation and Machine Learning)
by Ethem Alpaydin
4.4 out of 5 stars (8)  $43.20

Pattern Recognition and Machine Learning (Information Science and Statistics)
by Christopher M. Bishop
4.0 out of 5 stars (42)  $62.89

Data Preparation for Data Mining (The Morgan Kaufmann Series in Data Management Systems)
by Dorian Pyle
4.7 out of 5 stars (11)  $53.17

Editorial Reviews

Review
"This book presents this new discipline in a very accessible form: both as a text to train the next generation of practitioners and researchers, and to inform lifelong learners like myself. Witten and Frank have a passion for simple and elegant solutions. They approach each topic with this mindset, grounding all concepts in concrete examples, and urging the reader to consider the simple techniques first, and then progress to the more sophisticated ones if the simple ones prove inadequate. If you have data that you want to analyze and understand, this book and the associated Weka toolkit are an excellent way to start."
- From the foreword by Jim Gray, Microsoft Research

"It covers cutting-edge, data mining technology that forward-looking organizations use to successfully tackle problems that are complex, highly dimensional, chaotic, non-stationary (changing over time), or plagued by. The writing style is well-rounded and engaging without subjectivity, hyperbole, or ambiguity. I consider this book a classic already!"
- Dr. Tilmann Bruckhaus, StickyMinds.com

Book Description
Highly anticipated second edition of the highly-acclaimed reference on data mining and machine learning.


Product Details

  • Paperback: 560 pages
  • Publisher: Morgan Kaufmann; 2nd edition (June 10, 2005)
  • Language: English
  • ISBN-10: 0120884070
  • ISBN-13: 978-0120884070
  • Product Dimensions: 9.1 x 7.5 x 1.2 inches
  • Shipping Weight: 2.6 pounds
  • Average Customer Review: 4.0 out of 5 stars (30 customer reviews)
  • Amazon.com Sales Rank: #16,214 in Books

    Popular in these categories:

    #3 in  Books > Computers & Internet > Computer Science > Artificial Intelligence > Machine Learning
    #17 in  Books > Computers & Internet > Databases > Data Mining


Customer Reviews

30 Reviews
5 star: (15)
4 star: (6)
3 star: (5)
2 star: (2)
1 star: (2)

Average Customer Review
4.0 out of 5 stars (30 customer reviews)

Most Helpful Customer Reviews

 
26 of 27 people found the following review helpful:
5.0 out of 5 stars Lucid, March 21, 2006
By Developer (Brooklyn, NY United States)
I'm surprisingly pleased with this book. I've been reading up on the topic and associated algorithms in other books for some time; I'm a software developer but don't have a statistics background, and so felt a lot of the texts were too focused on the math and the theory while being thin on content when it came to "rubber hitting the road", or even using clear, simple examples and straightforward notation.

This book is so well-written that it communicates the concepts clearly, lucidly and in an organized fashion. The section that introduces Bayesian probability was drop-dead simple to follow. Quite frankly, having read a few other treatments on it, I can now say that everything else I read before this was overly complicated. Brevity is the soul of wit, no?

To the reviewer who criticized the authors' use of words to describe equations: This is what the authors intended to do. Would you fault them for writing in English if you wanted Greek? Not everyone who can benefit from applied data mining has the requisite background to understand the nitty-gritty mathematics, nor should they have to, if they just want to understand the behavior and practical applications of the technology.



 
30 of 33 people found the following review helpful:
4.0 out of 5 stars Very helpful, April 26, 2006
By Dr. Lee D. Carlson (Baltimore, Maryland USA)
The major virtue of this book is the emphasis on practical applications and bread-and-butter techniques for accomplishing tasks that one could expect in a business environment. That is not to say that these techniques could not be used in a scientific research environment. They indeed could be, and in fact may be even easier to implement due to the long time scales that are available in research environments for processing information. In the business world, however, data mining has proven to be an activity that gives a substantial competitive edge, and so many businesses are seeking even more sophisticated methods of data mining and Web mining. Data mining could easily be considered a branch of artificial intelligence (AI), due to its emphasis on learning patterns and performing classification, and the learning and classification tools it uses were discovered by individuals who would describe themselves as researchers in artificial intelligence. But many, and it is fair to include the authors of this book, do not want to view data mining as part of artificial intelligence, since the latter stirs up discussions on the origin of intelligence, autonomous robots, and conscious machines, to paraphrase a line from chapter 8 of this book. The authors make it a point to emphasize that data mining, or "machine learning," is concerned with the algorithms for the inference of structure from data and the validation of that structure.

Along with its practical emphasis, the book includes discussions of some very interesting developments that are not usually included in books or monographs on data mining. One of these concerns the current research in `programming by demonstration.' This research is targeted towards the "ordinary" computer user who does not possess any programming knowledge yet wants to automate predictable tasks. The only thing required from the user is knowledge of how to do the task in the usual way. As an example, the authors discuss briefly the `Familiar' system, which extracts information from user applications to make predictions and then generates explanations for the user about its predictions. Even more interesting is that it learns the tasks that are specialized for each individual user. It learns from the unique style of each user and their interaction history. One of the most interesting and powerful claims of programming by demonstration is that it is domain-independent, which is notable considering the current intense interest in reasoning patterns or algorithms that can process information arising from multiple domains. In this regard a successful system would be able to learn how to play chess from a user along with perhaps composing music. Again, the ability of a machine to reason in many domains is a step towards what many in the artificial intelligence community have called a `universal' learning machine. But the authors do not hold to this view, and in fact they open the discussion in the chapter on the Weka workbench with a statement to the effect that there is no single learning algorithm that will work with all data mining problems. The "universal learner," they say, is an "idealistic fantasy."

Another interesting discussion included in the book is that of `co-training', which is a methodology that arises in the context of `semi-supervised learning.' In this learning scheme the input contains both unlabeled and labeled data. Co-training relies on the fact that the classification task can be viewed from two different and independent perspectives. Assuming there are a few labeled examples, a different model is learned for each perspective, and the models are then used separately to label the unlabeled examples. Each model contributes both negative and positive examples to the pool of labeled examples. The procedure is then repeated until the unlabeled pool is empty. This allows both models to be trained on the new pool of labeled examples. The authors point out some evidence indicating that if a (naive) Bayesian learner is used throughout this procedure, then it outperforms a learner that develops a single model from the labeled data. The intuition behind this is that using the independence of the two perspectives allows one to reduce the likelihood of an incorrect labeling. References are given for readers who want to investigate this approach in more detail, along with briefer discussions of its generalizations, such as co-EM, which involves probabilistic labeling of unlabeled data in one perspective, and of how to use support vector machines in place of the naive Bayesian learner.
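
The co-training loop described in this review can be sketched roughly as follows. This is a toy illustration only, not code from the book or from Weka: the function names (`fit_centroids`, `predict`, `co_train`), the one-number-per-view data layout, and the nearest-centroid classifier (used here as a simple stand-in for the naive Bayesian learner) are all assumptions made for the example.

```python
# Toy co-training sketch. Every example has two views, stored as a tuple
# (view_a, view_b) of numbers. The nearest-centroid classifier is an assumed
# stand-in for the naive Bayes learner mentioned in the review.

def fit_centroids(values, labels):
    """Learn one view's model: the mean feature value of each class."""
    sums, counts = {}, {}
    for v, y in zip(values, labels):
        sums[y] = sums.get(y, 0.0) + v
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, v):
    """Return (label, confidence) for one feature value.

    Confidence is the distance margin between the runner-up centroid
    and the nearest centroid (larger means more confident).
    """
    ranked = sorted(centroids, key=lambda y: abs(v - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, 0.0
    margin = abs(v - centroids[ranked[1]]) - abs(v - centroids[best])
    return best, margin


def co_train(labeled, unlabeled, per_round=1):
    """labeled: list of ((a, b), label); unlabeled: list of (a, b).

    Returns (model_a, model_b, labeled_pool) once the unlabeled pool is empty.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        # Train one model per view on the current labeled pool.
        models = [
            fit_centroids([x[view] for x, _ in labeled], [y for _, y in labeled])
            for view in (0, 1)
        ]
        newly = {}  # index into pool -> assigned label
        for view, model in enumerate(models):
            # Each model contributes its most confident examples of every class.
            for cls in model:
                scored = []
                for i, x in enumerate(pool):
                    if i in newly:
                        continue
                    label, conf = predict(model, x[view])
                    if label == cls:
                        scored.append((conf, i))
                scored.sort(reverse=True)
                for _, i in scored[:per_round]:
                    newly[i] = cls
        if not newly:
            break  # defensive guard; nothing could be labeled
        labeled.extend((pool[i], y) for i, y in newly.items())
        pool = [x for i, x in enumerate(pool) if i not in newly]
    # Final models trained on the enlarged labeled pool.
    model_a = fit_centroids([x[0] for x, _ in labeled], [y for _, y in labeled])
    model_b = fit_centroids([x[1] for x, _ in labeled], [y for _, y in labeled])
    return model_a, model_b, labeled


# Usage with made-up data: two labeled seeds and four unlabeled examples.
seeds = [((0.0, 1.0), "neg"), ((10.0, 9.0), "pos")]
unlabeled = [(1.0, 0.5), (9.0, 10.5), (2.0, 1.5), (8.0, 8.5)]
model_a, model_b, pool = co_train(seeds, unlabeled)
```

In this sketch each view's model adds only its single most confident example per class in each round (`per_round=1`), which is one simple way to have both models contribute positive and negative examples; other selection rules are equally plausible.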

For the practitioner, the most useful discussion in the book concerns the evaluation of the different methods for data mining. What makes one approach to data mining better than another, and is there then a ranking of the different approaches? Can one in fact make judgments on the reliability or performance of data mining algorithms using solely the training or test data? If one had a general methodology for ranking data mining algorithms according to their performance, this would be a major advance, since it would allow a classification scheme for machine learning where one could speak of one machine being `more intelligent' than another. Unfortunately, however, this is difficult, and even said to be impossible according to some researchers. There are results in the research literature, going by the name of `no free lunch' theorems, which seem to indicate that one cannot distinguish machine learning algorithms based solely on the way they deal with training or test data. The authors do not discuss these results in this book, but it is certainly apparent that they are aware of the difficult issues involved in the prediction of performance for data mining algorithms.



 
19 of 20 people found the following review helpful:
4.0 out of 5 stars Very readable book on Data Mining and ML, October 9, 2005
By K. Greene (Albuquerque, NM)
This book is very easy to read and understand. Unlike Hastie's Statistical Learning book, it is not geared towards those with an expert-level knowledge of statistics, and instead takes time to explain functions and formulas for the person with a decent but not extraordinary understanding of statistical/math concepts. For example, their description of a Gaussian was the clearest I've seen. On the other hand, if your math/statistics background is considerable, you may find this book somewhat simplistic or tedious.

The book has good coverage of techniques and algorithms, although I was somewhat disappointed that they do not mention Influence Diagrams, considering the amount of coverage of both decision trees and Bayesian techniques. Their discussion of Combining Multiple Models, however, is well done, and is not covered to this extent in most books I've seen. I also like how they broke out the discussion of input and output (knowledge representation) into their own chapters.

Addendum 10/30: After reading a good hunk of this book I still agree with most of what I said earlier, but I do think the authors could have gone into graphical models a lot more. At the end of the discussion on Bayesian networks, Markov networks and other graphical models are mentioned very briefly, and the authors say they are very big in ML right now, but they don't say why they didn't describe them further. It might have something to do with the organization of the book. Graphical models almost need a chapter of their own, but the book's chapters discuss all techniques in one chapter with varying levels of detail.


Most Recent Customer Reviews

4.0 out of 5 stars A data mining book for open-source tools
Data Mining, Second Edition is a good book with good support for the related open-source Weka system. You can test and practice data mining techniques at no cost.
Good work
Published 2 months ago by Senni Alberto

5.0 out of 5 stars Useful addition to your shelf
This title is a most useful addition to your data mining collection, especially if you plan to do some practical experimentation with the Open Source WEKA software it describes...
Published 6 months ago by Stephen Lowe

5.0 out of 5 stars A book to understand the subject
It's the best book I know because it explains the topics in a very didactic fashion and the book refers to a computer program easily available (Weka, from the University of...
Published 6 months ago by Claudio Luis Sturla

3.0 out of 5 stars Not very user-friendly, too much emphasis on Weka language
This book was used as one of the two textbooks in a graduate school database course. It is hard to follow and places too much emphasis on the Weka data mining language (the...
Published 6 months ago by Ada

3.0 out of 5 stars very useful academically, but not industry focused
It is a very clear and easy-to-read 'machine learning' book, but it's not a 'data mining' book.
Published 8 months ago by Mr. T. Manns

2.0 out of 5 stars Not particularly useful
The material is very superficially laid out and for a book with the word "Practical" in the sub-title it contains almost no practical examples of data mining.
Published 12 months ago by Dan Zetu

5.0 out of 5 stars Thorough, well-written, and crystal-clear explanations.
Highly recommend this book for a practical introduction to the theory and applications of Machine Learning.
Published 13 months ago by Jay L

3.0 out of 5 stars A little too wordy for my tastes, but good
This book was pretty good. I have to admit that for the first hundred or so pages, I was feeling very impatient.
Published 13 months ago by Jason C. Maestri

1.0 out of 5 stars Superficial
This book reminds me of the programming books by Deitel & Deitel. It's wordy and superficial, making lots of people feel like they understand the subject.
Published 13 months ago by W. Ghost

5.0 out of 5 stars Awesome
I am very happy with amazon purchases as they always come quick, as described. I love the free supersavings shipping program.
Published 17 months ago by Guzal Davletiyarova
