Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations (The Morgan Kaufmann Series in Data Management Systems)
 
 

Ian H. Witten (Author), Eibe Frank (Author)
4.1 out of 5 stars (17 customer reviews)




Book Description

ISBN-10: 1558605525 | ISBN-13: 978-1558605527 | Publication Date: October 25, 1999 | Edition: 1

This book offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining, including both tried-and-true techniques of the past and Java-based methods at the leading edge of contemporary research. If you're involved at any level in the work of extracting usable knowledge from large collections of data, this clearly written and effectively illustrated book will prove an invaluable resource.


Complementing the authors' instruction is a fully functional platform-independent Java software system for machine learning, available for download. Apply it to the sample data sets provided to refine your data mining skills, apply it to your own data to discern meaningful patterns and generate valuable insights, adapt it for your specialized data mining applications, or use it to develop your own machine learning schemes.



* Helps you select appropriate approaches to particular problems and to compare and evaluate the results of different techniques.
* Covers performance improvement techniques, including input preprocessing and combining output from different methods.
* Comes with downloadable machine learning software: use it to master the techniques covered inside, apply it to your own projects, and/or customize it to meet special needs.


Editorial Reviews

Amazon.com Review

Data mining techniques are used to power intelligent software, both on and off the Internet. Data Mining: Practical Machine Learning Tools explains the magic behind information extraction in a book that succeeds at bringing the latest in computer science research to any IS manager or developer. In addition, this book provides an opportunity for the authors to showcase their powerful reusable Java class library for building custom data mining software.

This text is remarkable for its comprehensive review of recent research on machine learning, all told in a very approachable style. (While there is plenty of math in some sections, the authors' explanations are always clear.) The book tours the nature of machine learning and how it can be used to find predictive patterns in data comprehensible to managers and developers alike. The authors use sample data (for such topics as weather, contact lens prescriptions, and flowers) to illustrate key concepts.

After setting out to explain the types of machine learning models (like decision trees and classification rules), the book surveys algorithms used to implement them, plus strategies for improving performance and the reliability of results. Later the book turns to the authors' downloadable Weka (rhymes with "Mecca") Java class library, which lets you experiment with data mining hands-on and gets you started with this technology in custom applications. Final sections look at the bright prospects for data mining and machine learning on the Internet (for example, in Web search engines).

Precise but never pedantic, this admirably clear title delivers a real-world perspective on the advantages of data mining and machine learning. Besides serving as a programming how-to, it can be read profitably by any manager or developer who wants to see what leading-edge machine learning techniques can do for their software. --Richard Dragan

Topics covered: Data mining and machine learning basics, sample datasets and applications for data mining, machine learning vs. statistics, the ethics of data mining, generalization, concepts, attributes, missing values, decision tables and trees, classification rules, association rules, exceptions, numeric prediction, clustering, algorithms and implementations in Java, inferring rules, statistical modeling, covering algorithms, linear models, support vector machines, instance-based learning, credibility, cross-validation, probability, costs (lift charts and ROC curves), selecting attributes, data cleansing, combining multiple models (bagging, boosting, and stacking), Weka (reusable Java classes for machine learning), customizing Weka, visualizing machine learning, working with massive datasets, text mining, and e-mail and the Internet.

Review

"This is a milestone in the synthesis of data mining, data analysis, information theory, and machine learning."
—Jim Gray, Microsoft Research

Product Details

  • Paperback: 371 pages
  • Publisher: Morgan Kaufmann; 1 edition (October 25, 1999)
  • Language: English
  • ISBN-10: 1558605525
  • ISBN-13: 978-1558605527
  • Product Dimensions: 9.1 x 7.4 x 0.8 inches
  • Shipping Weight: 1.5 pounds
  • Average Customer Review: 4.1 out of 5 stars (17 customer reviews)
  • Amazon Best Sellers Rank: #681,026 in Books


 

Customer Reviews

17 Reviews
5 star: (11)
4 star: (2)
3 star: (1)
2 star: (1)
1 star: (2)
 
 
 
 
 
Most Helpful Customer Reviews

35 of 36 people found the following review helpful:
5.0 out of 5 stars Excellent introduction to data mining algorithms, February 7, 2000
By Dean (San Diego, CA United States)
Witten and Frank have generated a book that is readable without eliminating all technical (yes, even mathematical!) descriptions of the key data mining algorithms. And they are up-to-date, including support vector machines and boosting. There are sufficient examples of the techniques to provide readers with a good feel for what each technique can accomplish. For example, how many books can provide a readable explanation of support vector machines?

There are some quibbles, such as not including any discussion of neural networks (noted in Ch. 1 with another reference)--I believe it deserves some attention because of its widespread use. Additionally, future editions should include at least a brief summary of data preprocessing, input selection, feature creation, etc. But these are quibbles.

The Java portion of the book is not of as much interest to me, but for those wishing to implement the algorithms, it provides a nice blueprint (from the code I looked at).

For what they have undertaken, they have performed admirably, and I would highly recommend this book.



26 of 27 people found the following review helpful:
5.0 out of 5 stars You HAVE to read this book!, January 28, 2000
By Bostjan Brumen (Tampere University of Technology at Pori, Finland & University of Maribor, Slovenia)
This book is THE best book I have read about data mining. And I have read most of them (see ISBNs: 0070057796, 0471253847, 0262560976, 0201403803, 0471179809, 013743980, 0137564120, 1558605290, 1558604030). It is fresh, clear, well balanced. If your native language is not English, then you should definitely read THIS book first.

The feature that is most important for me is "just enough statistics". That is, you can understand the processes and descriptions even if you have not wasted your life and youth studying statistics; what you need of it is explained briefly and very well. Many other books are too deep or too shallow (like Berry's, which is a good introduction, but nothing more than that).

If the rating was scaled 1-6 stars, I'd give this book a 10.



26 of 27 people found the following review helpful:
5.0 out of 5 stars Excellent data mining textbook, December 3, 1999
By Stan Matwin (Ottawa, Canada)
Broad coverage, including hot new topics: SVM, boosting and bagging, modern evaluation methods (ROC and lift curves). Well grounded in practical data mining applications, talks about DM issues outside model building, which are rarely discussed: feature engineering, data cleaning, etc. Clear and well written: illustrative examples help the presentation a lot. Describes in detail decision trees and rule learners, instance-based learning, and numerical prediction. Accompanied by the WEKA system, implementing in Java many of the methods discussed in the book, and available for download for free. An excellent hands-on textbook for an applied Machine Learning/DM class, or recommended reading for anyone who wants to understand DM. Good next step for those who have whetted their appetite with Berry and Linoff's book.
 
 
 



Inside This Book
First Sentence:
"What is meant by 'structural patterns'?"
Key Phrases - Statistically Improbable Phrases (SIPs):
informational loss function, basic covering algorithm, category utility formula, synthetic binary attributes, contact lens data, gender parentl, subtree raising, hypermetrope yes, myope yes, machine learning schemes, root relative squared error, numeric prediction, false yes rainy, practical data mining, subtree replacement, combining multiple models, discretized attribute, classifiers package, outlook attribute, maximum margin hyperplane, holdout set, tree inducer, decision tree learner, nominal attributes, numeric attributes
Key Phrases - Capitalized Phrases (CAPs):
Naive Bayes, Peter Peggy, Pam Ian, Discrete Estimator, Grace Ray, New Zealand, Ross Quinlan, Incorrectly Classified Instances

Citations
This book cites 22 books.
100 books cite this book.



