15 of 16 people found the following review helpful:
5.0 out of 5 stars
Invaluable for serious users of See5 or C5.0, April 14, 2008
This review is from: C4.5: Programs for Machine Learning (Morgan Kaufmann Series in Machine Learning) (Paperback)
Despite its age, this classic is invaluable to any serious user of See5 (Windows) or C5.0 (UNIX). C4.5 (See5/C5) is a decision tree classifier system that is often used for machine learning, or as a data mining tool for discovering patterns in databases. The classifiers can take the form of either decision trees or rule sets. Like ID3, it employs a "divide and conquer" strategy and uses entropy (information content) to compute its gain ratio (the split criterion).
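To make the split criterion concrete, here is a minimal sketch of entropy and gain ratio for a single nominal attribute. It is an illustration written for this purpose, not the book's C code: the function names and the toy data are assumptions, and it leaves out continuous attributes and unknown values.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the nominal attribute `values`."""
    total = len(labels)
    # Group the class labels by attribute value (one subset per branch).
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the partition induced by the attribute.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Toy data, invented for illustration only.
outlook = ["sunny", "sunny", "overcast", "rain", "rain", "rain", "overcast"]
play = ["no", "no", "yes", "yes", "yes", "no", "yes"]
print(round(gain_ratio(outlook, play), 3))
```

Dividing the gain by the split information penalizes attributes that fragment the data into many small branches, which plain information gain tends to favor.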
C5.0 and See5 are built on C4.5, which is open source and free. However, since C5.0 and See5 are commercial products, the code and the internals of the See5/C5 algorithms are not public. This is why this book is still so valuable. The first half of the book explains how C4.5 works and describes features such as partitioning, pruning, and windowing in detail. The book also discusses how C4.5 should be used, and potential problems with overfitting and non-representative data. The second half of the book gives a complete listing of the source code: 8,800 lines of C code.
C5.0 is faster and more accurate than C4.5 and has features that C4.5 lacks, such as cross-validation, variable misclassification costs, and boosting. However, since even minor misuse of See5 could have cost our company tens of millions of dollars, it was important that we knew as much as possible about what we were doing, which is why this book was so valuable.
The reasons we did not use, for example, neural networks were:
(1) We had a lot of nominal data (in addition to numeric data)
(2) We had unknown attributes
(3) Our data sets were typically not very large and still we had a lot of attributes
(4) Unlike neural networks, decision trees and rule sets are human readable, possible to comprehend, and can be modified manually if necessary. Since we had problems with non-representative data but understood these problems as well as our system quite well, it was sometimes advantageous for us to modify the decision trees.
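Point (4) can be illustrated with a small sketch. The tree below, its attribute names, and its thresholds are all hypothetical, and the tuple layout is an assumption made for this example rather than the representation C4.5 or See5 uses. It shows a tree printed as readable rules and a threshold adjusted by hand.

```python
# Hypothetical tree: an internal node is (attribute, threshold, low_branch,
# high_branch); a leaf is just a class label string.
tree = ("temperature", 75.0, "play", ("humidity", 80.0, "play", "don't play"))

def to_rules(node, conditions=()):
    """Flatten the tree into (condition, label) pairs, one per leaf."""
    if isinstance(node, str):
        return [(" and ".join(conditions) or "always", node)]
    attr, threshold, low, high = node
    return (to_rules(low, conditions + (f"{attr} <= {threshold}",))
            + to_rules(high, conditions + (f"{attr} > {threshold}",)))

def set_threshold(node, attr, new_threshold):
    """Return a copy of the tree with every test on `attr` moved to a new threshold."""
    if isinstance(node, str):
        return node
    a, t, low, high = node
    return (a, new_threshold if a == attr else t,
            set_threshold(low, attr, new_threshold),
            set_threshold(high, attr, new_threshold))

# Manual adjustment: move the humidity cut-off, then print the rules.
for cond, label in to_rules(set_threshold(tree, "humidity", 85.0)):
    print(f"if {cond} then {label}")
```

Because every path from root to leaf reads as a plain if-then rule, a domain expert can review the rules and correct a single test without retraining.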
If you are in a similar situation I recommend See5/C5 as well as this book.
12 of 13 people found the following review helpful:
5.0 out of 5 stars
The most clear work on Decision Trees available !, May 3, 1999
By A Customer
If you want an introduction to decision tree algorithms, you must read this book. Ross Quinlan is the father of C4.5, the most widely used tree algorithm. Most other algorithms (except for CHAID, which is older) are enhancements to C4.5. If you are from marketing, this is not a book for you. Why didn't you include a disk instead of so many pages of source code, Ross?
2 of 2 people found the following review helpful:
4.0 out of 5 stars
Classical book - a bit pricy, February 27, 2006
This is a classic machine learning book. The presentation of the material is very lucid. Dr. Quinlan is a great writer. However, I would say that the book is a bit pricy. More than half of the book is C4.5 code. I personally would have liked more of the theory. An updated edition covering the C5.0 algorithm would also be very welcome. I am not sure whether Dr. Quinlan has a book on C5.0 or whether the enhancements over C4.5 are completely proprietary.
Overall, it is a good book to learn about the C4.5 algorithm.