6 of 7 people found the following review helpful:
4.0 out of 5 stars
subjective extraction of clusters, October 18, 2006
This review is from: Survey of Text Mining I: Clustering, Classification, and Retrieval (No. 1) (Hardcover)
The book is relatively brief, given the technical nature of its chapters, each written by different authors. Many clustering methods are described. Most involve some degree of subjectivity in deciding what ends up in a given cluster, or whether a cluster exists at all.
The analysis of Web documents forms a major portion of the book. This data set is vast, and it is continually changing and expanding. It is also noisy, unlike the many clean data sets that might be extracted from a corpus of books, for example. Attention should be paid to methods of automatically extracting information from the Web.
The book does not go much into the higher-level problems of defining ontologies, which are very hard tasks. The closest it seems to get is finding similar words in documents, which is still very useful.