Programming Collective Intelligence: Building Smart Web 2.0 Applications 1st Edition
Toby Segaran (Author)
- Collaborative filtering techniques that enable online retailers to recommend products or media
- Methods of clustering to detect groups of similar items in a large dataset
- Search engine features--crawlers, indexers, query engines, and the PageRank algorithm
- Optimization algorithms that search millions of possible solutions to a problem and choose the best one
- Bayesian filtering, used in spam filters for classifying documents based on word types and other features
- Using decision trees not only to make predictions, but to model the way decisions are made
- Predicting numerical values rather than classifications to build price models
- Support vector machines to match people in online dating sites
- Non-negative matrix factorization to find the independent features in a dataset
- Evolving intelligence for problem solving--how a computer develops its skill by improving its own code the more it plays a game
"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details."
-- Dan Russell, Google
"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths."
-- Tim Wolters, CTO, Collective Intellect
Product details
- Publisher : O'Reilly Media; 1st edition (September 11, 2007)
- Language : English
- Paperback : 362 pages
- ISBN-10 : 0596529325
- ISBN-13 : 978-0596529321
- Item Weight : 1.27 pounds
- Dimensions : 7 x 0.9 x 9.19 inches
- Best Sellers Rank: #464,012 in Books
- #127 in Computer Algorithms
- #247 in Artificial Intelligence (Books)
- #306 in Programming Algorithms
About the author

Toby Segaran is the author of "Programming Collective Intelligence," one of Amazon's top-selling AI books of all time. His latest titles, "Programming the Semantic Web" and "Beautiful Data," were released in July. He speaks on the subjects of machine learning, collective intelligence, and freedom of data at conferences worldwide.
He currently holds the title of Data Magnate at Metaweb Technologies, where he works on large-scale data reconciliation problems. He is also a cofounder of freerisk.org, a non-profit aimed at creating more financial transparency.
Prior to Metaweb he founded Incellico, a biotechnology software company acquired in 2003. He holds a B.Sc. in Computer Science from MIT, and the US Government deems him a "Person of Exceptional Ability."
Customer reviews
Top reviews from the United States
The biggest problem with the book is the incorrect lines of code that pop up every once in a while. This is annoying, and at least one of the errors was incredibly subtle (importing pylab overwrote some of the random functions from the library random). The errata online have some helpful answers, but are also full of things that are not.
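The pylab issue described here is a case of name shadowing: a star import can silently rebind a bare name such as `randint` to a function that behaves differently. The sketch below simulates that effect with the standard library only. `exclusive_randint` is a hypothetical stand-in, not any real library's function, and the review does not say exactly which names were overwritten.

```python
import random
from random import randint  # stdlib version: both endpoints are inclusive

stdlib_randint = randint  # keep a handle on the original for comparison


def exclusive_randint(low, high):
    """Hypothetical stand-in: returns an integer in [low, high)."""
    return random.randrange(low, high)


# Simulate what a later `from some_library import *` can do:
# the bare name `randint` is silently rebound to different behavior.
randint = exclusive_randint


def observed_values(func, low, high, trials=2000):
    """Collect the distinct values func(low, high) produces over many calls."""
    return {func(low, high) for _ in range(trials)}


random.seed(0)
before = observed_values(stdlib_randint, 0, 1)     # upper bound can appear
after = observed_values(randint, 0, 1)             # upper bound never appears
qualified = observed_values(random.randint, 0, 1)  # module-qualified call is unaffected
```

Calling functions through the module name (`random.randint`) rather than a bare imported name sidesteps this kind of collision.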
Given the book's age, it is also written for Python 2. Python 3 is much preferred now, and pretty much all of the libraries used in the book are available for Python 3 (though they sometimes change names). This requires minor translation efforts. The major problem is that many of the web APIs the book relies on no longer exist. This means some of the exercises are not possible, or you will have to use some other API. Pandas can make up for some of these deficiencies.
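As a rough illustration of the kind of translation involved, the sketch below shows a few well-known Python 2 to Python 3 changes. The `prefs` dictionary and its values are invented for this example and are not taken from the book.

```python
# Hypothetical ratings data, invented for this example.
prefs = {"alice": 3, "bob": 4, "carol": 4}

# Python 2: print "total", total   ->   Python 3: print is a function.
total = sum(prefs.values())
print("total", total)

# Python 2: 11 / 3 gave 3 for integers.
# Python 3: / is true division; use // for floor division.
true_mean = total / len(prefs)
floor_mean = total // len(prefs)

# Python 2: dict.items() returned a list.
# Python 3: it returns a view; sorted() accepts it directly,
# but indexing requires wrapping it in list() first.
ranked = sorted(prefs.items(), key=lambda item: (-item[1], item[0]))
top_name = ranked[0][0]
first_item = list(prefs.items())[0]

# Python 2: import urllib2   ->   Python 3: the functionality lives in urllib.request.
import urllib.request  # imported only to show the renamed module; no request is made
```

Most such changes are mechanical, which fits the description of the effort as minor.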
Overall, I found the book enlightening and am glad it forced me to actually write out the code and get it working for myself. Experience is a great teacher, and if you go through what's given in the book, you'll solidly understand the basics as well as be able to use many algorithms (genetic algorithms, naive Bayesian classifiers, recommendation methods, optimization, some database ideas) that should be useful in commonly faced problems. I did not go through many of the exercises at the end of the chapters, but most of them are straightforward applications or extensions of the ideas in the book, and seem like they would provide a decent challenge.
If you'd like to understand the basics of the above ideas, and have code to apply them to data, then this is a fairly good choice. I would recommend looking around to see if people recommend a newer edition or a book with similar ideas, however, since this book was written over a decade ago. On the other hand, the explanations and figures are helpful and do not over-complicate things.
This book is for those who realise that programming, no matter what the language, can do amazing things once you understand some simple concepts for telling a story through data. It gets you out of the mindset of "I have some data stored here, and I will present it here" and into "I have some data stored here; how do I show, create understanding, explore, wedge out, predict, recommend it here?"
Most of the topics presented in this book are not new in any sense; however, they are not old either. They're tried and proven methods for creating meaning from datasets. They will be used for decades to come because they work! There are other books on the topics presented (like I said, they are not new), but the simplicity of Python provides a frictionless entry for anyone wanting to get up and running without a bloated IDE or framework to make it happen.
To those who are thinking, "well it's Python, and Python can't do X", I say: a language does not determine what can and cannot be done; the developer does. At the end of the day the capability of the developer determines what the language can and can't do. If it seriously can't do something, then build an extension to the language! With this thinking you can port what is presented in this book to any language. Python was chosen for its simple constructs and readability.
If you're ever going to buy a book on this topic, buy this one. Not the Kindle version, but the hard copy. I've found that the Kindle version doesn't present the code sections well.
Overall this book is a great reference and is also a great primer if you want to go deeper. It will allow you to tackle your next project with a different mindset and allow your users to discover and learn new things about their online surroundings and themselves!
Top reviews from other countries
Although the book is Python-based, I found the algorithms clear enough to translate into Erlang without much issue, and if you have any exposure to functional programming this should not be difficult.
I'm math-phobic, but this book really generated my interest with its clarity and its applicability to everyday problems.
I took this over the competing CI book as that was heavily Java-oriented - why do some people mangle Java to fit any possible task?
The elegance of Python applied to AI problems really shone through here!
Great Book and not too long either - but certainly not short on knowledge!









