Algorithms of the Intelligent Web [Paperback]

Haralambos Marmanis, Dmitry Babenko
4.4 out of 5 stars (14 customer reviews)

Book Description

Publication Date: July 5, 2009 | ISBN-10: 1933988665 | ISBN-13: 978-1933988665 | Edition: 1

Web 2.0 applications provide a rich user experience, but the parts you can't see are just as important and impressive. They use powerful techniques to process information intelligently and offer features based on patterns and relationships in data. Algorithms of the Intelligent Web shows readers how to use the same techniques employed by household names like Google AdSense, Netflix, and Amazon to transform raw data into actionable information.

Algorithms of the Intelligent Web is an example-driven blueprint for creating applications that collect, analyze, and act on the massive quantities of data users leave in their wake as they use the web. Readers learn to build Netflix-style recommendation engines and to apply the same techniques to social-networking sites. They also see how click-trace analysis can result in smarter ad rotations. All the examples are designed both to be reused and to illustrate a general technique (an algorithm) that applies to a broad range of scenarios.

As they work through the book's many examples, readers learn about recommendation systems, search and ranking, automatic grouping of similar objects, classification of objects, forecasting models, and autonomous agents. They also become familiar with a large number of open-source libraries and SDKs, and freely available APIs from the hottest sites on the internet, such as Facebook, Google, eBay, and Yahoo.

Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.


Frequently Bought Together

Algorithms of the Intelligent Web + Programming Collective Intelligence: Building Smart Web 2.0 Applications
Price for both: $51.70



Editorial Reviews

About the Author

Dr. Haralambos (Babis) Marmanis is a pioneer in the adoption of machine learning techniques for industrial solutions, and also a world expert in supply management. He has about twenty years of experience in developing professional software. Currently, he is the director of R&D and chief architect, for expense management solutions, at Emptoris, Inc. Babis holds a Ph.D. in applied mathematics from Brown University, an M.S. degree in theoretical and applied mechanics from the University of Illinois at Urbana-Champaign, and B.S. and M.S. degrees in civil engineering from the Aristotle University of Thessaloniki in Greece. He was the recipient of the Sigma Xi award for innovative research in 2000, and he is the author of numerous publications in peer-reviewed international scientific journals, conferences, and technical periodicals.


Dmitry Babenko is the lead for the data warehouse infrastructure at Emptoris, Inc. He is a software engineer and architect with 13 years of experience in the IT industry. He has designed and built a wide variety of applications and infrastructure frameworks for banking, insurance, supply-chain management, and business intelligence companies. He received an M.S. degree in computer science from Belarussian State University of Informatics and Radioelectronics.

Product Details

  • Paperback: 368 pages
  • Publisher: Manning Publications; 1 edition (July 5, 2009)
  • Language: English
  • ISBN-10: 1933988665
  • ISBN-13: 978-1933988665
  • Product Dimensions: 7.4 x 0.8 x 9.2 inches
  • Shipping Weight: 1.4 pounds
  • Average Customer Review: 4.4 out of 5 stars (14 customer reviews)
  • Amazon Best Sellers Rank: #110,042 in Books

More About the Author

I live and work in Massachusetts. I like to write books on topics of practical importance and bring a unique perspective to each one of them. I particularly like dissecting areas where knowledge has grown organically, or writing about subjects that are considered hard to comprehend but shouldn't be.

I have had an extensive career in the industry, where I worked in many fields as a software professional. My academic background is quite diverse and includes a Ph.D. in Applied Mathematics from Brown University, an M.Sc. in Theoretical & Applied Mechanics from the University of Illinois at Urbana-Champaign, and a Diploma in Civil Engineering from Aristotle University of Thessaloniki. You can find my academic works by searching on Google Scholar with my last name.

You can find all my books here, on Amazon, and you can always reach me at the following address: h AT marmanis DOT com

Customer Reviews

The writing style is superb! Michael Mimo  |  5 reviewers made a similar statement
Running BeanShell References Appendix B. Web crawling Section B.1. calvinnme  |  1 reviewer made a similar statement
Most Helpful Customer Reviews
86 of 90 people found the following review helpful
Format:Paperback
I have always had an interest in AI, machine learning, and data mining but I found the introductory books too mathematical and focused mostly on solving academic problems rather than real-world industrial problems. So, I was curious to see what this book was about.

I read the book front to back (twice!) before writing this review. I started reading the electronic version a couple of months ago and read the print edition again over the weekend. This is the best practical book on machine learning that you can buy today -- period. All the examples are written in Java and all algorithms are explained in plain English. The writing style is superb! The book was written by one author (Marmanis) while the other (Babenko) contributed the source code, so there are no gaps in the narrative; it is engaging, pleasant, and fluent. The author leads the reader from the very introductory concepts to some fairly advanced topics. Some of the topics are covered in the book and some are left as an exercise at the end of each chapter (there is a "To Do" section, which was a wonderful idea!). I did not like some of the figures (they were probably made by the authors, not an artist) but this was only a minor aesthetic inconvenience.

The book covers four cornerstones of machine learning and intelligence, i.e. intelligent search, recommendations, clustering, and classification. It also covers a subject that today you can find only in the academic literature, i.e. combination techniques. Combination techniques are very powerful and although the author presents the techniques in the context of classifiers, it is clear that the same can be done for recommendations -- as the BellKor team did for the Netflix Prize.

I work in a financial company and a number of people that I work with have PhD degrees in mathematics and computer science. I found the book so fascinating that I asked them to have a look. They had nothing but praise for this book. The consensus is that everything is explained in the simplest possible way, with clarity but without sacrificing accuracy. As one of them told me, this is a major step forward in teaching AI techniques and introducing the field to millions of developers around the world. Even for experts in the field and experienced software engineers, there are important insights in almost every chapter.

We had tried to write a software library, for a small project, that analyzes log files and assesses IT risk (e.g. probability of intrusion, preemptive alerts on application performance issues, and so on) based on Segaran's book "Programming Collective Intelligence". We spent about six weeks trying to match what was in Segaran's book to what we wanted to do, but we did not find the depth and clarity that was required. On top of that, Segaran used Python, so the code had to be rewritten and things didn't quite work as expected! We are now using the code from Marmanis' book, and our code analyzes Apache and WebLogic log files in order to assess risk! It just works! We wrote the code in one week! We would not have been able to succeed without reading this book.

Clearly, I am deeply impressed. This is an outstanding book; it was not just useful, it was inspiring! It is a "must have" book for every Java developer.

The content of the book includes:
* the PageRank algorithm; a content-based algorithm similar to PageRank, for which the author coined the term "DocRank" because it applies to Word, PDF, and other documents rather than Web pages; search improvements based on probabilistic methods (Naive Bayes); precision, recall, F1-score, and ROC curves;
* collaborative filtering as well as content based recommendations;
* k-means, ROCK, DBSCAN for clustering; the best explanation about the "curse of dimensionality" ever! I finally learned what this mystic term means!
* Bayesian classification; declarative programming (through the Drools rules engine); introduction to neural networks; decision trees
* Comparing and combining classifiers: McNemar's test; Cochran's Q test; F-test; Bagging; Boosting; general classifier ensembles
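
For readers unfamiliar with the search-evaluation metrics named in the first item of the list above, here is a minimal Python sketch of precision, recall, and F1-score. It is not taken from the book (whose code is in Java); the function name `precision_recall_f1` and the sample document IDs are hypothetical and chosen only for illustration.

```python
def precision_recall_f1(relevant, retrieved):
    """Return (precision, recall, F1) for one query.

    relevant  -- iterable of document IDs that truly answer the query
    retrieved -- iterable of document IDs the search engine returned
    """
    relevant = set(relevant)
    retrieved = set(retrieved)
    true_positives = len(relevant & retrieved)

    # Precision: fraction of retrieved documents that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 4 relevant documents, 3 retrieved, 2 of them relevant.
p, r, f = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "x"})
```

In this made-up case precision is 2/3 and recall is 1/2; the F1-score sits between the two and is pulled toward the lower value.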

Buy it, read it, enjoy it, and use it!
40 of 43 people found the following review helpful
Format:Paperback
This is a book for the working professional who already knows Java and wants not only to implement intelligent algorithms but also to understand the theory behind them. All of the code is in Java, so if you don't know this language this book will be over your head. It would also help if you have some background in algorithms along the lines of the material covered in Introduction to Algorithms.

The author is attempting to teach both the algorithms behind the information retrieval that is done on the web and at the same time show those algorithms implemented in Java in such a way that it is clear to the reader what has been done. This approach can be a tricky middle ground often resulting in books that are confusing from both a textbook and from a cookbook standpoint. Fortunately, the author has done a good job of integrating these two viewpoints into a cohesive whole and the result is a book I can heartily recommend. The author makes liberal use of figures and explains what is being done at a high level first, showing pseudocode before actually showing the Java code. Discussions on the inner workings of the algorithms follow.

Note that use is made of higher level libraries such as Lucene when they are available, because this is a book for professionals after all, and your boss would not be pleased if you reinvented the wheel every time you implemented an algorithm. But, don't worry, the explanation behind the code is there too. Another good book that is language agnostic that makes a good companion to this one is Machine Learning (Mcgraw-Hill International Edit). It is an oldie but a goodie.

The product description does not yet show the table of contents so I do that next:

Chapter 1. What is the intelligent web?
Section 1.1. Examples of intelligent web applications
Section 1.2. Basic elements of intelligent applications
Section 1.3. What applications can benefit from intelligence?
Section 1.4. How can I build intelligence in my own application?
Section 1.5. Machine learning, data mining, and all that
Section 1.6. Eight fallacies of intelligent applications
Section 1.7. Summary
References

Chapter 2. Searching
Section 2.1. Searching with Lucene
Section 2.2. Why search beyond indexing?
Section 2.3. Improving search results based on link analysis
Section 2.4. Improving search results based on user clicks
Section 2.5. Ranking Word, PDF, and other documents without links
Section 2.6. Large-scale implementation issues
Section 2.7. Is what you got what you want? Precision and recall
Section 2.8. Summary
Section 2.9. To do
References

Chapter 3. Creating suggestions and recommendations
Section 3.1. An online music store: the basic concepts
Section 3.2. How do recommendation engines work?
Section 3.3. Recommending friends, articles, and news stories
Section 3.4. Recommending movies on a site such as[...]
Section 3.5. Large-scale implementation and evaluation issues
Section 3.6. Summary
Section 3.7. To Do
References

Chapter 4. Clustering: grouping things together
Section 4.1. The need for clustering
Section 4.2. An overview of clustering algorithms
Section 4.3. Link-based algorithms
Section 4.4. The k-means algorithm
Section 4.5. Robust Clustering Using Links (ROCK)
Section 4.6. DBSCAN
Section 4.7. Clustering issues in very large datasets
Section 4.8. Summary
Section 4.9. To Do
References

Chapter 5. Classification: placing things where they belong
Section 5.1. The need for classification
Section 5.2. An overview of classifiers
Section 5.3. Automatic categorization of emails and spam filtering
Section 5.4. Fraud detection with neural networks
Section 5.5. Are your results credible?
Section 5.6. Classification with very large datasets
Section 5.7. Summary
Section 5.8. To do
References
Classification schemes
Books and articles

Chapter 6. Combining classifiers
Section 6.1. Credit worthiness: a case study for combining classifiers
Section 6.2. Credit evaluation with a single classifier
Section 6.3. Comparing multiple classifiers on the same data
Section 6.4. Bagging: bootstrap aggregating
Section 6.5. Boosting: an iterative improvement approach
Section 6.6. Summary
Section 6.7. To Do
References

Chapter 7. Putting it all together: an intelligent news portal
Section 7.1. An overview of the functionality
Section 7.2. Getting and cleansing content
Section 7.3. Searching for news stories
Section 7.4. Assigning news categories
Section 7.5. Building news groups with the NewsProcessor class
Section 7.6. Dynamic content based on the user's ratings
Section 7.7. Summary
Section 7.8. To do
References

Appendix A. Introduction to BeanShell
Section A.1. What is BeanShell?
Section A.2. Why use BeanShell?
Section A.3. Running BeanShell
References

Appendix B. Web crawling
Section B.1. An overview of crawler components
References

Appendix C. Mathematical refresher
Section C.1. Vectors and matrices
Section C.2. Measuring distances
Section C.3. Advanced matrix methods
References

Appendix D. Natural language processing
References

Appendix E. Neural networks
References
34 of 39 people found the following review helpful
2.0 out of 5 stars Misled by the other reviews March 22, 2011
Format:Paperback
I selected this book as the text for a course on the basis of the earlier reviews. They sounded so good. Covers the concepts and includes concrete code that does what the concepts intend. But the book didn't live up to the reviews.

First of all, the code uses BeanShell as a way to run the examples. BeanShell is a neat idea. It's one of a number of languages that move Java closer to being a scripting language. But it's not necessary for the book's purposes. It's a bit of a pain to install, and it takes a while to get used to. In the end it's an unnecessary distraction. It's far simpler to run the examples in Eclipse with the "scripts" entered as the body of a main() method.

The preceding is a relatively minor point, but in some ways it illustrates some of the problems I had with the book. It focuses too much on the code. Yes, it's nice to have code that does what one is trying to describe, but code is not a substitute for a good explanation. In many places the book provides inadequate descriptions of the concepts, presumably on the grounds that one can just read the code. But code is not tutorial. Code itself must be commented to be understandable. And code cannot replace a good intuitive description of the important ideas.

Furthermore, the code (and the output) takes up too much space in the book. There are pages of output when a few lines would suffice, and there are pages of code when a well-constructed paragraph would do. Pearson's coefficient is a good example. There is approximately a page of code to do the calculation. There is also half a page of code-level comments--e.g., "The method getAverage is self-explanatory; it calculates the average of the vector that's provided as an argument." But there is no straightforward description of what's going on in the computation as a whole.

Further, there are frequent references to other books and papers, as if making such references excuses the author from explaining an idea. For example, the Pearson's coefficient discussion includes this sentence: "There's a smarter way to do this that avoids a plague of numerical calculations called the roundoff error; read the article on the corrected two-pass algorithm by Chan, Golub, and LeVeque." That's all that's said about roundoff error, the smarter way to do the calculation, or other numerical computation issues. Referring the reader to an article is not good enough. If it's worth discussing, discuss it. If it's not important enough to discuss, then don't refer the reader elsewhere except as enrichment. The author does this over and over.
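
To make the point concrete, here is a minimal Python sketch of what a two-pass computation of Pearson's coefficient with a correction term might look like. It is not the book's Java code and not a transcription of the Chan, Golub, and LeVeque article; it only applies the general idea (compute the means first, then work with deviations, and subtract a correction that would be zero in exact arithmetic). The function name `pearson_two_pass` is hypothetical.

```python
import math


def pearson_two_pass(xs, ys):
    """Pearson correlation via a two-pass scheme with a correction term."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    # Pass 1: the means.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Pass 2: deviations from the means, which avoids subtracting
    # two huge, nearly equal sums of squares.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # In exact arithmetic these sums are zero; in floating point they
    # capture the rounding error of the means and are used as a correction.
    sum_dx = sum(dx)
    sum_dy = sum(dy)

    sxx = sum(d * d for d in dx) - sum_dx * sum_dx / n
    syy = sum(d * d for d in dy) - sum_dy * sum_dy / n
    sxy = sum(a * b for a, b in zip(dx, dy)) - sum_dx * sum_dy / n

    if sxx <= 0 or syy <= 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

The design choice that matters is working with deviations rather than raw sums of squares: when the values are large and close together, the one-pass formula loses most of its significant digits to cancellation.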

Another example of what I would consider the book's conceptual superficiality is its treatment of Bayes' Theorem. Bayes' Theorem is mentioned many times, but the only explanation is a half page translation of Bayes' Theorem into words--with no explanation of why Bayes' Theorem is true. Understanding why Bayes' Theorem is true should have been an important lesson.
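
As an illustration of the kind of explanation that is missing, here is a minimal Python sketch showing why Bayes' Theorem holds: the joint probability P(A and B) can be written either as P(A|B)P(B) or as P(B|A)P(A), so dividing by P(B) gives P(A|B) = P(B|A)P(A)/P(B). The email counts below are made up purely for illustration, and the helper name `bayes_posterior` is hypothetical.

```python
from fractions import Fraction


def bayes_posterior(prior, likelihood, evidence):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical counts of emails, keyed by (class, contains_the_word).
counts = {
    ("spam", True): 30,
    ("spam", False): 10,
    ("ham", True): 5,
    ("ham", False): 55,
}
total = sum(counts.values())
spam_total = counts[("spam", True)] + counts[("spam", False)]
word_total = counts[("spam", True)] + counts[("ham", True)]

p_spam = Fraction(spam_total, total)                            # P(A)
p_word = Fraction(word_total, total)                            # P(B)
p_word_given_spam = Fraction(counts[("spam", True)], spam_total)  # P(B|A)

# Direct answer: among emails containing the word, the fraction that is spam.
direct = Fraction(counts[("spam", True)], word_total)

# The same answer obtained through Bayes' Theorem.
via_bayes = bayes_posterior(p_spam, p_word_given_spam, p_word)
```

Both routes count the same 30 emails that are spam and contain the word; they only differ in which total they divide by first, which is all the theorem says.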

A similar criticism holds for Decision Trees except even more so. There is no discussion at all about how to construct a decision tree. Such a discussion would have been a perfect place to introduce the notion of entropy. But the word "entropy" doesn't even appear in the book.
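
For context, here is a minimal Python sketch of the entropy idea referred to above: a common way to build a decision tree is to choose, at each node, the split that most reduces the entropy of the class labels (the information gain). This is a generic illustration, not something from the book; the function names and the sample labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, groups):
    """Entropy reduction when `labels` is partitioned into `groups`.

    `groups` is a list of label lists, one per branch of the candidate split.
    """
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Hypothetical example: a split that separates the two classes perfectly.
labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]
useless_split = [["yes", "no"], ["yes", "no"]]
gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
```

A split that separates the classes completely recovers the full one bit of entropy, while a split that leaves each branch as mixed as the parent gains nothing; a tree builder would prefer the former.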

All in all, I found the book disappointing. If one wants to build software that performs some of the functions discussed, the book can help. But if one wants to understand the principles underlying such software, the book is not the right place to go.
Most Recent Customer Reviews
5.0 out of 5 stars Excellent book
This book was my introduction to machine learning after many years in aerospace and radar. It allowed me to ramp up and become productive in this new field.
Published 6 months ago by Andrew
5.0 out of 5 stars Fine tuned text which neatly balances between science and craft
First and foremost, congratulations to the authors on such a high-quality text and narrative style, which artfully slaloms between pop and sci.
Published 8 months ago by Srecko Gnjidic
3.0 out of 5 stars Open source and Java oriented cook book for searching, recommending,...
Artificial intelligence books are usually very aesthetic in deep thought and mathematics. This book is different: it's a fast and astonishingly easy "hands on" introduction.
Published 23 months ago by ws__
5.0 out of 5 stars Wonderful book - that gently introduces the algorithms using real life...
This is a book that comes from an author whom I know personally and worked with.
A superb book in its own right, which comes from an author who is an authority on the subject...
Published 24 months ago by Sumit Pal
5.0 out of 5 stars Best title I've seen
I did some semtech stuff about 10 years ago and am back doing it now. The lit then was almost nonexistent and what did exist was mostly a cynical effort to extract cash from...
Published on May 6, 2011 by Carey
4.0 out of 5 stars Very good algorithms book
There's only one reason why I don't give this book 5: I don't like Java ;) For everything else in the book, 5+.
Published on April 8, 2011 by Andrew Derevo
4.0 out of 5 stars Good Book
Algorithms of the Intelligent Web attempts to provide a summary of current techniques for providing intelligence for web applications.
Published on March 9, 2011 by Andrei Mouravski
4.0 out of 5 stars good resource
I was delighted by the theory the book applies to achieving an ideal smart application.
I didn't feel as good about the working tool examples; they were complicated or not well...
Published on March 2, 2011 by Luiz
5.0 out of 5 stars Instructive and entertaining review of algorithms and techniques...
This book is not a "heavy" Artificial Intelligence tome. Instead it is a thought-provoking, instructive and very enjoyable read.
Published on December 8, 2009 by Robin Hillyard
5.0 out of 5 stars Great insight into intelligent web designs
I do have a decent background in crawling and indexing websites. This book was a pleasant read and gave to-the-point, no-nonsense technical guidance for IR techniques.
Published on October 18, 2009 by SwDev in NYC