
MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems 1st Edition

18 customer reviews
ISBN-13: 978-1449327170
ISBN-10: 1449327176

Editorial Reviews

Book Description

Building Effective Algorithms and Analytics for Hadoop and Other Systems --This text refers to an alternate Paperback edition.

About the Author

Donald Miner serves as a Solutions Architect at EMC Greenplum, advising and helping customers implement and use Greenplum's big data systems. Prior to working with Greenplum, Dr. Miner architected several large-scale and mission-critical Hadoop deployments with the U.S. Government as a contractor. He is also involved in teaching, having previously instructed industry classes on Hadoop and a variety of artificial intelligence courses at the University of Maryland, BC. Dr. Miner received his PhD from the University of Maryland, BC in Computer Science, where he focused on Machine Learning and Multi-Agent Systems in his dissertation.

Adam Shook is a Software Engineer at ClearEdge IT Solutions, LLC, working with a number of big data technologies such as Hadoop, Accumulo, Pig, and ZooKeeper. Shook graduated with a B.S. in Computer Science from the University of Maryland Baltimore County (UMBC) and took a job building a new high-performance graphics engine for a game studio. Seeking new challenges, he enrolled in the graduate program at UMBC with a focus on distributed computing technologies. He quickly found development work as a U.S. government contractor on a large-scale Hadoop deployment. Shook is involved in developing and instructing training curriculum for both Hadoop and Pig. He spends what little free time he has working on side projects and playing video games.


Product Details

  • Paperback: 230 pages
  • Publisher: O'Reilly Media; 1 edition (December 22, 2012)
  • Language: English
  • ISBN-10: 1449327176
  • ISBN-13: 978-1449327170
  • Product Dimensions: 7 x 0.6 x 9.2 inches
  • Shipping Weight: 12.6 ounces
  • Average Customer Review: 3.9 out of 5 stars (18 customer reviews)
  • Amazon Best Sellers Rank: #274,654 in Books


Customer Reviews

Most Helpful Customer Reviews

By Mark D. LaDue on January 19, 2013 (83 of 96 people found this review helpful)
Format: Paperback
In the 1990s O'Reilly books had a well-earned reputation for quality. O'Reilly authors such as Simson Garfinkel explained technical topics with precision, clarity, and wit. I proudly kept a whole shelf of O'Reilly books at work, and I imbibed copious java from their tenth anniversary mug. I'm sorry to see that O'Reilly's traditional quality has gone the way of the Internet bubble. MapReduce Design Patterns represents the absolute nadir of technical writing, and it never should have been published in its current form.

One of the most poorly written parts of the book is Appendix A on Bloom filters. As I was writing my original review of the book, I thought it might be helpful to point readers to a better explanation of the topic. Turning to Wikipedia as a potential reference, I was struck by the number of similarities between it and Appendix A. It now appears that this appendix plagiarizes the Wikipedia article "Bloom filter." To see this, compare the opening paragraph of the Wikipedia article (January 19, 2013) to the first two paragraphs of the book's appendix (which you can see in the sample pages here):

Wiki: A Bloom filter, conceived by Burton Howard Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. (Paragraph 1, sentence 1)

MRDP: Conceived by Burton Howard Bloom in 1970, a Bloom filter is a probabilistic data structure used to test whether a member is an element of a set. (Page 221, paragraph 1, sentence 1)

Wiki: False positive retrieval results are possible, but false negatives are not; i.e. a query returns either "inside set (may be wrong)" or "definitely not in set".
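The behavior described in the quoted Wikipedia sentence can be illustrated with a toy Bloom filter. This is a minimal sketch, not code from the book or from Wikipedia; the class name `BloomFilter`, the method `might_contain`, and the use of salted SHA-256 hashes are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a fixed-size bit array probed by k hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item: str):
        # Derive k bit positions by salting one hash with the probe index
        # (an illustrative choice, not a tuned hashing scheme).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # False means "definitely not in set"; True means "possibly in set".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ["hadoop", "pig", "hive"]:
    bf.add(word)

print(bf.might_contain("hadoop"))  # always True: added items are never missed
```

Because bits are only ever set and never cleared, an item that was added always tests as present, so false negatives cannot occur. An item that was never added can still test as present if other items happen to have set all of its bit positions, which is the false positive case.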
By Varun Sharma on January 23, 2013 (13 of 15 people found this review helpful)
Format: Paperback Verified Purchase
The book gives a good introduction to MapReduce design patterns. But what I found really missing are good examples.
I had studied Jimmy Lin's book [...] before I read this, and it gives some really good examples of algorithm design. I was hoping to find something that focused on how some of the design patterns can be leveraged to implement more complicated, non-trivial algorithms in MapReduce more effectively.
But I feel that the book uses fairly straightforward algorithms to explain the patterns and does not go deep.
Another thing I did not like is that the book is too Hadoop-specific and ignores other MapReduce implementations that are becoming very popular.
Overall, the book is a good step toward introducing patterns and algorithms more systematically in the MapReduce programming paradigm. It gives a good survey of some of the emerging areas in the last few chapters. The chapter on Meta Patterns was my favorite, as it gives some good introductory material on building more complicated pipelines using MapReduce and on optimizing the runtime of bigger pipelines.
By jcsenciales on April 29, 2013 (10 of 12 people found this review helpful)
Format: Paperback
It's a good book that shows you a lot of MapReduce patterns using Hadoop.

But the main trouble is that you cannot trust the example source code at all. I cloned the code from GitHub onto my Mac and found several bugs.

https://github.com/adamjshook/mapreducepatterns

I'm running the book's code on a MacBook with:

- hadoop-1.0.4
- Mac OS ver 10.6.8
- Java ver "1.6.0_43"
- Eclipse
- Data for running the examples from the Stack Exchange Data Dump (Dec 2011 update)

For now, these are the bugs I've found:

Page 31:
The error is in the MedianStdDevCombiner code.
I suspect a bug in this full example because when you execute it, you obtain results different from the previous plain median and standard deviation example on the same input data. The values obtained are nearly double those of the previous example, when they should be the same.

Page 35-36:
The error I found is in the inverted index example.
In the mapper function, if "getWikipediaURL" returns null, you get a NullPointerException. The result of this function needs to be checked for null before the "link" variable is set.

Page 117-118:
The ReduceSideJoinWithBloomDriver code from GitHub contains no reference to loading the Bloom filter from any argument [something like DistributedCache.addCacheFile(...... ]. This file is nearly a copy/paste of the previous ReduceSideJoin.java.

Page 122:
In ReplicatedJoinMapper you always get a java.io.FileNotFoundException because this code tries to read and decompress a folder, not a concrete "file.gz" inside this folder. You only need to add an index to your files inside the DistributedCache.
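
The page 31 report rests on a general expectation: adding a combiner should not change a job's output. One way to check that expectation is sketched below in plain Python rather than Hadoop. This is not the book's MedianStdDevCombiner code and makes no claim about where its bug lies; the function names `combine`, `merge`, and `median_and_stddev`, the value-to-count histogram representation, and the sample data are assumptions chosen for illustration.

```python
from collections import Counter
import math


def combine(values):
    """Combiner step: collapse raw values into a value -> count histogram."""
    return Counter(values)


def merge(histograms):
    """Reducer step: sum counts per value across all partial histograms."""
    total = Counter()
    for hist in histograms:
        total.update(hist)
    return total


def median_and_stddev(hist):
    """Compute the median and sample standard deviation from a histogram."""
    n = sum(hist.values())
    mean = sum(value * count for value, count in hist.items()) / n
    sq_dev = sum(count * (value - mean) ** 2 for value, count in hist.items())
    stddev = math.sqrt(sq_dev / (n - 1)) if n > 1 else 0.0

    # Expand the histogram in sorted order to find the middle value(s).
    ordered = [v for v in sorted(hist) for _ in range(hist[v])]
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    return median, stddev


# Hypothetical sample data, split across two "mappers" for the combined path.
lengths = [10, 20, 20, 30, 40, 40, 40, 55]
direct = median_and_stddev(Counter(lengths))
combined = median_and_stddev(merge([combine(lengths[:3]), combine(lengths[3:])]))
```

With a mergeable representation like this, `direct` and `combined` agree on the same input. Running the same comparison against a real job, with and without its combiner, is a quick way to confirm a discrepancy like the one reported.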
By Charles Feduke on January 16, 2013 (9 of 11 people found this review helpful)
Format: Kindle Edition
This book is a good catalog of the different patterns any big data solutions programmer should know in order to do their job effectively. While the authors admit that writing some of these patterns as map/reduce jobs on Hadoop can be counterproductive when tools like Pig are available, they make the compelling argument that understanding these patterns is still important.

The technical examples in the book are sometimes missing blocks of code, which, while easily derived, may be a source of frustration for some readers. (I have my implementations of the exercises on GitHub, under my username of cfeduke; I learn best by doing, so keying in and executing examples is paramount.)

I'd had a moderate level of experience with Hadoop, from 0.18 to 1.x, before tackling this book. I felt that this book taught me a fair amount about the guts of writing a map/reduce job, though had I not had a solid foundation in Hadoop, the examples might have been difficult to grok.

The authors chose to use Stack Overflow community data to demonstrate the patterns presented, and I felt that was an excellent decision, as it's easy to derive other queries to answer - and implement - having some knowledge of the corpus.