
Hadoop: The Definitive Guide Paperback

ISBN-13: 978-1449389734 ISBN-10: 1449389732 Edition: Second Edition

Formats and prices:
Kindle: price not shown
Paperback: new from $27.99, used from $12.29

There is a newer edition of this item.



Product Details

  • Paperback: 628 pages
  • Publisher: Yahoo Press; Second Edition (October 12, 2010)
  • Language: English
  • ISBN-10: 1449389732
  • ISBN-13: 978-1449389734
  • Product Dimensions: 9.5 x 7 x 1.5 inches
  • Shipping Weight: 2.2 pounds
  • Average Customer Review: 4.1 out of 5 stars (15 customer reviews)
  • Amazon Best Sellers Rank: #322,275 in Books

Editorial Reviews

About the Author

Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly, java.net, and IBM's developerWorks, and has spoken at several conferences, including ApacheCon 2008, where he spoke on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.



Customer Reviews

4.1 out of 5 stars (15 customer reviews)
5 star: 8
4 star: 4
3 star: 0
2 star: 2
1 star: 1
"The standard Hadoop book for documenting and learning Hadoop." (D. Zanter)
"This is a good overview book of Hadoop, how it works, and the software in the Hadoop ecosystem." (Al)
"Understanding the concepts is the most important thing and this book provides this very nicely." (David Mark Schramm)

Most Helpful Customer Reviews

25 of 27 people found the following review helpful By Eric Sammer on June 13, 2011
Format: Paperback
The second edition of the already fantastic Hadoop: The Definitive Guide adds the last few missing bits to the best Hadoop reference out there.

For those not familiar with the first edition, Hadoop: The Definitive Guide is exactly what it claims to be. If you're not already familiar with Hadoop, the first and second chapters (Meet Hadoop and MapReduce, respectively) take you through the basics in both concept as well as code. For those used to writing data processing applications, the rationale behind Hadoop and why it's useful are immediately apparent. If you've already been exposed to Hadoop, these chapters may be redundant but they're worth reading anyway the first time through.

The chapter on HDFS does a great job of explaining the underbelly of Hadoop's distributed file system, including the Java APIs. The section on Hadoop IO is probably introduced a bit too early - Hadoop newbies probably don't care about compression and serialization prior to reading about map reduce - but it is excellent nonetheless in its detail. That said, you'll *really* want to go back and read it to understand the details of how compression codecs work after you learn more about map reduce.

The "Writing a Map Reduce Application" chapter is probably the one existing users of Hadoop will skip. First timers will definitely get a lot out of a step-by-step walkthrough of a Java MR job from beginning to end.

The chapters on how map reduce works, types and formats (including input / output format details), and the advanced features (counters, sorting, the distributed cache, join libraries) are the ones you'll reread and reference constantly.
27 of 31 people found the following review helpful By L. Wickland on May 22, 2011
Format: Paperback
Hadoop's MapReduce and HBase went through a major API change right around the time this book was finishing up. Consequently, if you try to use the examples in the book as a guide while developing against either the Apache Hadoop latest release or against Cloudera's CDH3, you'll find a mountain of frustration in the form of deprecated or entirely deleted classes.
11 of 12 people found the following review helpful By David Mark Schramm on July 20, 2011
Format: Paperback Verified Purchase
This book provides an excellent in-depth overview of all aspects of Hadoop with how-to examples that are easy to follow. It is well written, thorough and exactly what I needed to architect and build a Hadoop-based solution. Related technologies such as Hive, HBase, Sqoop, Pig and Zookeeper are also covered in decent depth.

Other reviewers gave poor reviews due to the APIs being not up to date, which I think is unfair. Those new APIs are still only available in early unstable Hadoop versions, so current developers are best served to use the earlier APIs. The book gives samples with new APIs and shows very clearly the API changes which are minor. The concepts are identical, but a few classes have been combined into a more cohesive "Context" class in the new APIs.

So, for example, to write a data record you call "context.write(...);" rather than "output.collect(...);" with identical parameters. The structure of applications and the concepts are not changed. The changes to the syntax of Java calls are trivial and covered in the book very clearly. What is the big deal? Understanding the concepts is the most important thing and this book provides this very nicely.
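
To make the point above concrete, here is a minimal sketch of the two calling styles. It does not use the real Hadoop classes: the `OutputCollector` and `Context` interfaces below are simplified stand-ins written for this illustration, and `ApiStyles`, `oldStyleMap`, and `newStyleMap` are hypothetical names. In Hadoop itself, the old API emits records through `OutputCollector.collect(key, value)` and the new API emits them through `Context.write(key, value)`; the real methods also declare checked exceptions, which are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

public class ApiStyles {
    // Stand-in for the old API's output collector (assumption: not the real Hadoop interface).
    public interface OutputCollector<K, V> {
        void collect(K key, V value);
    }

    // Stand-in for the new API's context object (assumption: not the real Hadoop class).
    public interface Context<K, V> {
        void write(K key, V value);
    }

    // Old style: records are emitted through a separate collector argument.
    public static void oldStyleMap(String line, OutputCollector<String, Integer> output) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                output.collect(word, 1);
            }
        }
    }

    // New style: the same records are emitted through the context, with the same parameters.
    public static void newStyleMap(String line, Context<String, Integer> context) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                context.write(word, 1);
            }
        }
    }

    public static void main(String[] args) {
        List<String> oldOut = new ArrayList<>();
        List<String> newOut = new ArrayList<>();
        oldStyleMap("to be or not to be", (k, v) -> oldOut.add(k + "=" + v));
        newStyleMap("to be or not to be", (k, v) -> newOut.add(k + "=" + v));
        // Both styles emit the same sequence of (word, 1) records.
        System.out.println(oldOut.equals(newOut));
    }
}
```

The map logic is identical in both methods; only the object that receives the record and the name of the emitting method differ.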

I would recommend this book to anyone who is new to Hadoop and needs to learn it in depth.
27 of 37 people found the following review helpful By Peter Harrington on November 18, 2010
Format: Paperback Verified Purchase
The APIs in this book were all outdated by the time the book hit the shelf. The authors did recognize this and mention it in the book; however, you don't need 400 pages to understand the map-reduce concepts.
I think it's a bad idea trying to publish a book on a rapidly changing community project like Hadoop. I found the Cloudera (free) training materials much more helpful.
1 of 1 people found the following review helpful By JUAN JOSE DE LEON on June 8, 2012
Format: Paperback
The book has lots of examples and footnote resources that enrich the content. Some people recommend watching Cloudera training videos first and then reading this book if you are a beginner, and I agree.
2 of 3 people found the following review helpful By P. Bhowmick on March 12, 2012
Format: Kindle Edition Verified Purchase
I had bought this book (Kindle edition), hoping it would have a good intro to programming for MapReduce. It is not. This book tries to be a lot of things: a Hadoop administration book, MapReduce programming book using the Java API, an HDFS reference book, Hadoop Streaming book and so on and so forth but succeeds on no front. The examples are trivial and it barely skims the Hadoop Java API.
Format: Paperback Verified Purchase
The book quality was excellent! I bought the used book, but it looked like new. Wow! So satisfied!
By Robert E. Ryan on August 10, 2013
Format: Paperback Verified Purchase
I only read the first few chapters, but this material provided all that was necessary for a good understanding of what the Hadoop file system does and why it is valuable. Excellent read.
