- Series: Pro
- Paperback: 440 pages
- Publisher: Apress; 1st edition (June 17, 2009)
- Language: English
- ISBN-10: 1430219424
- ISBN-13: 978-1430219422
- Product Dimensions: 7 x 1 x 9.2 inches
- Shipping Weight: 1.3 pounds
- Average Customer Review: 7 customer reviews
- Amazon Best Sellers Rank: #2,766,025 in Books
Pro Hadoop, 1st Edition
About the Author
Jason Venner has more than 20 years of experience in software engineering, management, design, and coding. He has been a vice president, director, and consultant. Currently, his interests and expertise are in Java, Hadoop, cloud computing, and more. For more, visit www.prohadoopbook.com.
Top customer reviews
The rest of the chapters mostly concentrated on the minute details of configuring a host of different parameters.
I was looking for a book that gave readers more on the conceptual side of Hadoop and MapReduce, with examples of solving different flavours of problems.
I just skimmed the chapters from Chapter 3 onwards, since I found the configuration coverage too detailed.
However, considering how difficult it can be to set up Hadoop, maybe the configurations discussed from Chapter 3 onwards are essential.
Now that Cloudera has come up with an easy way to install Hadoop, going through configuration and setup at a very detailed level in a book seems unnecessary.
The pictures and diagrams (though there are very few in this book) are not very helpful and, I felt, were not thoughtfully made.
The Kindle edition of this book is perfectly readable on my 6" Kindle 2, although the code samples are significantly lighter than the rest of the text.
I recommend this 10/10. (Make sure you get the book with the latest Hadoop API updates.)
Chapter One provides detailed instructions on how to install Hadoop and how to run a test to verify that everything went fine. The author mentions that Hadoop 0.19 works best with Sun's JDK 1.6 and that although Hadoop will work on Windows with Cygwin installed, you have to be careful when specifying file paths.
Chapters Two and Three introduce basic concepts pertaining to MapReduce Jobs and Multimachine Clusters, respectively, and how "master" and "slave" nodes are configured. Chapter Four teaches you how to install, configure, and troubleshoot Hadoop Distributed File System.
Chapters Five and Six provide tutorials on the different types of inputs and outputs that a Hadoop MapReduce job can handle, and how to tune MapReduce jobs.
Chapter Seven is an excellent tutorial on how to unit test and debug MapReduce jobs, while Chapter Eight discusses more advanced MapReduce techniques for addressing more complex application requirements.
Chapter Nine walks you through the evolution of a (somewhat boring) real-world application, discussing rationales behind design changes, etc. Chapter 10 provides a few descriptive paragraphs each for various projects related to Hadoop (e.g., Pig, HBase, Mahout, ZooKeeper, etc.). Finally, Appendix A is a detailed discussion of the JobConf API, JobConf being the object that controls information relating to a MapReduce job.
The author does a nice job of explaining what a MapReduce job is and how you can put it to use and get usable data out of seemingly incomprehensible junk. This was instrumental in pitching the idea to upper management.
Chapters 2 through 5 were quite helpful while installing and setting up a cluster (and single instance) of Hadoop. There is a lot of information out on the web, but it is very unstructured and difficult to follow. I don't think we could have done it without help from the book. It is worth mentioning that Cloudera does have a nice virtual machine image that you can download for free which already has everything set up. This VM image could save you a lot of time during a Proof of Concept.
Chapters 8 and 9 further explain different problems and the Hadoop approaches to solving them. I'm not sure how applicable these examples are in the real world, but they definitely illustrate how you should approach a problem that you intend to solve via MapReduce with Hadoop.
Since reading this book, my team and I have successfully built a four-machine Hadoop cluster to process logs from our application so that we can provide better analytics and better predict spammers. Pro Hadoop served as a good reference each time we hit a roadblock.
I'd recommend this book to anyone who is looking to learn more about Hadoop and MapReduce techniques and I'd say it is a must have for anyone who is looking to implement Hadoop.