- Paperback: 536 pages
- Publisher: Manning Publications; 1 edition (October 13, 2012)
- Language: English
- ISBN-10: 1617290238
- ISBN-13: 978-1617290237
- Product Dimensions: 7.4 x 1.3 x 9.2 inches
- Shipping Weight: 1.9 pounds
- Average Customer Review: 4.0 out of 5 stars (17 customer reviews)
- Amazon Best Sellers Rank: #1,063,274 in Books
Hadoop in Practice: Includes 85 Techniques 1st Edition
About the Author
Alex Holmes is a senior software engineer with extensive expertise in solving big data problems using Hadoop. He has presented at JavaOne and Jazoon and is a technical lead at VeriSign.
Top Customer Reviews
None of the installation instructions in the book work with the newer versions of the applications. In some cases the entire idea of how you would run and use a tool has changed. Also, the way that HDFS and MapReduce work has changed since YARN was added, so the book's explanation of that is out of date.
The book often omits important details, like which jar you need for a particular piece of code. Classpath and dependency issues are always a nightmare to deal with, and the book offers little help with them. The author should list everything that you would put in a Maven dependency. He often omits the import lines in Java code, so you have little idea which class he is referring to.
There are often times when he requires you to use software he wrote himself, such as the "File Slurper". I am very wary of using code like that: if it doesn't have the support of the Apache Hadoop community, it's very likely to be out of date and unsupported sooner or later. I skipped any chapter I saw like that. I kept seeing references to a bash script called "run.sh" in the book and could not figure out what he was referring to. I could find no such shell script in any software I downloaded. I think it must be a bash script in his git project; as I said, I don't want to depend on any code that is not supported by the community.
There were also COUNTLESS compatibility issues when I tried to do anything. Almost no two pieces of Hadoop software work together out of the box. It's so bad that using anything besides Cloudera's Hadoop was practically impossible. I am not a stupid guy either.
Here is my advice to you:
1. Use Cloudera's pre-built CDH VM, at least at first. I used the CDH 4.5 pre-built VM, and that is the only thing I got to work.
2. Do not follow any installation instructions in the HIP book.
3. Do not follow any installation instructions on the Hadoop websites.
4. Only follow installation and reconfiguration instructions found in Cloudera's manual for CDH 4.5 installation.
5. Do not deviate from the standard configuration. For example, I encountered a lot of bugs when I tried switching to Java 7.
6. You might want to hold off on buying this book until a newer edition is released.
7. If you use Maven for dependencies, make sure you get your Hadoop dependencies from the Cloudera repository, not Maven Central.
8. Instead of reading the book, just go to each of the Hadoop projects' websites. Skip their installation instructions, as I said before, but try to follow any tutorials you see, and practice using everything you read.
9. Only after you figure out how to do everything should you try to install things from scratch on a new VM. If you try to set up a VM on your own from the start, the frustration will kill your motivation to learn Hadoop.
The one thing this book was good for was giving me ideas of what things to try, which is why I give it two stars instead of one.
Java is definitely a prerequisite. The book says you should have some knowledge of HDFS and MapReduce, yet chapter one starts with "what is hadoop." It reads better as a review than as an intro and doesn't fit with the rest of the book. It also assumes you haven't installed or started Hadoop. You really should read an intro book first and skim chapter one.
I particularly liked the chapters on MapReduce and performance. The overview of iostat and vmstat was clear and better than in many UNIX books. I also liked the AST explain plan. The techniques about when to use joins and sorts seemed like they would be in "Hadoop in Action" as well. Yet the comparison of different types fit well.
Each chapter begins with a conceptual overview which was very useful. The book also contains many diagrams to add clarity.
Disclosure: I received a copy of this book from the publisher in exchange for writing this review.
This is a must-buy for any serious Hadoop user or developer.
Most Recent Customer Reviews
However, the comments around the charts are not that easy to read because of the font and color selected.