Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration [Paperback]

Matt Casters, Roland Bouman, Jos van Dongen
4.5 out of 5 stars (10 customer reviews)

List Price: $50.00
Price: $30.56 & FREE Shipping
You Save: $19.44 (39%)

Formats

Format          Amazon Price
Kindle Edition  $28.50
Paperback       $30.56

Book Description

Publication Date: September 28, 2010 | ISBN-10: 0470635177 | ISBN-13: 978-0470635179 | Edition: 1
A complete guide to Pentaho Kettle, the Pentaho Data Integration toolset for ETL

This practical book is a complete guide to installing, configuring, and managing Pentaho Kettle. If you’re a database administrator or developer, you’ll first get up to speed on Kettle basics and how to apply Kettle to create ETL solutions—before progressing to specialized concepts such as clustering, extensibility, and data vault models. Learn how to design and build every phase of an ETL solution.

  • Shows developers and database administrators how to use the open-source Pentaho Kettle for enterprise-level ETL processes (Extracting, Transforming, and Loading data)
  • Assumes no prior knowledge of Kettle or ETL, and brings beginners thoroughly up to speed at their own pace
  • Explains how to get Kettle solutions up and running, then follows the 34 ETL subsystems model, as created by the Kimball Group, to explore the entire ETL lifecycle, including all aspects of data warehousing with Kettle
  • Goes beyond routine tasks to explore how to extend Kettle and scale Kettle solutions using a distributed “cloud”

Get the most out of Pentaho Kettle and your data warehousing with this detailed guide—from simple single table data migration to complex multisystem clustered data integration tasks.


Frequently Bought Together

Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration + Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL + Pentaho Data Integration 4 Cookbook
Price for all three: $101.11



Editorial Reviews

From the Back Cover

The ultimate resource on building and deploying data integration solutions with Kettle

Kettle is a scalable and extensible open source ETL and data integration tool that lets you extract data from databases, flat and XML files, web services, ERP systems, and OLAP cubes. It provides over 120 built-in transformation steps to validate, cleanse, and conform data, as well as numerous options to load data into data warehouses and many other targets. Kettle is a comprehensive, low-cost alternative to traditional data integration tools like Informatica PowerCenter, IBM InfoSphere DataStage, and BusinessObjects Data Integrator.

This book explains in detail how to use Kettle to create, test, and deploy your own ETL and data integration solutions. You'll learn to use Kettle's programs to create transformations and jobs, use version control, audit data, and schedule your ETL solution. Then you'll progress to more advanced concepts such as clustering and cloud computing, real-time data integration, loading a Data Vault model, and extending Kettle by building your own plugins. In addition, you'll find hands-on examples and case studies that show exactly how to put Kettle's features into practice.

  • Explore the components of the Kettle ETL toolset

  • Discover how to install and configure Kettle and connect it to various data sources and targets

  • Design and build every aspect of an ETL solution using Kettle

  • Learn how to load a data warehouse with Kettle

  • Understand the steps for deploying and scheduling ETL solutions

  • Gain the skills to integrate Kettle with third-party products

  • Learn to extend Kettle and build your own plugins

  • Use clustering and cloud computing to scale and improve the performance of your Kettle ETL solutions

  • Find out how to use Kettle for real-time data integration

About the Author

Matt Casters is Founder of Kettle and works as Chief Data Integration at Pentaho, where he leads Kettle software development. Roland Bouman is an application developer focusing on open source web technology, databases, and business intelligence. Jos van Dongen is an independent business intelligence consultant and well-known author, analyst, and presenter.


Product Details

  • Paperback: 720 pages
  • Publisher: Wiley; 1 edition (September 28, 2010)
  • Language: English
  • ISBN-10: 0470635177
  • ISBN-13: 978-0470635179
  • Product Dimensions: 7.5 x 1.4 x 9.2 inches
  • Shipping Weight: 2.4 pounds
  • Average Customer Review: 4.5 out of 5 stars (10 customer reviews)
  • Amazon Best Sellers Rank: #549,762 in Books

Customer Reviews

4.5 out of 5 stars (10)
Most Helpful Customer Reviews
7 of 8 people found the following review helpful
5.0 out of 5 stars It's in the book November 1, 2010
Format:Paperback|Amazon Verified Purchase
I wanted to do this review much sooner but I've been too busy using the book.

Jos and Roland have taken the proven formula they used in Pentaho Solutions and focused it on ETL and Kettle, AKA Pentaho Data Integration. Their magic formula is to seamlessly mix a product user's guide with equal parts of real-world examples and best-practices training. With the addition of Matt Casters, Mr Kettle himself, the depth of knowledge in the book is now equal to its breadth. The result is a book that you can read cover to cover to learn about all aspects of building and deploying ETL solutions, and that is equally useful as a day-to-day reference.

The book is divided into five parts, starting with an obligatory Getting Started. Getting Started, however, goes beyond the traditional "here's how to install it" guide and presents a nice tutorial on the sometimes confusing terminology and practices used in the data world. It explains how Kettle fits into this world and talks about the key concepts in Kettle. The first part ends with an excellent example ETL solution to populate a nontrivial yet easily understood star schema. The example covers fact and dimension tables, change data capture, generating date dimensions, and the ETL jobs and transforms required to populate the data.

The organization of the second part of the book is based on the 34 subsystems of ETL as defined by Ralph Kimball in "The Data Warehouse Lifecycle Toolkit", considered by many (including me) to be the bible of data warehousing. For each subsystem, Kettle Solutions refers to the original chapters that describe the topic and provides examples of how to solve those issues using Kettle. It is a must-have for anyone struggling with the concepts presented in the Kimball book. For the rare cases where Kettle does not have a straightforward solution, the book points you to other open source software that can get the job done. The authors stay true to the task of helping the ETL developer solve real problems regardless of whether Kettle is the complete solution or not.

The first two parts take up about half the book, and if the authors had stopped there, it would be worthy of at least 4 stars. But as in most software development, the base code (in this case, the jobs and transforms) is the easy part and usually the most fun. The really hard stuff comes when you have to deploy your solution into the real world, keep it running, add new capability, explain it to others, and be confident that it is actually working. Part three walks you through the ETL lifecycle with best practices and pitfalls from three people who do (or have done) this for a living. Everything is covered, from development and testing through documentation, monitoring, migrating, and auditing. Part four finishes what part three started by covering performance tuning and scaling, with topics like clustering, partitioning, cloud, and real-time ETL.

The last part is my favorite since it covers the advanced stuff like writing Kettle plugins, complex data formats, integrating data from web services, dynamic ETL, embedding Kettle, etc. There are many ways to extend Kettle via defined APIs and Kettle Solutions covers them all.

As you can probably tell, I like the book and I use it often. I have the luxury of being able to ask Matt questions when I run into trouble. Since writing the book, he now answers "it's in the book" and, needless to say, it is. I can honestly say that having this book sitting on your desk is better than having Matt sitting on your desk. Kettle Solutions is also available for Kindle, which, much to my surprise, has proven very useful. I use it from my iPhone and Mac via the Kindle app, and despite some of the Kindle app's limitations, such as the lack of cut and paste and a good search, it is always available as a reference. The links are live, which is a bonus.

I'm a fifteen-year veteran of building BI software, one of the original Pentaho developers, and currently the Pentaho community guy. I work with Matt Casters; I'm not professionally affiliated with Jos, Roland, or Wiley, and I receive no benefit from this book beyond the satisfaction of having Pentaho software be so well represented. I do consider all three of them good personal friends, and I provide this review at the risk that it may greatly inflate their heads.

Doug Moran
Pentaho
1 of 1 people found the following review helpful
5.0 out of 5 stars Hands on June 17, 2011
Format:Paperback|Amazon Verified Purchase
Excellent book, very much hands-on. If you are sceptical about using this amazing open source ETL solution (or any sustainable open source for that matter), this book will surely put you at ease...
5.0 out of 5 stars Must have! April 11, 2013
Format:Kindle Edition|Amazon Verified Purchase
The guys who developed Pentaho Data Integration, aka PDI or Kettle, teamed up to write a definitive book on the software. Everything you always wanted to know about PDI but didn't know you needed! Plus a Dimensional Modeling chapter written by Kimball himself and an appendix teaching the basics of Data Vault, how to create one, and how to use it to populate a dimensional model. Buy it! It is worth much more than they are asking for!
Most Recent Customer Reviews
3.0 out of 5 stars Already Out of Date
The book is great; the problem is that Pentaho has released updates to its product so frequently that this book is no longer current.
Published 3 months ago by Stephanie Dozier
4.0 out of 5 stars John
While I haven't read this book end-to-end (and never planned to), it is my main reference for everything to do with PDI.
Published 4 months ago by John W Ballment
5.0 out of 5 stars Very good book. Could have been even better ...
I've given it 5 stars because, for me, the value I got out of just the one chapter on the Data Vault was worth the money.
Published 7 months ago by Sanjay Pande
4.0 out of 5 stars Too much Kimball's subsystems, not enough Pentaho.
Finished _Pentaho Kettle Solutions_, finding it generally OK. For me, it spends too much time covering Kimball's "34 Subsystems of ETL", fitting Pentaho into that framework.
Published 11 months ago by AlanB
5.0 out of 5 stars Excellent ETL Book
This book is excellent for anyone who wants an introduction to the DWH and ETL world. I fully recommend it.
Marcos Pierri
Published 21 months ago by Marcos Pierri
4.0 out of 5 stars Good Starting Point
This book is pretty good for learning more about Pentaho. Although online sources are rich, they do not supply enough information, and I think this book is a must-have.
Published on January 2, 2011 by bgul
5.0 out of 5 stars A must-read BI book!
Recently, I received my own review copy of a long-awaited Pentaho book: Pentaho Kettle Solutions - Building Open Source ETL Solutions with Pentaho Data Integration.
Published on December 21, 2010 by Vincent Teyssier