Pentaho Data Integration 4 Cookbook Paperback – June 23, 2011
About the Author
Adrián Sergio Pulvirenti
Adrián was born in Buenos Aires, Argentina, in 1972. He earned his Bachelor's degree in Computer Sciences at UBA, one of the most prestigious universities in South America.
He has dedicated more than fifteen years to developing desktop and web-based software solutions. Over the last few years he has been leading integration projects and the development of BI solutions.
María Carina Roldán
María Carina was born in Esquel, Argentina, in 1970. She earned her Bachelor's degree in Computer Science at UNLP in La Plata and then moved to Buenos Aires, where she has lived since 1994.
She has worked as a BI consultant for more than ten years. Over the last four, she has been dedicated full time to developing BI solutions using the Pentaho Suite. Currently she works for Webdetails, one of the main Pentaho contributors.
She is the author of Pentaho 3.2 Data Integration: Beginner's Guide published by Packt Publishing in April 2010.
Top customer reviews
Kettle itself is intuitive enough to learn, so this book could serve as a good resource even for Kettle novices. (They'll have to self-study other materials, perhaps the product documentation, to get off the ground.) Once a basic level of expertise is obtained, the patterns and practices given in this book will be of use.
Use cases for common scenarios are well represented. (Examples: how to read data from a database, dealing with fixed-format and comma-delimited files, working with XML, consuming a web service, generating reports.) These were all expected, so no extra credit for these topics, though it's nice to have them all documented in one place for future reference. There are also quite a few recipes for things I'd never encountered before, like parsing unstructured files (e.g. a Log4j log file), writing out JSON, producing Cartesian products given two lists, and matching values using fuzzy comparison logic. These topics were pleasant surprises to find; I can imagine practical uses for many of them. As an experienced ETL user, I can assure you anyone doing real production work with an ETL tool will find a few things of value here.
If you have a need for integration work and don't enjoy a lot of low-level coding, you probably owe it to yourself to try Kettle or another ETL product. If you're using ETL for anything beyond dirt-simple scenarios, you'll probably save yourself some time and effort by reviewing the best practices contained here.
This versatile tool is a must for all people working with data integration.
In PDI, transformations and jobs are the means of carrying out a task, including reading, writing, manipulating and integrating data and performing mathematical or logical operations. All of this is typical of an ETL tool (where ETL stands for Extract, Transform and Load).
Do you need to move data from an Excel file to a database, or from a database to a text file?
Do you need to extract data from an LDAP server, FTP, mail, log file, compressed file, web service or web site?
Must all this be done regularly and automatically?
Would it be cool to be notified by email if the process fails?
Sure, you can do it in a lot of ways, but an ETL tool gives you the necessary help.
In addition, an open source ETL tool like Pentaho Data Integration has a strong and skilled community behind it to help you.
This book provides a lot of step-by-step examples (called "recipes") with a lot of practical, useful and very smart hints and strategies for developing transformations and jobs.
New steps (a step is a basic task, for example reading from a file, sorting, grouping, calculating, ...) are very well described and explained.
The chapters of this book cover in depth all you need to know to understand the software, write your own transformations and quickly become productive.
I found the space dedicated to the following topics very useful:
- reading and writing files: unstructured and structured text files, Excel and OpenOffice spreadsheets
- XML files and validation with DTDs and XSD schemas
- using the fuzzy match step
- reuse and flexibility of transformations (named parameters, variables, mappings)
- sending email with a log about the status of the execution
- file management: retrieving files from servers such as FTP, copying, moving, deleting, comparing
- integration of Kettle with Pentaho Suite (Pentaho Reporting Engine)
The way all these subjects are explained is progressive and gradual. The use of targeted examples makes the reading very pleasant and easy.
I recommend this book to you.
Of particular interest for advanced users will be the last three chapters, which discuss how to integrate PDI with the rest of the Pentaho Business Intelligence suite of tools, how to reuse transformations and jobs, and how to collect metadata on the processes being created in those transformations and jobs. If building the recipes takes up too much of the reader's time, the full set of code is available on the publisher's website, as well as the sample database sets on which the recipes are built.
In short, PDI Cookbook is another great reference book for Pentaho Data Integration, filling a gap that was not covered by the Beginner's Guide or Pentaho Kettle Solutions.
Disclaimer: In the interest of full disclosure, Packt Publishing asked me to write this review and offered a copy of one of their other published works for my trouble. This has in no way changed my opinion of PDI Cookbook.