Using OpenRefine Paperback – September 10, 2013
About the Author
Ruben Verborgh is a PhD researcher in semantic hypermedia, and is fascinated by the Web's immense possibilities. He tries to contribute ideas that will maybe someday slightly influence the way the Web changes all of us. His degree in Computer Science Engineering convinced him more than ever that communication is the most crucial thing for IT-based solutions. This is why he really enjoys explaining things to those eager to learn. In 2011, he launched the Free Your Metadata project together with Seth van Hooland and Max De Wilde, which aims to evangelize the importance of putting your data on the Web. This book is one of the assets in this continuing quest.
Ruben currently works at Multimedia Lab, a research group of iMinds, Ghent University, Belgium, in the domains of the Semantic Web, web APIs, and adaptive hypermedia. Together with Seth van Hooland, he's currently writing Linked Data for Libraries, Archives, and Museums, a practical guide for metadata practitioners.
Max De Wilde
Max De Wilde is a PhD researcher in natural language processing and a teaching assistant at the Université Libre de Bruxelles (ULB), in the department of Information and Communication Sciences. He holds a Master's in Linguistics from the ULB and an advanced Master's in Computational Linguistics from the University of Antwerp. Currently, he is preparing a doctoral thesis on the impact of language-independent information extraction on document retrieval. At the same time, he is working as a full-time assistant and supervises practical classes for Master's-level students in a number of topics, including database quality, document management, and the architecture of information systems.
Top Customer Reviews
While OpenRefine is an extremely useful "power tool for messy data", its power can be difficult to master without a great deal of trial and error on the part of the user. Part of this stems from the evolving nature of the tool. It began life as Freebase Gridworks, with the purpose of cleaning up data in order to run it against linked data in Freebase. When the Freebase parent organization was acquired by Google, they rebranded the tool as Google Refine, but as Google's priorities shifted, they stopped working on the tool and it became the open source OpenRefine. This legacy means that the tool has many pieces created by different people for different purposes. While there is quite a lot of good documentation out there on the OpenRefine site and elsewhere, this book puts it together in an easy-to-follow format. Like a lot of OpenRefine documentation, it is a series of "recipes" that each explain how to do one specific task, but it is written with the cover-to-cover reader in mind as well. The Google-produced tutorial videos have similar coverage, but the book is more in depth and, for readers coming from the cultural institution side, has the advantage of using a museum data set for examples. Another advantage is that the authors of the book have a particular interest in named entity recognition (part of the book covers the tool that one of them produced), which is particularly helpful for more abstract data sets with cultural data.
Using OpenRefine is useful for beginner or intermediate users of OpenRefine. As someone who has used OpenRefine for a while and written about its use in libraries, I found it more helpful than I initially expected, since there were pieces of functionality I had not yet encountered in experimentation or documentation. My one criticism is that much of the book promises a complete explanation, in the appendix, of regular expressions and the Google Refine Expression Language (GREL) that powers the software, but I found that the GREL documentation was less useful than I had hoped, though I still learned from it. I would have preferred that section to come earlier in the book. That aside, I would recommend this book to anyone who has been using OpenRefine or is thinking about using it, and additionally for library and museum professional development collections.
This book assumes no prior knowledge of OpenRefine, but even as an advanced user I learned a few tricks I hadn't previously discovered. OpenRefine itself is an essential tool for anyone who works with large amounts of data, and anyone who needs to learn or teach OpenRefine will find this book to be a valuable addition to their library.
For non-technical users and those not used to working with data, the book helps with the steep learning curve that one can face with OpenRefine. For users who already work with data daily, it is a good chance to introduce OpenRefine into their data processing pipeline.
The book and its recipes cover all the basic (and not so basic) topics of OpenRefine, so it gives the user a good knowledge of what can be accomplished with the software. After covering the essential topics, it also offers a detailed introduction to regular expressions and GREL, which exponentially improves the user's ability to work with data.
A good complement to the OpenRefine resources that already exist on the net.