Data Crunching: Solve Everyday Problems Using Java, Python, and more. Paperback – April 20, 2005
About the Author
Greg Wilson holds a Ph.D. in Computer Science from the University of Edinburgh and has worked on high-performance scientific computing, data visualization, and computer security. He is the author of Practical Parallel Programming (MIT Press, 1995), a contributing editor at Dr. Dobb's Journal, and an adjunct professor in Computer Science at the University of Toronto.
Top customer reviews
If you've read any of the O'Reilly cookbook series, you will know what to expect, although the chapters are more cohesive and less episodic. Beginning programmers will get the most out of this book, although intermediate programmers should find at least some material here that's new to them.
The XML chapter is a pretty good introduction to the use and the advantages and disadvantages of SAX and DOM. XSLT is also described, although that discussion is less clear. Those without experience with databases will welcome the chapter on SQL. The discussion of plain text files in chapter 1 was a highlight for me, as the subject is not often covered in much depth in cookbooks. If, like me, you still regularly need to convert between various plain text formats, this chapter will help formalise approaches that you may already be carrying out in a less than rigorous fashion.
Additionally, the paragraphs on floating point arithmetic were intriguing but all too brief. The chapter on dealing with binary is fairly good, although rather dry. Peter Seibel's discussion of binary data in the context of writing a Shoutcast server in Practical Common Lisp shows that the subject can be dealt with in a more compelling fashion. That said, for the most part, author Greg Wilson is a genial companion; the writing style is chatty, but doesn't overdo it.
Overall, if you own any cookbook-style books, there is little here that you don't already know. Even for a beginner, it's hard to see how anyone who decides they need this book hasn't already been exposed to some of the material here. In particular, does anyone really need yet another introduction to regular expressions? The treatment here isn't bad; it's just that this material is already covered in many introductory programming books (especially those that cover scripting languages like Perl and Python). As this takes up nearly 20% of the book, and there are fewer than 200 pages, it's a bit of a waste. Personally, I would have preferred more discussion of the less well-treated subjects, some of which are too sparsely described, but this would have detracted from the book's main aim.
This would be suitable for a beginner Pythonista, who for some reason didn't want the bulk of the likes of Python Cookbook. Otherwise, if you feel that some Pragmatic Programmers books can be rather lightweight and somewhat overpriced, this will not change your mind.
The core of programming comes down to data manipulation. This may be parsing XML, reformatting text data, searching a database, or any number of other tasks. Typically, figuring out how to do each of these would require digesting several books just to get to the nuts and bolts of simple operations. "Data Crunching" fills this hole by concisely presenting the minimum amount of information required to get the job done: just what you need to know to get rolling, without all the fluff.
There are chapters on manipulating text files, XML documents, binary data, and relational databases. Included is a nice chapter on regular expressions, as well as a chapter on various "glue" topics relevant to solving data manipulation problems. Each chapter examines the tools and methods used to successfully manipulate the format of data being discussed. The examples used, and the book is chock full of them, are practical and relevant to the problems most often faced by developers. The examples are clearly illustrated and easy to follow.
Wilson does a fine job of presenting things in the "pragmatic" style that readers familiar with other books in the series have come to know. Each chapter stands well on its own, so the book may be used as a reference, although it's concise and a pleasant enough read that it's also worth reading through once. Great for the new developer who hasn't yet gotten his feet wet with data manipulation, yet also a nice reference for those who have been around the block a bit more, "Data Crunching" makes a fine addition to the Pragmatic series and is definitely worth having on the bookshelf.
Most recent customer reviews
It's also nice to see the Java equivalent app/code for the Python solution.
The book opens with a statement of purpose: transmuting data from one form into another.