Data Analysis with Open Source Tools Paperback – November 25, 2010

ISBN-13: 978-0596802356 ISBN-10: 0596802358 Edition: 1st


Product Details

  • Paperback: 540 pages
  • Publisher: O'Reilly Media; 1 edition (November 25, 2010)
  • Language: English
  • ISBN-10: 0596802358
  • ISBN-13: 978-0596802356
  • Product Dimensions: 9.2 x 7 x 1.4 inches
  • Shipping Weight: 2.2 pounds
  • Average Customer Review: 4.2 out of 5 stars (38 customer reviews)
  • Amazon Best Sellers Rank: #379,869 in Books

Editorial Reviews

Book Description

A hands-on guide for programmers and data scientists

About the Author

After previous careers in physics and software development, Philipp K. Janert currently provides consulting services for data analysis, algorithm development, and mathematical modeling. He has worked for small start-ups and in large corporate environments, both in the U.S. and overseas. He prefers simple solutions that work to complicated ones that don't, and thinks that purpose is more important than process. Philipp is the author of "Gnuplot in Action - Understanding Data with Graphs" (Manning Publications), and has written for the O'Reilly Network, IBM developerWorks, and IEEE Software. He is named inventor on a handful of patents, and is an occasional contributor to CPAN. He holds a Ph.D. in theoretical physics from the University of Washington. Visit his company website at www.principal-value.com.



Customer Reviews

Data Analysis with Open Source Tools is an excellent book for experienced analysts of data.
John Brady
Overall the book is very well balanced between theory and practice and after the first few chapters I didn't have a single boring moment while reading the book.
David Karapetyan
As with many good books, you get the sense the author is a co-worker, trying to explain something to you in terms you can understand.
Jim McGaw

Most Helpful Customer Reviews

202 of 221 people found the following review helpful By J. Felipe Ortega Soto on February 7, 2011
Format: Paperback
This book is aimed at offering a practical, hands-on introduction to data analysis for pragmatic readers without strong scientific or statistical background. Some basic programming experience is required. The author provides many personal (and sometimes useful) comments about different tools and procedures in data analysis.

However, a careful reading reveals many problems, especially an obscure presentation of key concepts. In my opinion, the target audience for this book would be people without previous contact with data analysis. Hence the importance of presenting its core elements correctly. Otherwise, the book is useless for them.

In particular:

- Few pages are actually dedicated to presenting the open source tools that support the different graphs and techniques included in the book. From the title, I expected a more complete tour through available open source tools for data analysis.

- No clues about how to obtain most of the graphs and results presented in the book. No related data sets are available for download, either. A book like this is useless if we cannot learn how to replicate all the examples.

- The formula of the variance for a sample is just wrong. One must divide by n-1 and not n; see "Applied Statistics and Probability for Engineers" (Montgomery and Runger 2006).

- The author presents one of the most obscure explanations for the median I've ever come across. Resorting to an RFC (RFC 2330) to explain such a simple concept is really awkward.

- In chapter 3 and Appendix B, natural logarithms (base e) are presented in the text, while graphs plot powers of 10. Definitely, not the right way to transmit correct concepts and methods.
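
For readers weighing the variance, median, and logarithm points above, here is a minimal Python sketch. It is not taken from the book or from the review, and the sample values are made up for illustration. It computes the variance with both the n and n-1 divisors, takes the median, and shows that natural and base-10 logarithms differ only by a constant factor. Dividing by n-1 gives the usual unbiased sample estimate, while dividing by n gives the population form, so which one applies depends on what is being estimated.

```python
import math
import statistics

# Made-up sample for illustration only (not data from the book).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n
# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in data)

var_n = ss / n                 # population form: divide by n
var_n_minus_1 = ss / (n - 1)   # sample form: divide by n-1

# The standard library exposes both conventions under different names.
lib_population = statistics.pvariance(data)  # divides by n
lib_sample = statistics.variance(data)       # divides by n-1

# Median: the middle value of the sorted data (mean of the two
# middle values when the count is even).
median = statistics.median(data)

# Natural and base-10 logarithms differ by the constant factor ln(10).
x = 1000.0
ln_x = math.log(x)
log10_x = math.log10(x)

print("variance (n):  ", var_n)
print("variance (n-1):", var_n_minus_1)
print("median:        ", median)
print("ln(x) vs log10(x) * ln(10):", ln_x, log10_x * math.log(10))
```

For large n the two variance values converge, so the choice of divisor matters mostly for small samples.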
39 of 39 people found the following review helpful By Code Monkey on April 16, 2011
Format: Paperback
This book covers such a wide range of topics that it necessarily skims over all of them, but it always hits all the major points that an introductory survey should. Each chapter has a straightforward tone, strikes the right balance between developing mathematical rigor and developing an intuitive understanding of data, and undeniably passes on the lessons of hard-earned, real-world experience. But a reader who is actually working on a real data problem will almost certainly come to the realization that the understanding gained is somewhat superficial - that it's going to take a lot more heavy reading (probably of books, papers, and software tools recommended in this book) to get any real work done!

The single biggest problem with this book is its misleading title. This book is not going to teach you how to use open source software to analyze data. There is only minimal information about how one would actually use the software tools being discussed. What you get is a brief commentary about what the author thinks each software package is good for. It's the same story as with the mathematical details: you will not find them here, but this book will give you an excellent idea of what to look for. So in the end it does leave you feeling just a little bit cheated, even though all the advice you got seems extremely well informed.

What this book does astonishingly well is communicate an attitude to data analysis that most textbooks (and nearly all the college courses I took) seem to miss. Nearly every chapter is a stream of stunningly insightful observations on how to approach data, without the mathematical detail that overwhelms most practicing programmers.
42 of 45 people found the following review helpful By Peter Alfheim on January 27, 2011
Format: Paperback
The book is very good for intermediate-to-advanced data analysts. Beginners beware: there are some important prerequisites that are not obvious before you buy it, and there are some organization problems.

First, the prerequisites. "I strongly recommend that you make it a habit to avoid all statistical language"..."Once we start talking about standard deviations, the clarity is gone." These are two sentences in the same passage from the Preface. The rest of that passage is similar. However, even the first chapters make heavy use of statistical language. Moreover, they assume that you already know statistics to the level of density estimation, noise, splines, and regression. Page 21 even features a footnote about the Fourier transform and the Fourier convolution theorem. Clearly this book is not for the statistically shy or for the mathematically shy in general, no matter what the Preface suggests. You also need to know Python and R.

Second, the chapter organization problems. There's a mismatch between the first part of each chapter, which introduces concepts and techniques, and the Workshop part of the same chapter, which uses software. I was expecting the Workshop to illustrate the implementation of the same concepts and techniques. It's not really so. The Workshop introduces Python and R facilities at a different (lower) speed than the rest of the chapter. One could even wonder why the Workshop is in the same chapter. I'd rather that each chapter consisted of a few detailed case studies that first introduce concepts and techniques and then illustrate them with software libraries.
