The Fourth Paradigm: Data-Intensive Scientific Discovery 1st Edition
Product details
- Publisher : Microsoft Research; 1st edition (October 16, 2009)
- Language : English
- Paperback : 284 pages
- ISBN-10 : 0982544200
- ISBN-13 : 978-0982544204
- Item Weight : 1.8 pounds
- Dimensions : 7 x 0.67 x 10 inches
- Best Sellers Rank: #1,867,151 in Books
  - #833 in Scientific Research
  - #6,343 in Computer Science (Books)
Customer reviews
The Fourth Paradigm is a collection of papers and talks on research areas that aim to improve the research cycle. The talks are a memorial to Microsoft Tech Fellow Jim Gray. Gray had the insight that science has gone through four paradigms so far. The first paradigm, dating back a few thousand years, was empirical science, which describes natural phenomena. Over the last few hundred years, the second paradigm of theoretical science, using models and generalizations, emerged. Within the last 50 to 70 years, the third paradigm of computational science has developed to simulate complex phenomena. Finally, the fourth paradigm (also known as eScience) has developed to unify theory, experiment, and simulation. Jim Gray says: "... it is worth distinguishing data-intensive science from computational science as a new, fourth paradigm for scientific exploration."
The book itself is divided into four major sections: Earth and Environment, Health and Wellbeing, Scientific Infrastructure, and Scholarly Communications, with 6 to 8 papers per section. The emphasis here is on science; however, I'd assert that all these areas directly impact engineering as well. For example, the flight test of a new product involves an enormous amount of data, which in turn produces a great deal of analysis, knowledge, and understanding. The principal idea of eScience (and eEngineering) is that the data and analysis interoperate with each other, so that information is at everyone's fingertips, everywhere. The payoff is a large increase in information velocity and productivity. In the end, an analysis or report will be an overlay on the data. I have seen this start to happen, and agree with Jim Gray that our current tools are very primitive - a lot of new tools are going to be required.
A paper that I found particularly interesting was "Discovering the Wiring of the Brain." Their summary is: "Decoding the complete connectome of the human brain is one of the great challenges of the 21st century." I agree - and discovering the scientific and engineering applications that will emerge will be even more of a challenge. This is an area that requires an entirely new way to handle all the data - consider that a snapshot of image data for 1 cubic mm of human brain contains a petabyte of data, and that a human brain contains about one million cubic mm.
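To put those two figures together, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above (one petabyte per cubic mm, about one million cubic mm per brain); the decimal (SI) definitions of petabyte and zettabyte and the helper name `whole_brain_bytes` are assumptions made for illustration.

```python
# Back-of-the-envelope estimate using the two figures quoted in the review.
# Assumption: decimal (SI) units, i.e. 1 PB = 10**15 bytes, 1 ZB = 10**21 bytes.
BYTES_PER_PETABYTE = 10**15
BYTES_PER_ZETTABYTE = 10**21

petabytes_per_cubic_mm = 1          # image data per cubic mm, as quoted
brain_volume_cubic_mm = 1_000_000   # approximate brain volume, as quoted


def whole_brain_bytes(pb_per_mm3: int, volume_mm3: int) -> int:
    """Hypothetical helper: total image data in bytes for the whole volume."""
    return pb_per_mm3 * volume_mm3 * BYTES_PER_PETABYTE


total_bytes = whole_brain_bytes(petabytes_per_cubic_mm, brain_volume_cubic_mm)
total_zettabytes = total_bytes / BYTES_PER_ZETTABYTE

print(f"Total: {total_bytes:.0e} bytes, or {total_zettabytes:g} zettabyte(s)")
```

Under those assumptions, a million petabytes works out to roughly one zettabyte of image data for a single brain, which is why the reviewer's point about needing new ways to handle the data seems fair.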
This fascinating book is available for free download at the Microsoft Research website at [...]/
I think any working scientist or engineer will find much to learn and think about in this collection of papers on the emerging Fourth Paradigm and the world of eScience (and eEngineering).
The book presents the issues in a comprehensive and interesting fashion.
However, all of the papers were top-down overviews. I wanted to dig into some case studies. For example, Microsoft has a working project: World Wide Telescope. How many data sources do they use? How do they blend data from conflicting sources? How do they curate the data? How much telescope gear (and how much computer hardware and software) would I need to contribute? None of the essays went into details on these projects.
Several papers did make some useful, interesting points.
Much of scientific research today is a cottage industry: one group puts together some instruments, gathers data, analyzes it, and publishes a paper. A revolution akin to the industrial revolution will happen: specialized groups will operate instruments and publish data; other groups will analyze the data.
The data repositories of the future must accommodate large numbers of disparate groups gathering data -- and the scientific community must reward them. Data organization, provision of metadata, and provenance are all big unsolved questions. (I'd have liked more detailed information here, too.)
Some scientific instruments collect data so fast that the bottleneck is no longer data acquisition but data interpretation. Similarly, data repositories are so large that making copies of the dataset is expensive -- it will actually be cheaper for data repositories to offer services where researchers run custom programs against the data.
This high-level overview is grand, but it's hard to test. Surely these pronouncements are based on experience in actual scientific projects. I wanted to read more at this lower level.

