| Digital List Price: | $67.99 |
| Kindle Price: | $42.99 Save $25.00 (37%) |
| Sold by: | Amazon.com Services LLC |
Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing 1st Edition, Kindle Edition
Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.
Expanded from Tyler Akidau’s popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You’ll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.
You’ll explore:
- How streaming and batch data processing patterns compare
- The core principles and concepts behind robust out-of-order data processing
- How watermarks track progress and completeness in infinite datasets
- How exactly-once data processing techniques ensure correctness
- How the concepts of streams and tables form the foundations of both batch and streaming data processing
- The practical motivations behind a powerful persistent state mechanism, driven by a real-world example
- How time-varying relations provide a link between stream processing and the world of SQL and relational algebra
- ISBN-13: 978-1491983874
- Edition: 1st
- Publisher: O'Reilly Media
- Publication date: July 16, 2018
- Language: English
- File size: 14074 KB
Kindle E-Readers
- Kindle Paperwhite
- Kindle Paperwhite (5th Generation)
- Kindle Touch
- Kindle Voyage
- Kindle
- Kindle Oasis
- All new Kindle paperwhite
- All New Kindle E-reader
- Kindle Oasis (9th Generation)
- Kindle Paperwhite (10th Generation)
- Kindle Paperwhite (11th Generation)
- All New Kindle E-reader (11th Generation)
- Kindle Scribe (1st Generation)
- Kindle (10th Generation)
- Kindle Oasis (10th Generation)
From the brand
Sharing the knowledge of experts
O'Reilly's mission is to change the world by sharing the knowledge of innovators. For over 40 years, we've inspired companies and individuals to do new things (and do them better) by providing the skills and understanding that are necessary for success.
Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.
Editorial Reviews
About the Author
Slava Chernyak is a senior software engineer at Google Seattle. Slava spent over five years working on Google’s internal massive-scale streaming data processing systems and has since become involved with designing and building Windmill, Google Cloud Dataflow's next-generation streaming backend, from the ground up. Slava is passionate about making massive-scale stream processing available and useful to a broader audience. When he is not working on streaming systems, Slava is out enjoying the natural beauty of the Pacific Northwest.
Tyler Akidau is a senior staff software engineer at Google, where he is the technical lead for the Data Processing Languages & Systems group, responsible for Google's Apache Beam efforts, Google Cloud Dataflow, and internal data processing tools like Google Flume, MapReduce, and MillWheel. He is also a founding member of the Apache Beam PMC. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. He is the author of the 2015 Dataflow Model paper and the Streaming 101 and Streaming 102 articles on the O’Reilly website. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.
Product details
- ASIN : B07FMDY5CC
- Publisher : O'Reilly Media; 1st edition (July 16, 2018)
- Publication date : July 16, 2018
- Language : English
- File size : 14074 KB
- Simultaneous device usage : Unlimited
- Text-to-Speech : Enabled
- Enhanced typesetting : Enabled
- X-Ray : Not Enabled
- Word Wise : Not Enabled
- Print length : 352 pages
- Page numbers source ISBN : 1491983876
- Best Sellers Rank: #313,815 in Kindle Store
- #29 in Distributed Systems & Computing
- #138 in Business Software
- #148 in Software Development (Kindle Store)
Customer reviews
Customers say
Customers find the book's content informative and educational on engineering fundamentals of streaming systems. However, some readers feel some topics are poorly explained or need better background, and it can be difficult to distinguish genuine complexity from poor editing.
AI-generated from the text of customer reviews
Customers like the content. However, some reviewers mention that the figures are poorly printed and the formatting is poor.
"The content in this book is good, but the formatting makes it very painful to read...."
"The content seems good but it was printed so poorly the figures are unreadable."
"Good product"
"Great read about the glory of the majestic brown trout..."
Customers have mixed opinions about the content quality. Some find it informative and educational on the engineering fundamentals of streaming systems, with good concepts that expand prior knowledge. Others feel some topics are poorly explained or need better background. The labels and diagrams make it difficult at times to distinguish genuine complexity from sloppy editing, and the charts are really confusing.
"...The charts are really confusing...."
"...with lambda, unfamiliar to streaming, this book has some very good concepts that expanded the prior knowledge of batch programming!..."
"...nuts and bolts of designing streaming systems, the topics are presented in such a confusing way that I've had to go back and reread pages over and..."
"Very educational on the engineering fundamentals of streaming systems."
Top reviews from the United States
- 5.0 out of 5 stars The Best Book About Streaming System, and it's more than that
Reviewed in the United States on August 21, 2018
Just could not put down the book until I finished it...
I've built Big Data analysis systems for years, using Hadoop MapReduce on HBase, at the same time, we build Streaming System using Apache Storm (something following the Lambda Architecture pattern), and further extend to Trident to ensure exactly-once data processing. Later we moved to Spark so that we can use one implementation to do both batch and streaming... I'm so interested in streaming systems that I've also learned Flink and Akka Streaming in my own private time, and I feel I'm an expert for streaming systems or big data analysis systems, until I read this Book... I really know nothing...
This Book lead you a way to jump out of the details, it shall build a framework in higher concept level in your mind, for you to understand and reasoning for streaming system, SQL system, and batch processing systems; and then you will find the essential same/difference between batch/Stream/table/SQL...etc, and understand those things which seem to locate in totally different dimensions, by only one simple mental framework; and then, you will find you can think in a higher level dimension space and attack your problem with a dimension reduction weapon, and it's really cool.
It will change your mind on "what is big data processing".
- Reviewed in the United States on November 11, 2020
What did you like or dislike?
Familiar with lambda, unfamiliar to streaming, this book has some very good concepts that expanded the prior knowledge of batch programming!
What did you use this product for?
Learning/Expanding purposes
- Reviewed in the United States on January 11, 2021
I almost finished part I of the book and I'm quite disappointed.
After every chapter I thought "well, I didn't really enjoy this one, but next one surely will be better?" and it just never happened.
The book is written by 3 authors, 1 of which uses the prose language as if he is writing a novel. I really dislike this writing style, I just want to understand the streaming systems better. Please use your eloquent language for other purposes, but not for tech writing.
Overly simple concepts are explained in an overly complex language. I don't want to copy book contents here, but I will show the level of examples given in the book. If mobile phones generate events with numbers 1, 2, 3, 4 then they can arrive to our pipeline in any order, and - CAN YOU BELIEVE THAT?? - they always sum to 10.
Really? That's just obvious and you don't have to make up complex terminology to explain this simple thing. Please rather explain to us how to deal with data that can not arrive from the phone, or how to estimate lateness of that data.
The charts are really confusing. The X-axis for some reason is event-time (in my opinion, it would have been much easier to understand if processing time were put on the X-axis instead; the author even mentions that there was similar feedback from other people but they just don't feel it's worth updating that!!!) - and to make things more complicated, the X-axis and Y-axis start at different times. So what should be a 0-crossing diagonal becomes a diagonal shifted on the X-axis. I just lose my mental processing power adjusting the charts to the way they should have been printed to start with - and at that point I have very little capacity to reason about what is actually going on in the provided example.
I feel I learned very little new things from this book, if any at all.
I have prior experience working with streaming systems - and I guess that made me understand something written in this book, but otherwise I doubt I would be able to understand much.
If part II enlightens me I will go back and change this review, but so far it's just a disappointment and waste of time.
- Reviewed in the United States on October 23, 2019
The content in this book is good, but the formatting makes it very painful to read. There are many charts that are printed tiny with dotted lines and the reader is expected to type in a URL to view them online. Why not just print them bigger instead of packing 6 to 8 of them on one page? My other gripe is the use of yellow font color on white background. It's extremely difficult to read, as is the orange text with yellow highlight background color. I have mild color deficiency which doesn't help either, but come on guys, there are plenty of darker colors than yellow that could have been used.
- Reviewed in the United States on July 21, 2019
On top of everything you learn about streaming data systems it was so much fun getting into how modern systems evolved and where some of the ideas came from
- Reviewed in the United States on April 1, 2021
Very educational on the engineering fundamentals of streaming systems.
- Reviewed in the United States on February 7, 2021
A+
- Reviewed in the United States on May 2, 2023
The author likes to use phrases and sentences which are difficult to read and comprehend to explain simple concepts. Streaming systems are not some advanced algorithm that takes effort to understand.
On the other hand, it's pretty common nowadays that people tend to exaggerate the complexity of their work in the software engineering domain. Most of the time I finish reading a book or article with a "Meh" rather than a "Woah".
Top reviews from other countries
gschnyder
Reviewed in Spain on October 26, 2022
2.0 out of 5 stars Excellent content. The printing is a disaster.
The print quality of the book is deplorable.
That said, the content is first-rate.
The Seeker
Reviewed in Canada on July 24, 2019
5.0 out of 5 stars Buy this book
Having read Tyler's famous Streaming 101/102 blog posts and having watched his presentations on youtube, I did not think I would get much out of this book. I was wrong. Tyler is the Edgar F. Codd of streaming systems.
Shetti
Reviewed in India on December 23, 2019
1.0 out of 5 stars Wrong Book: DO NOT BUY, Return is impossible
DO NOT BUY. This is not a review of the book but the print that CB-India is selling. It is misprinted. Most of the book is typescript and not streaming systems. I have been trying to return the book but Amazon has been canceling the return. So very corrupt!
Raven
Reviewed in Spain on August 31, 2020
2.0 out of 5 stars The print quality leaves much to be desired
The book arrived today, and I can say that the edition is of very low quality for its price. For 50 euros I expect laminated pages and sharp charts. As the images show, the charts are blurry. It looks like a PDF printed at very low resolution. If the book were priced at 20 euros I would understand.
Himanshu Sachdeva
Reviewed in India on November 16, 2019
1.0 out of 5 stars Printing issue: Two half books merged into one
From Page 33 onwards, the pages are from a different book. They actually talk about javascript and typescript.
Even it's different from index. Need to return the book.





