eBook features:
  • Highlight, take notes, and search in the book
  • In this edition, page numbers are just like the physical edition

Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing 1st Edition, Kindle Edition

4.2 out of 5 stars, 125 ratings

Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.

Expanded from Tyler Akidau’s popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You’ll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.

You’ll explore:

  • How streaming and batch data processing patterns compare
  • The core principles and concepts behind robust out-of-order data processing
  • How watermarks track progress and completeness in infinite datasets
  • How exactly-once data processing techniques ensure correctness
  • How the concepts of streams and tables form the foundations of both batch and streaming data processing
  • The practical motivations behind a powerful persistent state mechanism, driven by a real-world example
  • How time-varying relations provide a link between stream processing and the world of SQL and relational algebra
Due to its large file size, this book may take longer to download

Editorial Reviews

About the Author

Reuven Lax is a senior staff software engineer at Google Seattle, and has spent the past nine years helping to shape Google's data processing and analysis strategy. For much of that time he has focused on Google's low-latency, streaming data processing efforts, first as a long-time member and lead of the MillWheel team, and more recently founding and leading the team responsible for Windmill, the next-generation stream processing engine powering Google Cloud Dataflow. He's very excited to bring Google's data-processing experience to the world at large, and proud to have been a part of publishing both the MillWheel paper in 2013 and the Dataflow Model paper in 2015. When not at work, Reuven enjoys swing dancing, rock climbing, and exploring new parts of the world.

Slava Chernyak is a senior software engineer at Google Seattle. Slava spent over five years working on Google’s internal massive-scale streaming data processing systems and has since become involved with designing and building Windmill, Google Cloud Dataflow's next-generation streaming backend, from the ground up. Slava is passionate about making massive-scale stream processing available and useful to a broader audience. When he is not working on streaming systems, Slava is out enjoying the natural beauty of the Pacific Northwest.

Tyler Akidau is a senior staff software engineer at Google, where he is the technical lead for the Data Processing Languages & Systems group, responsible for Google's Apache Beam efforts, Google Cloud Dataflow, and internal data processing tools like Google Flume, MapReduce, and MillWheel. He is also a founding member of the Apache Beam PMC. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. He is the author of the 2015 Dataflow Model paper and the Streaming 101 and Streaming 102 articles on the O’Reilly website. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.

Product details

  • ASIN ‏ : ‎ B07FMDY5CC
  • Publisher ‏ : ‎ O'Reilly Media; 1st edition (July 16, 2018)
  • Publication date ‏ : ‎ July 16, 2018
  • Language ‏ : ‎ English
  • File size ‏ : ‎ 14074 KB
  • Simultaneous device usage ‏ : ‎ Unlimited
  • Text-to-Speech ‏ : ‎ Enabled
  • Enhanced typesetting ‏ : ‎ Enabled
  • X-Ray ‏ : ‎ Not Enabled
  • Word Wise ‏ : ‎ Not Enabled
  • Print length ‏ : ‎ 352 pages
  • Page numbers source ISBN ‏ : ‎ 1491983876
  • Customer Reviews: 4.2 out of 5 stars, 125 ratings

Customer reviews

4.2 out of 5 stars
125 global ratings

Customers say

Customers find the book's content informative and educational on engineering fundamentals of streaming systems. However, some readers feel some topics are poorly explained or need better background, and it can be difficult to distinguish genuine complexity from poor editing.

AI-generated from the text of customer reviews

4 customers mention "Content": 4 positive, 0 negative

Customers like the content. However, some reviewers mention that the figures are poorly printed and the formatting is poor.

"The content in this book is good, but the formatting makes it very painful to read...."

"The content seems good but it was printed so poorly the figures are unreadable."

"Good product"

"Great read about the glory of the majestic brown trout..."

8 customers mention "Content quality": 3 positive, 5 negative

Customers have mixed reviews about the content quality. Some find it informative and educational on engineering fundamentals of streaming systems, with good concepts that expand prior knowledge. However, others feel some topics are poorly explained or need better background. They report that the labels and diagrams make it difficult at times to distinguish genuine complexity from sloppy editing, and that the charts are really confusing.

"...The charts are really confusing...."

"...with lambda, unfamiliar to streaming, this book has some very good concepts that expanded the prior knowledge of batch programming!..."

"...nuts and bolts of designing streaming systems, the topics are presented in such a confusing way that I've had to go back and reread pages over and..."

"Very educational on the engineering fundamentals of streaming systems."

Top reviews from the United States

  • 5.0 out of 5 stars The Best Book About Streaming System, and it's more than that
    Reviewed in the United States on August 21, 2018
    Just could not put down the book until I finished it...

    I've built Big Data analysis systems for years, using Hadoop MapReduce on HBase, at the same time, we build Streaming System using Apache Storm(something following the Lambda Architecture pattern), and further extend to Trident to ensure exactly-once data processing. Later we moved to Spark so that we can use one implementation to do both batch and streaming... I'm so interested in streaming systems that I've also learned Flink and Akka Streaming in my own private time, and I feel I'm an expert for streaming systems or big data analysis systems, until I read this Book... I really know nothing...
    This Book lead you a way to jump out of the details, it shall build a framework in higher concept level in your mind, for you to understand and reasoning for streaming system, SQL system, and batch processing systems; and then you will find the essential same/difference between batch/Stream/table/SQL...etc, and understand those things which seem to locate in totally different dimensions, by only one simple mental framework; and then, you will find you can think in a higher level dimension space and attack your problem with a dimension reduction weapon, and it's really cool.

    It will change your mind on "what is big data processing".
    28 people found this helpful
  • Reviewed in the United States on November 11, 2020
    What did you like or dislike?
    Familiar with lambda, unfamiliar to streaming, this book has some very good concepts that expanded the prior knowledge of batch programming!

    What did you use this product for?
    Learning/Expanding purposes
  • Reviewed in the United States on January 11, 2021
    I almost finished part I of the book and I'm quite disappointed.
    After every chapter I thought "well, I didn't really enjoy this one, but next one surely will be better?" and it just never happened.

    The book is written by 3 authors, 1 of which uses the prose language as if he is writing a novel. I really dislike this writing style, I just want to understand the streaming systems better. Please use your eloquent language for other purposes, but not for tech writing.

    Overly simple concepts are explained in an overly complex language. I don't want to copy book contents here, but I will show the level of examples given in the book. If mobile phones generate events with numbers 1, 2, 3, 4 then they can arrive to our pipeline in any order, and - CAN YOU BELIEVE THAT?? - they always sum to 10.
    Really? That's just obvious and you don't have to make up complex terminology to explain this simple thing. Please rather explain to us how to deal with data that can not arrive from the phone, or how to estimate lateness of that data.

    The charts are really confusing. The X-axis for some reason is event-time (in my opinion, it would have been much easier to understand if processing time is put as an X-axis instead; author even mentions that there was similar feedback from other people but they just don't feel it's worth updating that!!!) - and to make things more complicated X-axis and Y-axis start at different times. So what should be a 0-crossing diagonal becomes a diagonal shifted on X-axis. I just lose my mental processing power adjusting the charts to the way they should have been printed to start with - and at that point I have very little capacity to reason about what is actually going on in the provided example.

    I feel I learned very little new things from this book, if any at all.
    I have prior experience working with streaming systems - and I guess that made me understand something written in this book, but otherwise I doubt I would be able to understand much.

    If part II enlightens me I will go back and change this review, but so far it's just a disappointment and waste of time.
    8 people found this helpful
  • Reviewed in the United States on October 23, 2019
    The content in this book is good, but the formatting makes it very painful to read. There are many charts that are printed tiny with dotted lines and the reader is expected to type in a URL to view them online. Why not just print them bigger instead of packing 6 to 8 of them on one page? My other gripe is the use of yellow font color on white background. It's extremely difficult to read, as is the orange text with yellow highlight background color. I have mild color deficiency which doesn't help either, but come on guys, there are plenty of darker colors than yellow that could have been used.
    3 people found this helpful
  • Reviewed in the United States on July 21, 2019
    On top of everything you learn about streaming data systems it was so much fun getting into how modern systems evolved and where some of the ideas came from
    One person found this helpful
  • Reviewed in the United States on April 1, 2021
    Very educational on the engineering fundamentals of streaming systems.
  • Reviewed in the United States on February 7, 2021
  • Reviewed in the United States on May 2, 2023
    The author likes to use phrases and sentences which are difficult to read and comprehend to explain simple concepts. Streaming systems is not some advanced algorithms which takes efforts to understand.
    On the other hand, it's pretty common nowadays that people tend to exaggerate the complexity of their work in the software engineering domain. Most of times I finish reading some book or articles with a "Meh" rather than "Woah".
    One person found this helpful

Top reviews from other countries

  • gschnyder
    2.0 out of 5 stars Excellent content. The printing is a disaster.
    Reviewed in Spain on October 26, 2022
    The print quality of the book is deplorable.

    That said, the content is first-rate.
  • The Seeker
    5.0 out of 5 stars Buy this book
    Reviewed in Canada on July 24, 2019
    Having read Tyler's famous Streaming 101/102 blog posts and having watched his presentations on youtube, I did not think I would get much out of this book. I was wrong. Tyler is the Edgar F. Codd of streaming systems.
  • Shetti
    1.0 out of 5 stars Wrong Book: DO NOT BUY, Return is impossible
    Reviewed in India on December 23, 2019
    DO NOT BUY. This is not a review of the book but the print that CB-India is selling. It is misprinted. Most of the book is typescript and not streaming systems. I have been trying to return the book but Amazon has been canceling the return. So very corrupt!
    One person found this helpful
  • Raven
    2.0 out of 5 stars The print quality leaves much to be desired
    Reviewed in Spain on August 31, 2020
    The book arrived today, and I can say that the edition is of very low quality for its price. For 50 euros I expect laminated pages and sharp figures. As can be seen in the images, the figures are blurry. It looks like a PDF printed at very low resolution. If the book were priced at 20 euros, I would understand.
  • Himanshu Sachdeva
    1.0 out of 5 stars Printing issue: Two half books merged into one
    Reviewed in India on November 16, 2019
    From Page 33 onwards, the pages are from a different book. They actually talk about javascript and typescript.
    Even it's different from index. Need to return the book.
    One person found this helpful
