Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.
Expanded from Tyler Akidau's popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You'll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.
You'll explore:
How streaming and batch data processing patterns compare
The core principles and concepts behind robust out-of-order data processing
How watermarks track progress and completeness in infinite datasets
How exactly-once data processing techniques ensure correctness
How the concepts of streams and tables form the foundations of both batch and streaming data processing
The practical motivations behind a powerful persistent state mechanism, driven by a real-world example
How time-varying relations provide a link between stream processing and the world of SQL and relational algebra
So you wrote a book about stream processing - Yes! - And your first thought was to write 14,000 lines of LaTeX code to generate ANIMATIONS and brag about it in the introduction?! - Yes! - You wrote a book, right? - Yes! - You understand that books are pages you read? - Yes! - So your focus was animations?! - Yes!
Oh boy.
Add tons of code examples that add nothing of value, because they just call some undisclosed methods and merely repeat what the paragraph right next to them already says.
The lecturing is also amazingly bad. Example: the chapter "Streams and Tables" starts with "You have reached the part of the book where we talk about streams and tables". Well, that is what the chapter is called, so I would expect it to do so. Or the many times the authors pat themselves on the back with "welcome back to me, the last chapter was amazing, right? Because the other author of this book is sooo great". Yuck.