Go to content

Distributed Real Time Stream Processing: Why and How by Petr Zapletal

This talk was recorded at Scala Days New York, 2016. Follow along on Twitter @scaladays and on the website for more information http://scaladays.org/. In this talk, we are going to discuss various state of the art open-source distributed streaming frameworks, their similarities and differences, implementation trade-offs, their intended use-cases, and how to choose between them. I’m going to focus on the popular frameworks including Spark Streaming, Storm, Samza and Flink. In addition, I'll cover theoretical introduction, common pitfalls, popular architectures and many more. The demand for stream processing is increasing. Immense amounts of data has to be processed fast from a rapidly growing set of disparate data sources. This pushes the limits of traditional data processing infrastructures. These stream-based applications include trading, social networks, Internet of things or system monitoring, are becoming more and more important. A number of powerful, easy-to-use open source platforms have emerged to address this. My goal is to provide comprehensive overview about modern streaming solutions and to help fellow developers with picking the best possible decision for their particular use-case. This talk should be interesting for anyone who is thinking about, implementing, or have already deployed streaming solution.

May 9, 2016