Dive into Spark Streaming by Gerard Maas
Apache Spark is a distributed computing framework that enables scalable, high-throughput, and fault-tolerant data processing. Spark Streaming brings the power of Spark to processing streams of data in near real-time. After a quick recon of the surface, this talk dives into the functional and operational aspects of Spark Streaming. Through several examples, we will explore the Spark Streaming API, discuss some of the challenges of processing streaming data in real time, and build a clear understanding of the internal processes that are key to successfully deploying a Spark Streaming application in production.
November 9, 2015