
Real-time Data Integration with Kafka & Cassandra (Ewen Cheslack-Postava, Confluent)

Slides: https://www.slideshare.net/DataStax/realtime-data-integration-with-kafka-and-cassandra-ewen-cheslackpostava-confluent-c-summit-2016

Apache Kafka is a high-throughput messaging system that companies like LinkedIn, Netflix, and Airbnb are adopting to handle massive real-time datasets. These datasets originate from dozens of systems: databases like Cassandra, log files, and application data. Companies often need to adopt just as many tools to integrate that data for processing.

This presentation introduces Kafka Connect, Kafka's new tool for scalable, fault-tolerant data import and export. We'll discuss existing tools in the space and how they fall short when applied to real-time data integration at scale. Then we'll explore Kafka Connect's design and how it compares to systems with similar goals, including key design decisions and tradeoffs. Finally, we'll discuss the current support for Cassandra connectors and how they can be combined with other connectors and stream processing frameworks to help you get more out of your data.

About the Speaker

Ewen Cheslack-Postava, Engineer, Confluent, Inc.

Ewen Cheslack-Postava is a Kafka committer and an engineer at Confluent, where he is building a stream data platform based on Apache Kafka to help organizations reliably and robustly capture and leverage all their real-time data.

July 26, 2016