Running 400-node Cassandra + Spark Clusters in Azure (Anubhav Kale, Microsoft)
Slides: https://www.slideshare.net/DataStax/running-400node-cassandra-spark-clusters-in-azure-anubhav-kale-microsoft-c-summit-2016

We run multiple DataStax Enterprise clusters in Azure, each holding 300 TB+ of data, to deeply understand Office 365 users. In this talk, we will take a deep dive into some of the key challenges we faced, and the takeaways we drew, in running these clusters reliably over a year. To name a few: process crashes, ephemeral SSDs contributing to data loss, slow streaming between nodes, mutation drops, compaction strategy choices, schema updates when nodes are down, and backup/restore. We will briefly talk about our contributions back to Cassandra and our path forward using network-attached disks offered via Azure Premium Storage.

About the Speaker
Anubhav Kale, Sr. Software Engineer, Microsoft

Anubhav is a senior software engineer at Microsoft. His team is responsible for building a big data platform using Cassandra, Spark, and Azure to generate per-user insights about Office 365 users.