TimeWindowCompactionStrategy for Time Series Workloads (Jeff Jirsa, Crowdstrike)

Cassandra is a great fit for high-write use cases, which makes it a popular choice for storing time-series and sensor-collection workloads. At Crowdstrike, we've been using Cassandra for just that purpose, collecting petabytes of expiring time-series data. In this talk, I'll discuss compaction in time-series workloads and the TimeWindowCompactionStrategy (TWCS) we developed specifically for this purpose. I'll detail TWCS-specific configuration properties and some lesser-known compaction sub-properties that apply to all compaction strategies, and I'll also cover other general tricks and tuning that are useful for very large time-series workloads.

About the Speaker

Jeff Jirsa, Principal Engineer, Infrastructure, Crowdstrike

A 2015-2016 Cassandra MVP and member of Crowdstrike's Infrastructure Services Team, Jeff focuses primarily on managing the clusters that collect billions of events per day to enable Crowdstrike to defend its clients against hostile adversaries. Jeff is also the maintainer of TimeWindowCompactionStrategy, an open source compaction strategy used by a number of large, high-write clusters to handle time-series Cassandra workloads more efficiently.

July 26, 2016