Data Propagation Between Heterogeneous C* Clusters (Charlie Peng, Cisco Systems)
We faced the following challenge when using C*'s built-in replication mechanism to replicate data between our company (HQ) and contract manufacturing (CM) sites all over the world: our use case requires a sophisticated table-level replication mechanism.

1. Both CM1 and CM2 data need to be replicated to HQ.
2. CM1 data can't be replicated to CM2 due to intellectual property concerns.
3. Both CM1 and CM2 share the same table schemas within the same keyspace as HQ for consolidated real-time analysis in HQ.
4. Selected data in HQ need to be replicated to both CM1 and CM2.

We developed our own data-propagation application to solve the challenge. The application leverages C*'s own distributed data model for real-time, load-balanced, high-throughput, and highly available propagation. The session will cover the use case, the challenges, and the data-propagation application in detail. We'll share our throughput metrics and the lessons we have learned along the way.

About the Speaker

Charlie Peng, Technical Leader, Cisco Systems

Designed and deployed multiple Cassandra clusters for Cisco Supply Chain Operations to replace existing manufacturing SQL databases and to replicate data between Cisco and manufacturing partners all over the world. Leads a team for Cassandra cluster management and automation. Cisco is ranked #7 in the Gartner 2016 Supply Chain Top 25.