1 Billion Black Friday Shoppers on Distributed Data (Fahd Siddiqui, Bazaarvoice)
Slides: https://www.slideshare.net/DataStax/one-billion-black-friday-shoppers-on-a-distributed-data-store-fahd-siddiqui-bazaarvoice-cassandra-summit-2016

EmoDB is an open source RESTful data store built on top of Cassandra. It stores JSON documents and, most notably, offers a databus that lets subscribers watch for changes to those documents in real time. It features massive non-blocking global writes, asynchronous cross-data-center communication, and schemaless JSON content.

For non-blocking global writes, we created a "JSON delta" specification that defines incremental updates to any JSON document. Each row in Cassandra is thus a sequence of deltas that serves as a Conflict-free Replicated Data Type (CRDT) for EmoDB's system of record. We introduce the concept of "distributed compactions", which frequently compact these deltas so that reads stay efficient. Finally, the databus forms a crucial piece of our data infrastructure and offers a change queue to real-time streaming applications.

About the Speaker
Fahd Siddiqui, Lead Software Engineer, Bazaarvoice

Fahd Siddiqui is a Lead Software Engineer on the data infrastructure team at Bazaarvoice. His interests include highly scalable, distributed data systems. He holds a Master's degree in Computer Engineering from the University of Texas at Austin and frequently speaks at the Austin C* User Group.

About Bazaarvoice
Bazaarvoice is a network that connects brands and retailers to the authentic voices of people where they shop. More at www.bazaarvoice.com
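
To make the delta and compaction ideas from the abstract concrete, here is a minimal Python sketch. It is not EmoDB's implementation or its real delta syntax: the `("literal", ...)` and `("merge", ...)` encoding, the integer change ids (standing in for time-based ids), and the `cutoff` parameter are all assumptions made for illustration. The point it shows is that a row stored as a set of ordered deltas resolves to the same document no matter in which order the deltas arrived, and that folding older deltas into one does not change the result.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical delta encoding (NOT EmoDB's real delta syntax):
#   ("literal", value)            replaces the whole document
#   ("merge", {key: value})       updates top-level keys; DELETE removes a key
DELETE = object()
Delta = Tuple[str, Any]
Row = Dict[int, Delta]  # change id (stand-in for a time-based id) -> delta


def apply_delta(doc: Optional[dict], delta: Delta) -> Optional[dict]:
    """Apply one delta to a document and return the new document."""
    kind, payload = delta
    if kind == "literal":
        return None if payload is None else dict(payload)
    if kind == "merge":
        result = dict(doc or {})
        for key, value in payload.items():
            if value is DELETE:
                result.pop(key, None)
            else:
                result[key] = value
        return result
    raise ValueError(f"unknown delta kind: {kind!r}")


def resolve(row: Row) -> Optional[dict]:
    """Fold all deltas in change-id order to get the current document."""
    doc: Optional[dict] = None
    for change_id in sorted(row):
        doc = apply_delta(doc, row[change_id])
    return doc


def compact(row: Row, cutoff: int) -> Row:
    """Replace every delta with id <= cutoff by a single literal delta."""
    old = {cid: d for cid, d in row.items() if cid <= cutoff}
    if not old:
        return dict(row)
    compacted: Row = {max(old): ("literal", resolve(old))}
    compacted.update({cid: d for cid, d in row.items() if cid > cutoff})
    return compacted


# Three writes, possibly issued in different data centers.
writes = [
    (1, ("literal", {"title": "Review", "rating": 4})),
    (2, ("merge", {"rating": 5})),
    (3, ("merge", {"title": DELETE, "status": "approved"})),
]

in_order: Row = dict(writes)
out_of_order: Row = dict(reversed(writes))  # same deltas, different arrival order
```

Because writers only ever add a delta under a fresh change id, no write has to read or lock the row first, and replicas that receive the same deltas in different orders still agree. In this sketch, `compact` is only safe if no delta with an id at or below `cutoff` can still arrive; how a real system picks that point is outside the scope of this example.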