Sarah Guido, Sean O'Connor - A Tour of Large-Scale Data Analysis Tools in Python
Speakers: Sarah Guido, Sean O'Connor

Large-scale data analysis is complicated. There’s a limit to how much data you can analyze on a single box, but it is relatively inexpensive to get access to a large number of commodity servers. In this tutorial, you’ll learn how to leverage the power of distributed computing tools to do large-scale data analysis quickly and affordably using pure Python, Hadoop MapReduce, and Apache Spark.

Slides can be found at: https://speakerdeck.com/pycon2016 and https://github.com/PyCon/2016-slides
May 28, 2016