Dr. Erin LeDell - Scalable Automatic Machine Learning in R
Scalable Automatic Machine Learning in R by Dr. Erin LeDell. Visit https://rstats.ai/nyr/ to learn more.

Abstract: The focus of this presentation is scalable and automatic machine learning in R using the H2O machine learning platform. H2O is an open source, distributed machine learning platform designed to scale to very large datasets that may not fit into RAM on a single machine. We will provide a brief overview of the field of Automatic Machine Learning, followed by a detailed look inside H2O's AutoML algorithm. H2O AutoML provides an easy-to-use interface that automates data pre-processing and the training and tuning of a large selection of candidate models (including multiple stacked ensemble models for superior model performance). Because the H2O platform is distributed, H2O AutoML can scale to very large datasets. The result of an AutoML run is a "leaderboard" of H2O models, which can be easily exported for use in production.

Bio: Dr. Erin LeDell is the Chief Machine Learning Scientist at H2O.ai, where she leads the development of an open source, automatic machine learning (AutoML) platform. Before joining H2O.ai, she worked as a data scientist and software engineer and founded DataScientific, Inc. She is also the founder of Women in Machine Learning & Data Science (wimlds.org) and a co-founder of R-Ladies Global (rladies.org). She received her Ph.D. from UC Berkeley, where her research focused on machine learning and statistical computing.

Twitter: https://twitter.com/ledell

Presented at the 2020 R Conference | New York (August 15th, 2020)