Cluster-wide Scaling of Machine Learning with Ray
YOW! Data 2020
Popular ML techniques like reinforcement learning (RL) and hyperparameter optimization (HPO) require a variety of computational patterns for data processing, simulation (e.g., game engines), model search, training, serving, and other tasks. Few frameworks support all of these patterns efficiently, especially when scaling to clusters.
Ray is an open-source, distributed framework from U.C. Berkeley’s RISELab that easily scales applications from a laptop to a cluster. It was created to address the needs of reinforcement learning and hyperparameter tuning in particular, but it is broadly applicable to almost any distributed Python-based application, with support for other languages forthcoming.
I'll explain the problems Ray solves and how Ray works. Then I'll discuss RLlib and Tune, the RL and HPO systems implemented with Ray. You'll learn when to use Ray versus alternatives, and how to adopt it for your projects.
Head of Developer Relations
Dean Wampler is an expert in streaming data systems, focusing on applications of ML/AI. He is head of evangelism at Anyscale.io, which is focused on distributed Python for ML/AI. Previously, he was an engineering VP at Lightbend, where he led the development of Lightbend CloudFlow, an integrated system for building and running streaming data applications with popular open-source tools. Dean is the author or co-author of three O’Reilly books on Scala, functional programming, and Hive. He contributes to several open-source projects, and he co-organizes and speaks at many technology conferences and Chicago-based user groups. Dean has a Ph.D. in Physics from the University of Washington.