Building Rome Every Day - Scaling ML Model Building Infrastructure
YOW! Data 2019
"I want to reset my password". "I ordered the wrong size". "These are not the droids I was looking for". Every day, a support agent fields thousands of these queries. Multiply that by the thousands of agents a company might have, and the sheer volume of data being generated becomes hard to imagine. How can we make sense of it all? It seems a formidable task, but we have a formidable weapon in our arsenal: machine learning.
By combining deep learning, natural language processing and clustering techniques, we built a machine learning model that can take 100,000 tickets and efficiently cluster and summarise them into digestible topics. But that's only part of the challenge; we also had to scale it to build models for 30,000 customers, in production, every day.
In this talk I'll share the story of Content Cues, Zendesk's latest machine learning product. It's the story of how we leveraged the power of AWS Batch to scale a model-building platform. Of how we tackled challenges such as measuring how well an unsupervised model performs when it's not even clear what "well" means. Of how our team combined our pool of skills across data engineering, data science and product management to deliver a pipeline capable of building a thousand models for the price of a cup of coffee.