Taming the Beast: Automated Testing for Complex Data Pipelines

YOW! Data 2019

Massive datasets. Complex data pipelines. Machine learning. When faced with such a beast, how do you test it effectively? When your test results are less "pass" and "fail" and more "sort of" and "not really", how do you automate testing?

Trish Khoo draws upon her experience in testing complex data systems to demonstrate proven strategies for testing in this field. Her experience working on ultra-large-scale systems at Google in Mountain View, California, shaped her technical approach to testing, which she applies in her work as a consultant today.