Data is a Soft Science
YOW! Data 2018
The perception of data science, and often the way it is taught, is like this: You have some nice, tidy data, you use the latest, coolest algorithm, and you get some super clever results. You know it’s good ’cause your r-squared value is through the roof, and you could play checkers on your confusion matrix.
But the reality is different. That nice, tidy dataset has to be wrangled out of a big, nasty production system that was built by a coffee-fueled maniac. Those cool results have to somehow be translated into a user interface in which they’re the 12th most important thing on the page, and you have to fight for every pixel. And in front of that production system, entering that data, clicking on that user interface, is a data scientist’s worst nightmare: People.
As much as we might want to believe that data science is a pure “hard” science, about writing Greek letters on chalkboards and stroking our chins, the truth is that what we do is more usefully thought of as a social science. Data science is a lens for understanding human behaviour. It is a tool for communicating with people. Data is a soft science.
This talk is about how my background in Social Anthropology gave me a unique approach to doing data science. I’ll show how taking this view of data science led to some cool discoveries in some interesting projects. And I’ll talk about how, building accounting software at Xero, we’ve started on the journey towards building a “smarter” application. As we’ve done this, the hardest problems have not been about technical implementation; they’ve been about understanding the interface between these technologies and our users. Our data science problems at Xero, it turns out, are mostly about how to understand humans.