Search at Scale: Using Machine Learning to Automate Content Metadata

YOW! Data 2019

For media organisations, reach is everything. Getting content in front of eyeballs and ears is their raison d'être.

Search plays a critical role in connecting audiences with t-1 content (yesterday's news, last week's podcast). However, with audience expectations conditioned by Google and others, it is challenging to deliver robust, scalable search that people actually want to use.

The relevance of your results is everything, and to produce relevant results you need good metadata for every object in your search index. With hundreds of thousands of content objects and an audience of millions, the ABC has unique challenges in this regard.

This talk will explore the ABC's use of Machine Learning (ML) to automatically generate meaningful metadata for pieces of content (audio, video and text). It covers AWS MLaaS for full transcripts of audio podcasts, and a platform developed in-house for NLP tasks such as entity recognition and automated document summarisation and for image-related tasks such as segmentation and tagging.

Gareth Seneque

Machine Learning Engineer

Australian Broadcasting Corporation


I am an engineer with a decade+ of experience in the media & finance industries.

My primary technical interests are machine learning/data science, as well as the infrastructure required for the distributed training/deployment/serving of models.

I'm currently working at the Australian Broadcasting Corporation, on our new search product. It is accessible at