Conference Program

All times displayed are in the Australia/Sydney timezone

8:00 AM

8:00 AM - 45 mins

Registration for YOW! Data 2018

8:45 AM

8:45 AM - 15 mins

Session Overviews & Introductions

9:00 AM

9:00 AM - 50 mins

Wesley Theatre

Stream All the Things!!

Dean Wampler

Streaming data architectures aren't just "faster" Big Data architectures. They must be reliable and scalable as never before, more like microservice architectures.

This talk has three goals:

  1. Justify the transition from batch-oriented big data to stream-oriented fast data.
  2. Explain the requirements that streaming architectures must meet and the tools and techniques used to meet them.
  3. Discuss the ways that fast data and microservice architectures are converging.

Big data started with an emphasis on batch-oriented architectures, where data is captured in large, scalable stores, then processed using batch jobs. To reduce the gap between data arrival and information extraction, these architectures are now evolving to be stream oriented, where data is processed as it arrives. Fast data is the new buzz word.

These architectures introduce new challenges for developers. Whereas a batch job might run for hours, a stream processing system typically runs for weeks or months, which raises the bar for making these systems reliable and scalable to handle any contingency.

The microservice world has faced this challenge for a while. Microservices are inherently message driven, responding to requests for service and sending messages to other microservices, in turn. Hence, they are also stream oriented, in the sense that they must respond reliably to never-ending input. So, they offer guidance for how to build reliable streaming data systems. I'll discuss how these architectures are merging in other ways, too.

We'll also discuss how to pick streaming technologies based on four axes of concern:

  • Low latency: What's my time budget for handling this data?
  • High volume: How much data per unit time must I handle?
  • Data processing: Do I need machine learning, SQL queries, conventional ETL processing, etc.?
  • Integration with other tools: Which ones and how is data exchanged between them?

We'll consider specific examples of streaming tools and how they fit on these axes, including Spark, Flink, Akka Streams, and Kafka.
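As a small illustration of what "data is processed as it arrives" means in practice, here is a minimal consumer loop written against Kafka (one of the tools named above) using the kafka-python client. The topic name and event format are invented for the example and are not from the talk.

```python
# Minimal sketch of stream-oriented processing: consume records as they arrive
# and maintain a running aggregate. Topic name and JSON event shape are hypothetical.
import json
from kafka import KafkaConsumer  # kafka-python client

consumer = KafkaConsumer(
    "vehicle-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

counts = {}
for message in consumer:  # blocks indefinitely, handling one record at a time
    event = message.value
    counts[event["type"]] = counts.get(event["type"], 0) + 1
    print(message.offset, counts)
```

Unlike a batch job, this loop never finishes: it must keep running, and keep its state correct, for weeks or months, which is exactly the reliability bar described above.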

9:50 AM

9:50 AM - 30 mins

Wesley Theatre

Data is a Soft Science

Simon Carryer

The perception of data science, and often the way it is taught, is like this: You have some nice, tidy data, you use the latest, coolest algorithm, and you get some super clever results. You know it’s good ‘cause your r-squared value is through the roof, and you could play checkers on your confusion matrix.

But the reality is different. That nice, tidy dataset has to be wrangled out of a big, nasty production system that was built by a coffee-fueled maniac. Those cool results have to somehow be translated into a user interface in which it’s the 12th most important thing on the page, and you have to fight for every pixel. And in front of that production system, entering that data, clicking on that user interface, is a data scientist’s worst nightmare: People.

As much as we might want to believe that data science is a pure “hard” science, about writing greek letters on chalkboards and stroking our chins, the truth is that what we do is more usefully thought of as a social science. Data science is a lens for understanding human behaviour. It is a tool for communicating with people. Data is a soft science.

This talk is about how my background in Social Anthropology gave me a unique approach to doing data science. I’ll show how taking this view of data science led to some cool discoveries in some interesting projects. And I’ll talk about how, building accounting software at Xero, we’ve started on the journey towards building a “smarter” application. As we've done this, the hardest problems have not been about technical implementation, they’ve been about understanding the interface between these technologies and our users. Our data science problems at Xero, it turns out, are mostly about how to understand humans.

10:20 AM

10:20 AM - 25 mins

Morning Break

10:45 AM

10:45 AM - 30 mins

Wesley Theatre

Machine Learning Systems for Engineers

Cameron Joannidis

Machine Learning is often discussed in the context of data science, but little attention is given to the complexities of engineering production-ready ML systems. This talk will explore some of the important challenges and provide advice on solutions to these problems.

11:15 AM

11:15 AM - 30 mins

Wesley Theatre

What happens when Galactic Evolution and Data Science collide?

Elaina Hyde

This talk will take a short trip around our Milky Way Galaxy and discuss how data science can be used to detect faint, sparse objects such as the dwarf satellites and streams that helped form the galaxy we live in. The data science applications and algorithms we use determine the accuracy with which we can detect these mysterious bodies, and with the advent of greater cloud computing capability, the sky is no longer the limit when it comes to programming or astronomy.

11:45 AM

11:45 AM - 30 mins

Wesley Theatre

The 80/20 of UX research: What qualitative data to use and when

Tash Keuneman

We'll talk about the one qualitative process that you should use 80% of the time. Learn how to translate your customer research into meaningful numbers that you can make product decisions on.

How many customers do you have to talk to to find 85% of the usability problems? How do you avoid the most common pitfalls that result in bad data? We'll go through that, as well as the best practices you can use to confidently calculate the extent of the problems you've found.
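The "85% of the usability problems" figure usually comes from the Nielsen-Landauer problem-discovery model; a quick back-of-the-envelope sketch, assuming the commonly cited average discovery rate of roughly 31% per participant (the talk may use different numbers):

```python
# Nielsen-Landauer model: share of usability problems found after n participants,
# found(n) = 1 - (1 - p)**n. The discovery rate p = 0.31 is the commonly cited
# average, assumed here for illustration.

def problems_found(n, p=0.31):
    return 1 - (1 - p) ** n

for n in range(1, 8):
    print(n, f"{problems_found(n):.0%}")
# With p = 0.31, five participants uncover roughly 85% of the problems.
```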

You'll walk out of the conference room with UX skills that you can put into practice right away and a gift bag to boot.

12:15 PM

12:15 PM - 50 mins

Lunch Break

1:05 PM

1:05 PM - 30 mins

Wesley Theatre

How to Save a Life: Could Real-Time Sensor Data Have Saved Mrs Elle?

Dana Bradford

This is the story of Mrs Elle*, a participant in a smart home pilot study. The pilot study aimed to test the efficacy of sensors to capture in-home activity data, including meal preparation, attention to hygiene and movement around the house. The in-home monitoring and response service associated with the sensors had not been implemented, and as such, data was not analysed in real time. Sadly, Mrs Elle suffered a massive stroke one night, and was found some time after. She later died in hospital without regaining consciousness. This talk looks at the data leading up to Mrs Elle’s stroke, to see if there were any clues that a neurological insult was imminent. We were most interested to know: had we been monitoring in real time, could the sensors have told us how to save a life?

*pseudonym

1:35 PM

1:35 PM - 30 mins

Wesley Theatre

Machine learning applications for the autonomous/connected vehicle: perspectives, applications and methods

Boris Savkovic

The application of streaming and real-time data science/analytics to connected and autonomous vehicles is gaining traction around the world. Intelematics is an Australian leader and innovator in the field of telematics/connected vehicles as well as in big data traffic analytics, with Intelematics services used by Australian and overseas giants such as Ford, Toyota and Google.

The topic of the talk is the application of streaming and big data analytics to autonomous and connected vehicles. Applications covered will include the ability to predict vehicle failures, forecast traffic conditions, automate vehicle insurance claims through crash/incident detection, and deliver data into the vehicle (such as traffic signal states and forecasts).

The talk will outline general trends and give examples of concrete solutions that we have developed at Intelematics in this emerging field, both in Australia and overseas (US and EU). The focus will be on the application of data science and algorithms, and the key role that these have to play in the emerging field of connected and autonomous vehicles. These data streams bring a number of technical and non-technical challenges that will be discussed: the complexity of dealing with geo-spatial and temporal data, safety and security, privacy, the streaming and event-driven nature of the data, the volume of data, and the complexity of the relationships/patterns to be modelled.

The underlying algorithms and technologies will also be discussed in some detail.

Link to company website:

http://www.intelematics.com/

Speaker bio:

https://au.linkedin.com/in/borislav-savkovic-23969154

2:05 PM

2:05 PM - 30 mins

Wesley Theatre

Using Sentiment Analysis To Fill In The Gaps From User Surveys

Gareth Jones

We put a year's worth of online help chat logs from a major Australian superannuation website through Google's Natural Language API, to see what insights we could gain from our users. This talk will discuss how the Natural Language API works and the machine learning concepts underlying it, and will give you some ideas on how to make use of the information, based on examples from our work. We'll compare the sentiment values with those expressed in exit surveys and find out how useful an indicator sentiment can be.
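For readers who haven't used it, this is roughly what a sentiment call looks like with the google-cloud-language Python client. A sketch only: it assumes application credentials are already configured, and the exact call signature varies a little between client-library versions.

```python
# Score one chat transcript with Google's Natural Language API (sketch).
from google.cloud import language_v1

def chat_sentiment(text):
    client = language_v1.LanguageServiceClient()
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    result = client.analyze_sentiment(request={"document": document})
    sentiment = result.document_sentiment
    # score runs from -1.0 (negative) to +1.0 (positive);
    # magnitude reflects the overall strength of emotion in the text
    return sentiment.score, sentiment.magnitude

print(chat_sentiment("I still can't log in and nobody has replied to my ticket."))
```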

2:35 PM

2:35 PM - 20 mins

Afternoon Break

2:55 PM

2:55 PM - 30 mins

Wesley Theatre

Deep Learning, Production and You

Wai Chee Yau and Jeffrey Theobald

Building a successful machine learning model is challenging enough, and just as much effort is needed to turn that model into a customer-facing product. Drawing on their experience working on Zendesk’s article recommendation product, Wai Chee Yau and Jeffrey Theobald discuss design challenges and real-world problems you may encounter when building a machine learning product at scale.

Wai Chee and Jeffrey cover the evolution of the machine learning system, from individual models per customer (using Hadoop to aggregate the training data) to a universal deep learning model for all customers using TensorFlow, and outline some challenges they faced while building the infrastructure to serve TensorFlow models. They also explore the complexities of seamlessly upgrading to a new version of the model and detail the architecture that handles the constantly changing collection of articles that feed into the recommendation engine.
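The abstract doesn't spell out how the models are served; as one common pattern for this kind of infrastructure, here is a minimal client call against TensorFlow Serving's REST predict endpoint. The host, model name and input layout below are hypothetical.

```python
# Illustration only: query a TensorFlow Serving REST endpoint for recommendations.
# The URL scheme (/v1/models/<name>:predict) is TensorFlow Serving's standard
# REST API; the model name, host, and feature layout are made up.
import requests

def recommend(embedding):
    payload = {"instances": [{"embedding": embedding}]}
    resp = requests.post(
        "http://tf-serving.internal:8501/v1/models/article_recommender:predict",
        json=payload,
        timeout=2.0,
    )
    resp.raise_for_status()
    return resp.json()["predictions"][0]

# Example: a (fake) 4-dimensional embedding for one support ticket
print(recommend([0.12, -0.40, 0.88, 0.05]))
```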

Topics include:

  • Infrastructure for continuously changing textual data
  • Deploying and serving TensorFlow models in production
  • Real-world production problems when dealing with a machine learning model
  • Data, customer feedback, and user experience

3:25 PM

3:25 PM - 30 mins

Wesley Theatre

The catastrophic consequences of not being awesome at plots

Hercules Konstantopoulos

“Hey boss, we made a totally rad deep learning algorithm! It trawls through the internet and literally tells the future!”

“Yeah, cool. But do you have a plot I can show the board?”

Data visualisation is the capstone of data science. As businesses collate massive and disparate data streams, and as algorithms become more complex, communicating results has become more important and more challenging than ever before. We need to start placing as much importance on accessible visualisation as we do on database architecture or algorithm design.

In this talk I will present some core visualisation principles that I have developed over 15 years of experience visualising (literally) astronomical datasets, carbon emissions, behavioural analytics, even the odd basketball game. These can be employed by any data scientist to help their data tell the right story to the right people.

3:55 PM

3:55 PM - 30 mins

Wesley Theatre

Graph Neural Networks: Algorithm and Applications

Shujia Zhang

Artificial neural networks help us cluster and classify. Since "deep learning" became the buzzword, it has driven many advances in AI, such as self-driving cars, image classification and AlphaGo. There are many different deep learning architectures; the most popular are based on the well-known convolutional neural network, a type of feed-forward neural network. This talk will introduce another variant of deep neural network, the Graph Neural Network (GNN), which can model data represented as generic graphs (a graph can have labelled nodes connected via weighted edges). The talk will cover:

  • the graph (graph of graphs - GoGs) representation: how we represent different data with graphs
  • architecture of graph neural networks (GNN): the architecture of deep graph neural networks and learning algorithm
  • applications of GoGs and GNNs: document classification, web spam detection, human action recognition in video
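As a rough, self-contained illustration of the propagation step at the heart of such models, here is one common flavour (a graph convolution layer) in plain numpy. This is a generic sketch, not necessarily the specific GNN variant covered in the talk.

```python
# One graph-convolution propagation step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)
import numpy as np

def gcn_layer(A, H, W):
    """A: adjacency (n x n), H: node features (n x d), W: weights (d x d')."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalisation
    return np.maximum(A_norm @ H @ W, 0)       # aggregate neighbours, then ReLU

# Tiny labelled graph: 4 nodes, 2 input features, 3 output features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = np.random.rand(4, 2)
W = np.random.rand(2, 3)
print(gcn_layer(A, H, W).shape)  # (4, 3): new embeddings for each node
```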

4:25 PM

4:25 PM - 20 mins

Afternoon Break

4:45 PM

4:45 PM - 30 mins

Wesley Theatre

Immersive Analytics of Honey Bee Data

Huyen Nguyen

Bees are dying – in recent years an unprecedented decline in honey bee colonies has been seen around the globe. The causes are still largely unknown. At CSIRO, the Global Initiative for Honey bee Health (GIHH) is an international collaboration of researchers, beekeepers, farmers, and industry set up to research the threats to bee health in order to better understand colony collapse and find solutions that will allow for sustainable crop pollination and food security. Integral to the research effort are RFID tags that are manually fitted to bees. The abundance of data being collected by the thousands of bee-attached sensors, as well as additional environmental sensors, poses several challenges for the interpretation and comprehension of the data, both computationally and from a user perspective. In this talk, I will discuss visual analytics techniques that we have been investigating to facilitate an effective path from data to insight. I will particularly focus on interactive and immersive user interfaces that allow a range of end users to effectively explore the complex sensor data.

5:15 PM

5:15 PM - 30 mins

Wesley Theatre

Visual Analytics on Steroids: High Performance Visualisation, Simulations and AI

Tomasz Bednarz

In the time it takes someone to read this abstract, another person could solve a detective puzzle, if only they had enough quantitative evidence with which to prove their suspicions. One could also use visualisation and computational tools like a microscope, to seek a new cure for cancer or to predict how hospitalisation could be prevented. In this presentation, we will demonstrate new visual analytics techniques that use various mixed reality approaches to link simulations with collaborative, complex and interactive data exploration, placing the human in the loop. Thanks to recent advances in graphics hardware and compute power (especially GPGPU and modern Big Data / HPC infrastructures), the opportunities are immense, especially in improving our understanding of complex models that represent real or hybrid worlds. Use cases presented will be drawn from ongoing research at CSIRO and the Expanded Perception and Interaction Centre (EPICentre), using world-class GPU clusters and visualisation capabilities.

5:45 PM

5:45 PM - 45 mins

Wesley Theatre

The Zen of Data Science

Dr Eugene Dubossarsky

What makes data science such a different field? Why is it such a challenge to structure, manage and capture the value of data science?

This presentation will focus on the key issues around the practice of discovery (“science”) and how it differs from the practice of building new things (“engineering”).

Other questions addressed will include:

  • How do organisations manage and leverage the value of data science?
  • What are the key “unknown unknowns” that managers miss so often, with disastrous results?
  • What are the cultural, procedural and managerial differences between “scientists” and engineers in a modern, data-driven workplace?

6:30 PM

6:30 PM - 60 mins

Conference Drinks & Networking

8:45 AM

8:45 AM - 15 mins

Session Overviews & Introductions

9:00 AM

9:00 AM - 50 mins

Wesley Theatre

Building a culture of experimentation at Pinterest

Andrea Burbank

A successful experimentation program consists of much more than mere randomization and measurement. How do you help stakeholders understand the right things to measure, avoid common pitfalls, and learn to rely on A/B tests as the best way to measure a new system or feature? Building a culture of experimentation and the right tools to support it is just as important as the statistics behind the comparisons themselves - and potentially much trickier to get right.

9:50 AM

9:50 AM - 30 mins

Wesley Theatre

Respecting privacy with synthetically generated "look-alike" data sets

Tim Garnsey

Safely handling data that contains sensitive or private information about people is a multi-million dollar problem at many companies. It adds time to the data engineering process, can cost a lot in software licences for specialised tools, and brings a range of reputational and legal risks.

Recent advances in deep learning have prompted an interesting way to attack this problem. By fitting a certain class of model to a source data set that contains sensitive information, we can produce a generator that outputs a supply of synthetic "look-alike" data. This output data preserves many of the statistical relationships between fields that the source has, and offers mathematical guarantees around the identifiability of individuals in the source data set.

This talk will provide an overview of the approach and show how it can speed up data engineering and reduce risk.

10:20 AM

10:20 AM - 25 mins

Morning Break

10:45 AM

10:45 AM - 30 mins

Wesley Theatre

On the quest for advanced analytics: governance and the Internet of Things

Fiona Tweedie

Data scientists dream of crystal clear data lakes and perfectly ordered warehouses with comprehensive dictionaries, consistent formats and never a null value or encoding error to mar their analysis. The reality, however, is that the bulk of time on most data projects is spent sourcing and munging data before the exploration and analysis can begin. Governance is often presented as the solution to all data woes but all too often generates more meetings than results.

The University of Melbourne is home to 8000 staff and 48000 students across seven campuses. Both researchers and professional staff recognise that data is going to be key to understanding this complex community and supporting its members. Sensor data collected from around the campuses promises the opportunity to analyse everything from demands on public transport to the impact of weather on coffee consumption. With researchers spread across ten faculties, there is a danger that multiple projects will collect fragmented data and the real power that comes from joining multiple datasets will never be realised. Conversely, overly prescriptive policies will date quickly and hamper innovation. Is it possible to satisfy both the desire to move rapidly to take advantage of new opportunities and the need to maintain data quality?

This case study will present some of the IoT projects currently being explored at the University and examine the governance efforts that are being trialled to ensure the adoption of standards and future interoperability of devices and data.

11:15 AM

11:15 AM - 30 mins

Wesley Theatre

A Retrospective on Building and Running Atlassian’s Data Lake

Rohan Dhupelia

Atlassian strives to be a data-driven company that builds collaborative software for teams. Two and a half years ago we launched our analytics platform (i.e. data lake) on AWS, which is used by over 1500 internal users each month to gain insights and inform decisions.

In this talk we will present a retrospective on our analytics platform, covering our blessings (what went well) and our mistakes (what could have been better), as well as talking about the potential next steps we might take to further improve the platform.

This talk will cover both the technical aspects (e.g. architectural choices) and the non-technical aspects (e.g. team organisation, our principles and mandate).

11:45 AM

11:45 AM - 30 mins

Wesley Theatre

Cyclops 2.0, Image Recognition with Super Human Ability

Agustinus Nalwan

You must have heard a few times that AI has beaten humans at image recognition. Is that true? Have you seen it yourself? I am going to demonstrate Cyclops, an image recognition system we built that recognises car models far better than any human.

From there, this talk will take you through our journey: how it all began, why we built the early version of Cyclops and what the outcome was, and how we used this technology to dramatically improve the consumer experience and build many consumer-facing products we previously thought were not possible.

I will then dive into the technical details, starting with how we built Cyclops 1.0 with TensorFlow and how we overcame the training complexity with transfer learning. However, transfer learning comes with a directional-invariance limitation; I will show what it is and how we overcame it with our novel solution.

Next, I will show that building a car recognition system as complex as Cyclops 2.0 requires a superior model and modifications to our existing transfer learning technique. I will also cover the low-coverage problems we faced when going deeper and how we solved them. I will then look at how distributed training can speed up the training process enough to make this practical.

12:15 PM

12:15 PM - 50 mins

Lunch Break

1:05 PM

1:05 PM - 45 mins

Wesley Theatre

Kids can change the world with data

Linda McIver

From researchers perpetrating unspeakable acts with their datasets, to students trying to communicate experimental results with pie charts that don’t sum to 100, we have all seen data horror stories. But in my classes kids have worked with their communities to create real change using data and computation. We can band together to give all kids the opportunity to change the world. This is the story of kids. Of data. Of Computation. And of change. It’s the story of the Australian Data Science Education Institute.

1:50 PM

1:50 PM - 30 mins

Wesley Theatre

Trump Tweets (Fun with Deep Learning)

Sachin Abeywardana

In this talk I will introduce LSTMs, the way deep learning deals with time series data. The dataset we will focus on is Trump's tweets. Using this data we will build a tweet generator, trained to simulate Trump-style tweets one character at a time.

We will be using Keras to build the deep learning model. Google Colab will be used so that all attendees will be able to use a GPU.
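A minimal sketch of the kind of model involved: a character-level LSTM in Keras that learns to predict the next character of a sequence. The toy corpus, sequence length and hyperparameters below are placeholders, not the session's actual material.

```python
# Character-level next-character prediction with an LSTM (sketch, tf.keras).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

corpus = "Make America Great Again! "          # stand-in for the real tweet corpus
chars = sorted(set(corpus))
char_to_ix = {c: i for i, c in enumerate(chars)}

seq_len = 10
X, y = [], []
for i in range(len(corpus) - seq_len):
    X.append([char_to_ix[c] for c in corpus[i:i + seq_len]])
    y.append(char_to_ix[corpus[i + seq_len]])

X = np.eye(len(chars))[np.array(X)]            # one-hot: (samples, seq_len, vocab)
y = np.array(y)                                # integer class of the next character

model = Sequential([
    LSTM(128, input_shape=(seq_len, len(chars))),
    Dense(len(chars), activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=5, verbose=0)
```

Generation then works by repeatedly sampling a character from the softmax output and feeding it back in as the next input.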

2:20 PM

2:20 PM - 20 mins

Afternoon Break

2:40 PM

2:40 PM - 30 mins

Wesley Theatre

Protecting Australia’s red meat industry with interactive data visualisation

Grant Carlson and James Binh Tham

Anyone who has seen a computer simulation of the spread of a pandemic would appreciate the power of visualisation to manage the situation.

In Australia there exists the National Livestock Identification System (NLIS), which is used to trace the movements of cattle in the event of a serious disease outbreak.

Using Neo4j, we will demonstrate a tracing exercise across a dataset of over 400 million movement events, showing the power of interactive data visualisation.
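To give a flavour of what such a trace looks like, here is a hypothetical query run from Python against Neo4j, following movement relationships a few hops out from a property of interest. The node label, relationship type and properties are invented for illustration; they are not the NLIS data model.

```python
# Hypothetical forward trace: which properties could have been exposed,
# following up to five MOVED_TO hops from an origin property.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

TRACE_FORWARD = """
MATCH (origin:Property {pic: $pic})
MATCH path = (origin)-[:MOVED_TO*1..5]->(exposed:Property)
RETURN exposed.pic AS property, length(path) AS hops
ORDER BY hops
"""

with driver.session() as session:
    for record in session.run(TRACE_FORWARD, pic="NA123456"):
        print(record["property"], record["hops"])
```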

3:10 PM

3:10 PM - 30 mins

Wesley Theatre

DevOps 2.0: Evidence-based evolution of serverless architecture through automatic evaluation of “infrastructure as code” deployments

Aidan O'Brien

The scientific approach teaches us to formulate hypotheses and test them experimentally in order to advance systematically. DevOps, and software architecture in particular, does not traditionally follow this approach. Here, decisions like “scaling up to more machines or simply employing a batch queue” or “using Apache Spark or sticking to a job scheduler across multiple machines” are worked out theoretically rather than implemented and tested objectively. Furthermore, the paucity of knowledge about less-established systems like serverless cloud architecture hampers the theoretical approach.

We therefore partnered with James Lewis and Kief Morris to establish a fundamentally different approach to serverless architecture design that is based on scientific principles. For this, the serverless architecture stack first needs to be fully defined through code/text, e.g. AWS CloudFormation, so that it can be deployed easily and consistently. This “architecture as text” base can then be modified and re-deployed to systematically test hypotheses, e.g. whether an algorithm is faster or a particular autoscaling group more efficient. The second key element of this novel way of evolving architecture is the automatic evaluation of any newly deployed architecture without manually recording runtime or defining interactions between services, e.g. using Epsagon’s monitoring solution.

Here we describe the two key aspects in detail and showcase the benefits by describing how we improved runtime by 80% for the bioinformatics software framework GT-Scan, which is used by Australia’s premier research organisation to conduct medical research.

3:40 PM

3:40 PM - 30 mins

Wesley Theatre

Privacy Preserved Data Augmentation using Enterprise Data Fabric

Atif Rahman

Enterprises hold data that has potential value outside their own firewalls. We have been trying to figure out how to share such data in detail with others in a secure, safe, legal and risk-mitigated manner that ensures a high level of privacy while adding tangible economic and social value. Enterprises are facing numerous roadblocks, failed projects, inadequate business cases, and issues of scale that need newer techniques, technologies and approaches.

In this talk, we will set up the groundwork for scalable data augmentation for organisations and visualise technical architectures and solutions around the emerging technologies of data fabrics, edge computing and a second coming of data virtualisation.

A self-assessment toolkit will be shared for people interested in applying it to their organisations.

4:10 PM

4:10 PM - 20 mins

Afternoon Break

4:30 PM

4:30 PM - 30 mins

Wesley Theatre

Using Social Media Data to Interactively Explore the Personality of a Neighbourhood

Gala Camacho

Why do people choose to live in one neighbourhood over another? Every day government makes decisions that can change the way a neighbourhood operates and feels. Understanding the impact that these decisions have is convoluted and hard to measure.

In April, the Gold Coast held the 2018 Commonwealth Games. These events, usually advertised as urban renewal or regeneration projects, have a lasting impact on the neighbourhoods where they take place. Usually, there is a strong push to predict the impact of the games through economic assessments and surveys; however, once the games are happening, is that impact tracked? Sometimes, further economic assessments are produced years after the event to evaluate whether the impact was as expected. Can we do better?

Social data can provide us with more timely evidence of whether the event is activating the economy as expected, or it may highlight issues that can be resolved.

We will explore Facebook places data during and after the games, ultimately generating a dashboard that helps us visualise change by giving us the ability to filter and aggregate the data. We will manage the data using Python’s Pandas, generate quick visualisations using Plotly, and finish off by spinning up a dashboard using Plotly’s Dash.
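In stripped-down form, that Pandas to Plotly to Dash pipeline looks something like the sketch below, with made-up check-in counts standing in for the Facebook places data (assumes Dash 2.x).

```python
# Sketch: filter/aggregate with pandas, plot with Plotly, serve with Dash.
# The check-in data is fabricated for illustration.
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html, Input, Output

checkins = pd.DataFrame({
    "suburb": ["Southport", "Southport", "Surfers Paradise", "Broadbeach"],
    "period": ["during", "after", "during", "during"],
    "count":  [1200, 800, 3400, 900],
})

app = Dash(__name__)
app.layout = html.Div([
    dcc.Dropdown(id="period",
                 options=[{"label": p, "value": p} for p in ["during", "after"]],
                 value="during"),
    dcc.Graph(id="by-suburb"),
])

@app.callback(Output("by-suburb", "figure"), Input("period", "value"))
def update(period):
    # filter and aggregate with pandas, then hand the result to Plotly
    df = checkins[checkins["period"] == period]
    agg = df.groupby("suburb", as_index=False)["count"].sum()
    return px.bar(agg, x="suburb", y="count")

if __name__ == "__main__":
    app.run_server(debug=True)   # newer Dash releases use app.run(...)
```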

5:00 PM

5:00 PM - 30 mins

Wesley Theatre

Know Your Neighbours: Machine Learning on Graphs

Andrew Docherty

Machine learning has become ubiquitous in many applications. There are many accessible tools available to apply standard machine learning models to make predictions on data.

Typically, machine learning problems aim to predict something about an entity using some data about each entity. However, in the real world, entities - people, places or things - are connected to each other and can have complex interactions with their neighbours. We can use this information to improve our predictions, and to gain more insight into the network structure and how entities can affect each other.

This talk will introduce machine learning on graphs, give some examples where graph approaches give large improvements over standard machine learning techniques, and will demonstrate some tools that make graph machine learning more approachable.
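As a toy example of the core idea (letting a standard model see information about a node's neighbours, not just the node itself), here is a small sketch using networkx and scikit-learn. The graph and features are synthetic, and dedicated graph machine learning tools go much further than this.

```python
# Augment each node's features with the mean of its neighbours' features,
# then fit an ordinary classifier on the combined features (toy illustration).
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

G = nx.karate_club_graph()                        # small built-in social graph
X = np.array([[G.degree(n)] for n in G.nodes])    # per-node feature: degree
y = np.array([G.nodes[n]["club"] == "Officer" for n in G.nodes], dtype=int)

# neighbour aggregation: mean feature value over each node's neighbours
neigh_mean = np.array([X[list(G.neighbors(n))].mean(axis=0) for n in G.nodes])
X_graph = np.hstack([X, neigh_mean])

plain = LogisticRegression().fit(X, y).score(X, y)
graph = LogisticRegression().fit(X_graph, y).score(X_graph, y)
print(f"node features only: {plain:.2f}, with neighbour features: {graph:.2f}")
```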

5:30 PM

5:30 PM - 30 mins

Wesley Theatre

Is the 370 the worst bus in Sydney?

Katie Bell

In Switzerland, people will be surprised at a bus that's 2 minutes late. In Sydney, people will only consider it noteworthy if a bus is more than 20 minutes late, and this varies greatly between routes and providers. So, how do Sydney bus routes stack up? And if we're talking about privatisation, how do the private bus providers stack up against the state buses?

To answer these questions we need data… lots of data. Hooray for open government data! Transport for NSW publishes real-time information on the location and lateness of all public transport. Unfortunately it's ephemeral – there is no public log of historical lateness for us to analyse. To gather the data I needed, I had to fetch, log and aggregate ephemeral real-time data that was never intended to be used this way. There are random gaps and spontaneous route or timetable changes for special events, roadworks or holidays. Even with noisy data, patterns start to emerge across months and we can start to answer some questions. The 370 bus route is one of the most complained-about routes in Sydney; it even has its own Facebook group of ironic fans... but is it really the worst bus? Let's look at the data.
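In sketch form, the "log it yourself" approach looks something like this: poll the feed on a schedule, append every observation to disk, and aggregate later with pandas. The endpoint and JSON shape below are placeholders; the real Transport for NSW feeds are GTFS-realtime protobuf and require an API key.

```python
# Poll a (hypothetical) real-time lateness feed, log it, and rank routes later.
import time     # used in the polling loop sketched at the bottom
import requests
import pandas as pd

FEED_URL = "https://example.org/realtime/buses"    # stand-in, not a real endpoint

def poll_once(path="lateness_log.csv"):
    rows = requests.get(FEED_URL, timeout=10).json()   # e.g. route, stop, delay_secs, timestamp
    pd.DataFrame(rows).to_csv(path, mode="a", header=False, index=False)

def worst_routes(path="lateness_log.csv"):
    log = pd.read_csv(path, names=["route", "stop", "delay_secs", "timestamp"])
    return log.groupby("route")["delay_secs"].median().sort_values(ascending=False)

# Collect for months, then rank, e.g.:
#   while True: poll_once(); time.sleep(60)
#   print(worst_routes().head(10))
```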

8:00 AM

8:00 AM - 960 mins

Room 1

Deep Learning Workshop

Noon van der Silk

Venture into deep learning with this 2-day workshop that will take you from the mathematical and theoretical foundations to building models and neural networks in TensorFlow. You will apply as you learn, working on exercises throughout the workshop. To enhance learning, a second day is dedicated to applying your new skills in team project work.

This hands-on workshop is ideal for both data science and programming professionals, who are interested in learning the basics of deep learning and embarking on their first project.

8:00 AM

8:00 AM - 960 mins

Room 1

Deep Learning Workshop...continued

Noon van der Silk

Venture into deep learning with this 2-day workshop that will take you from the mathematical and theoretical foundations to building models and neural networks in TensorFlow. You will apply as you learn, working on exercises throughout the workshop. To enhance learning, a second day is dedicated to applying your new skills in team project work.

This hands-on workshop is ideal for both data science and programming professionals, who are interested in learning the basics of deep learning and embarking on their first project.
