Introduction to Apache Amaterasu (Incubating): A CD Framework for your Big Data Pipelines
YOW! Data 2017
In the last few years, the DevOps movement has introduced groundbreaking approaches to the way we manage the lifecycle of software development and deployment. Today, organisations aspire to fully automate the deployment of microservices and web applications with tools such as Chef, Puppet and Ansible. However, the deployment of data-processing pipelines remains a relic from the dark ages of software development.
Processing large-scale data pipelines is the main engineering task of the Big Data era, and it should be treated with the same respect and craftsmanship as any other piece of software. That is why we created Apache Amaterasu (Incubating), an open-source framework that takes care of the specific needs of Big Data applications in the world of continuous delivery.
In this session, we will take a close look at Apache Amaterasu (Incubating), a simple and powerful framework to build and dispense pipelines. Amaterasu aims to help data engineers and data scientists compose, configure, test, package, deploy and execute data pipelines written using multiple tools, languages and frameworks.
We will see what Amaterasu provides today and how it can help existing Big Data applications, and we will demo some of the new features that are coming in the near future.
I'm a software developer, a speaker, an author and an all-around nerd. I currently work as a Principal Consultant at UXC Professional Solutions in the Information Management Practice. I've been developing software as a hobby from a young age, and professionally since 1997. I'm the co-author of the "Developing Windows Azure and Web Services" Microsoft official course (20487) and of Pro Couchbase Server, and have contributed to official Microsoft training materials on Windows Azure, Windows 2008 R2 HPC Server and HDInsight (Hadoop) over the years.