Apache Airflow: Orchestrating Hybrid Workloads in the Cloud


This article describes a hybrid approach we use to manage a data lake, handling heterogeneous workloads with Apache Airflow, Kubernetes, and Apache Spark on Amazon EMR.

At inovex we use Apache Airflow as a scheduling and orchestration tool in a wide range of d