Building and Testing a new Apache Airflow Plugin

Recently, I had the opportunity to add a new EMR on EKS plugin to Apache Airflow. While I’ve been a consumer of Airflow over the years, I had never contributed directly to the project. And weighing in at over half a million lines of code, Airflow is a pretty complex project to wade into. So here’s a guide on how I made a new operator in the AWS provider package. Overview: Before you get started, it’s good to have an understanding of the different components of an Airflow task....

 · 8 min
Example output of Air Quality Data

Build your own Air Quality Monitor with OpenAQ and EMR on EKS

Fire season is fast approaching, and as somebody who spent two weeks last year hunkered down inside with my browser glued to various air quality sites, I wanted to show how to use data from OpenAQ to build your own air quality analysis. With Amazon EMR on EKS, you can now customize and package your own Apache Spark dependencies, and I use that functionality in this post. Overview: OpenAQ maintains a publicly accessible dataset of various air quality metrics that’s updated every half hour....

 · 11 min