Airflow on Kubernetes: Containerizing your Workflows By Michael Hewitt


  1. Airflow on Kubernetes: Containerizing your Workflows By Michael Hewitt

  2. Agenda
     1. Kubernetes Overview
     2. Airflow's integration with Kubernetes
     3. Deployment of Airflow on Kubernetes
     4. Kubernetes Pod Operator and its benefits
     5. DAG Development Transformations
     6. The Future of Airflow on Kubernetes

  3. Kubernetes
     Scalable
     ● Horizontally scaling infrastructure
     ● Automated scaling of containers based on system level metrics
     ● Manual scaling of containers
     ● Components that keep track of application replicas and scale in and out as needed
     Extensible
     ● Supports configuration to schedule containers on certain types of nodes
     ● Supports the use of multiple schedulers at the same time
     ● Dynamic Webhook
     Highly Available
     ● Easily integrate health checks
     ● Self healing containers
     ● Native load balancers to automatically divert container traffic
     ● Automated scaling based on L7 metrics
     Usability
     ● Supports both declarative and imperative configuration
     ● Supports APIs for a plethora of languages
     ● Usable executor for other platforms (Airflow, Gitlab)

  4. The Pod
     ● A Pod is the basic execution unit of a Kubernetes application
     ● Abstraction of a container or group of containers representing a process
     ● Easily expose the containers within pods
     ● Each pod has its own network namespace, making containers within the same pod reachable by localhost
     ● Supports both ephemeral storage and persistent storage that can easily be shared between pods/containers
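To make the pod description above concrete, here is a minimal sketch that builds a Pod manifest as a plain Python dictionary: two containers in one pod, both mounting the same ephemeral `emptyDir` volume. The pod name, container names, images, and mount path are illustrative assumptions, not taken from the talk.

```python
import json


def build_shared_pod(name: str) -> dict:
    """Return a Pod manifest with two containers sharing one emptyDir volume."""
    # Both containers mount the same volume, so files written by one
    # are visible to the other for the lifetime of the pod.
    shared_mount = {"name": "scratch", "mountPath": "/scratch"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # emptyDir is ephemeral storage that is removed with the pod.
            "volumes": [{"name": "scratch", "emptyDir": {}}],
            "containers": [
                {
                    "name": "task",  # hypothetical main container
                    "image": "python:3.9-slim",
                    "command": ["python", "-c", "print('task ran')"],
                    "volumeMounts": [shared_mount],
                },
                {
                    "name": "helper",  # hypothetical second container
                    "image": "busybox:1.36",
                    "command": ["sh", "-c", "sleep 3600"],
                    "volumeMounts": [shared_mount],
                },
            ],
        },
    }


manifest = build_shared_pod("example-task-pod")
print(json.dumps(manifest, indent=2))
```

Because the two containers also share the pod's network namespace, the helper could reach a port opened by the task container on `localhost` without any extra service definition.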

  5. Kubernetes Executor
     [Diagram: a K8 cluster containing an Airflow Scheduler pod, the API Server, and several Airflow Worker pods]

  6. Kubernetes Executor Benefits
     ● Dynamic number of workers, unlike other executors
     ● Avoids wasted resources
     ● Fault tolerance, as tasks are now isolated in pods
     ● Reduced stress on the Airflow Scheduler due to edge-driven triggers in the K8S Watch API

  7. Deploy Airflow with Helm
     ● Package manager for Kubernetes
     ● Deploy and manage multiple manifests as one unit
     ● Golang templating language to templatize manifests
     ● Automate deployment of Airflow with Helm using Terraform
     [Diagram: Non Prod and Prod environments, each with Scheduler, Web Server, and Database pods]

  8. Kubernetes Pod Operator
     [Diagram: three pods labeled Airflow Scheduler, Airflow Worker, and Python Container]

  9. Take Control with Kubernetes
     ● Taints, Tolerations, Node Affinities
     ● Development Portability
     ● Easily expose task interfaces
     ● Sidecar containers for logs
     ● Easily track task system level metrics
     ● Persistent data volumes
     ● Pod security policies
     ● Perpetual task environments

  10. Executor Config
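The slide's own example is not preserved in this transcript. As a hedged illustration only, the sketch below builds a per-task `executor_config` dictionary in the style used by the Kubernetes executor in Airflow 1.10 (a top-level `"KubernetesExecutor"` key with resource and toleration settings). The helper name and the values are hypothetical, and the exact keys should be checked against the Airflow version in use.

```python
def build_executor_config(memory: str, taint_key: str, taint_value: str) -> dict:
    """Return a per-task executor_config dict (Airflow 1.10-style layout, assumed)."""
    return {
        "KubernetesExecutor": {
            # Resource request and limit for the worker pod running this task.
            "request_memory": memory,
            "limit_memory": memory,
            # Allow the worker pod onto nodes tainted with taint_key=taint_value.
            "tolerations": [
                {
                    "key": taint_key,
                    "operator": "Equal",
                    "value": taint_value,
                    "effect": "NoSchedule",
                }
            ],
        }
    }


heavy_task_config = build_executor_config("512Mi", "foo", "bar")
print(heavy_task_config)

# In a DAG file this dict would typically be passed to an operator, e.g.
# PythonOperator(task_id="heavy", python_callable=fn,
#                executor_config=heavy_task_config)
```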

  11. Adapting DAG Development
      ● Airflow configuration with Kubernetes
      ● Kubernetes RBAC
      ● IAM roles/policies
      ● Automate with Terraform
        ○ K8S resources
        ○ IAM role/policies
        ○ Pod Networking policies
        ○ Datadog dashboard for alerts and metrics
      ● Template environments with CI/CD

  12. Taints, Tolerations, and Node Affinities
      [Diagram: a Python pod and a Spark pod, each with its own configuration, scheduled onto Kubernetes nodes. The Spark pod carries Toleration: foo=bar and NodeAffinity: foo=bar; the matching node carries Taint: foo=bar and Label: foo=bar]
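The scheduling rule in this slide can be approximated with a small model: a pod may land on a node only if it tolerates every taint on that node and the node carries every label the pod's affinity requires. This is a deliberately simplified sketch (real Kubernetes taints also have effects and operators, and affinities can be soft); the class names, node names, and the `can_schedule` helper are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)
    taints: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    tolerations: List[Tuple[str, str]] = field(default_factory=list)
    # Simplified "required" node affinity: labels the node must have.
    required_labels: Dict[str, str] = field(default_factory=dict)


def can_schedule(pod: Pod, node: Node) -> bool:
    """True if the pod tolerates all node taints and the node matches its affinity."""
    tolerated = all(taint in pod.tolerations for taint in node.taints)
    affinity_ok = all(
        node.labels.get(key) == value for key, value in pod.required_labels.items()
    )
    return tolerated and affinity_ok


nodes = [
    Node("general"),
    Node("dedicated", labels={"foo": "bar"}, taints=[("foo", "bar")]),
]
pods = [
    Pod("python"),
    Pod("spark", tolerations=[("foo", "bar")], required_labels={"foo": "bar"}),
]

placement = {
    pod.name: [node.name for node in nodes if can_schedule(pod, node)]
    for pod in pods
}
print(placement)
```

The taint keeps the plain Python pod off the dedicated node, while the toleration plus affinity pair both admits the Spark pod there and keeps it away from the general node.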

  13. Abstracting Kubernetes through Webhooks
      ● Some K8S concepts have steep learning curves
      ● SREs typically manage the Kubernetes clusters
      ● Dynamic Webhooks
        ○ Validating webhooks enable extra validation on K8S API calls
        ○ Mutating webhooks enable the automatic addition of properties on K8S resource creation
      ● Developers apply labels (a simple concept); a mutating webhook applies the tolerations and affinities
      ● Force teams to label pods with team name, cost center, etc., with validating webhooks
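A rough sketch of the two webhook roles described above, written as pure functions over an `AdmissionReview` request so the logic can be read without a web server. The response fields (`uid`, `allowed`, `patchType`, base64-encoded JSON Patch) follow the Kubernetes admission API; the required label names, the `workload` label convention, and the function names are hypothetical. Affinities could be patched in the same way as the toleration shown here.

```python
import base64
import json

REQUIRED_LABELS = ("team", "cost-center")  # hypothetical policy


def _review(uid: str, response: dict) -> dict:
    """Wrap a response body in an AdmissionReview envelope."""
    response["uid"] = uid
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


def validate(review: dict) -> dict:
    """Validating webhook: reject pods that lack the required labels."""
    request = review["request"]
    labels = request["object"]["metadata"].get("labels", {})
    missing = [key for key in REQUIRED_LABELS if key not in labels]
    response = {"allowed": not missing}
    if missing:
        response["status"] = {"message": "missing labels: " + ", ".join(missing)}
    return _review(request["uid"], response)


def mutate(review: dict) -> dict:
    """Mutating webhook: turn a simple 'workload' label into a toleration."""
    request = review["request"]
    pod = request["object"]
    workload = pod["metadata"].get("labels", {}).get("workload")
    response = {"allowed": True}
    if workload:
        toleration = {
            "key": "workload",
            "operator": "Equal",
            "value": workload,
            "effect": "NoSchedule",
        }
        if "tolerations" in pod.get("spec", {}):
            patch = [{"op": "add", "path": "/spec/tolerations/-", "value": toleration}]
        else:
            patch = [{"op": "add", "path": "/spec/tolerations", "value": [toleration]}]
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return _review(request["uid"], response)


example = {
    "request": {
        "uid": "abc-123",
        "object": {
            "metadata": {"labels": {"team": "data", "workload": "spark"}},
            "spec": {"containers": []},
        },
    }
}
print(validate(example)["response"])
print(json.loads(base64.b64decode(mutate(example)["response"]["patch"])))
```

In this example the validating function rejects the pod because `cost-center` is missing, while the mutating function returns a patch that adds one toleration.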

  14. What’s Next: Airflow 2.0
      ● Directly apply pod manifests in Kubernetes Pod Operator
      ● Kubernetes Spark Operator
      ● New Official Airflow Docker Image
      ● New Official Airflow Helm Chart
