Pipelines on Pipelines: Creating Agile CI/CD Workflows for Airflow DAGs
By Victor Shafran, CPO at databand.ai
About Me
• Founder and CPO at Databand.ai
• Background in Machine Learning
• Working with data since 2008
In my spare time:
• Proud father of 2 daughters
• Run, Hike
hours to recreate.
Ruined a weekend while discovering and fixing
CI/CD pipeline for my DAGs
Dev Staging Production
CI/CD Pipeline == End to End Automation
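As a rough illustration of end-to-end automation across Dev, Staging and Production, the sketch below runs one check per environment in order and stops at the first failure. The environment order comes from the slide. The `promote` function and the check callables are hypothetical names introduced here, not part of Airflow or Databand.

```python
from typing import Callable, Dict, List

# Environment order taken from the slide; everything else is an assumption.
ENVIRONMENTS: List[str] = ["dev", "staging", "production"]


def promote(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each environment's check in order and stop at the first failure.

    Returns the list of environments whose check passed.
    """
    passed: List[str] = []
    for env in ENVIRONMENTS:
        check = checks.get(env)
        if check is None or not check():
            break  # a missing or failing check blocks further promotion
        passed.append(env)
    return passed


if __name__ == "__main__":
    result = promote({
        "dev": lambda: True,
        "staging": lambda: False,  # simulated failure in staging
        "production": lambda: True,
    })
    print(result)
```

In a real setup each check would trigger the DAG end to end in that environment, which is the point the slide makes.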
(applicable for ..PythonOperator, …)
• Spark Operator
• EmrStep Operator
• Dataproc Operator
• Databricks Operator
business logic?
We want CI/CD - running END TO END!
• Python/Java Dependencies
• Resources
• Multiple Versions
• Custom Resources
Rendered Operator Example
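The original slide's example is not preserved here. As a stand-in, the sketch below shows what rendering a Spark-style operator into a command might look like: it turns a settings dictionary (dependencies, resources) into a `spark-submit` argument list. The flags used (`--master`, `--name`, `--py-files`, `--jars`, `--executor-memory`, `--num-executors`, `--conf`) are standard `spark-submit` options. The function name, the settings keys and the sample values are assumptions made for illustration.

```python
from typing import Any, Dict, List


def render_spark_submit(app: str, settings: Dict[str, Any]) -> List[str]:
    """Build a spark-submit argument list from operator-style settings.

    The settings keys are hypothetical; only the CLI flags are real.
    """
    cmd: List[str] = ["spark-submit"]
    if "master" in settings:
        cmd += ["--master", str(settings["master"])]
    if "name" in settings:
        cmd += ["--name", str(settings["name"])]
    # Dependencies: Python archives and Java jars are comma-separated lists.
    if settings.get("py_files"):
        cmd += ["--py-files", ",".join(settings["py_files"])]
    if settings.get("jars"):
        cmd += ["--jars", ",".join(settings["jars"])]
    # Resources requested for the job.
    if "executor_memory" in settings:
        cmd += ["--executor-memory", str(settings["executor_memory"])]
    if "num_executors" in settings:
        cmd += ["--num-executors", str(settings["num_executors"])]
    # Extra Spark properties, sorted for a stable rendering.
    for key, value in sorted(settings.get("conf", {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    cmd += [str(arg) for arg in settings.get("application_args", [])]
    return cmd


if __name__ == "__main__":
    example = {
        "master": "yarn",
        "name": "ci_234_wordcount",
        "py_files": ["deps.zip"],
        "jars": ["udfs.jar"],
        "executor_memory": "2g",
        "num_executors": 2,
        "conf": {"spark.sql.shuffle.partitions": "8"},
        "application_args": ["--env", "ci_234"],
    }
    print(" ".join(render_spark_submit("wordcount.py", example)))
```

Printing the rendered command in CI makes it easier to see which dependencies and resources a given environment would actually use before the job runs.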
No batteries included!
ci_234 ci_aef
• You want every feature in a separate area
• Sometimes you don't want to start from scratch every time
prod stage ci_ab1 ci_bc ci_ab1 stage
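One possible reading of the environment layout above is that each CI run writes into its own area but can read data that already exists in a shared base environment, so it does not have to start from scratch. The sketch below assumes that reading. The `/data/<env>/<dataset>` path scheme, the `PARENTS` mapping (CI environments falling back to `stage`) and both function names are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical layering: CI environments fall back to "stage" for reads.
PARENTS: Dict[str, Optional[str]] = {
    "prod": None,
    "stage": None,
    "ci_ab1": "stage",
    "ci_bc": "stage",
}


def write_path(env: str, dataset: str) -> str:
    """Writes always go to the environment's own area (isolation)."""
    return f"/data/{env}/{dataset}"


def read_path(env: str, dataset: str, existing: Set[str]) -> Optional[str]:
    """Return the first existing copy, walking from env up to its parents."""
    current: Optional[str] = env
    while current is not None:
        candidate = write_path(current, dataset)
        if candidate in existing:
            return candidate
        current = PARENTS.get(current)
    return None  # nothing available: the dataset has to be rebuilt


if __name__ == "__main__":
    existing = {"/data/stage/raw_events", "/data/ci_ab1/features"}
    print(read_path("ci_ab1", "raw_events", existing))  # reused from stage
    print(read_path("ci_ab1", "features", existing))    # the run's own copy
    print(read_path("ci_bc", "features", existing))     # not built yet
```

Because writes never leave the run's own area, a failed feature branch cannot overwrite stage or prod data under this scheme.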
Benefits:
• What's real CI/CD for data-intensive DAGs
• Effective CI/CD for SparkOperator
• Data Management Layer role in CI/CD process
• Automation of CI/CD: Deployment DAG is a separate lecture
• DAG migration from research to production and vice versa
with Databand by Josh Benamram
Shulman