SLIDE 1
Greg Neiheisel
CTO
SLIDE 2
Astronomer
Streaming data
Data pipelines
Code first ETL
Data Engineering Platform
SLIDE 3
Early Priorities
Quick prototyping
Get data in motion
Ease of scale
SLIDE 4
Astronomer V1
Lambda + API Gateway
CloudWatch for monitoring
Kinesis + Elastic Beanstalk
SLIDE 5
Trouble in paradise
SLIDE 6
Strategic Obstacles
Companies view Amazon as direct competition
Acquisition talks
Open source philosophy
SLIDE 7
Engineering Obstacles
Access to customer data
Need a better tool for ETL
Deeply ingrained in the AWS ecosystem
SLIDE 8
Single Unified Platform
SLIDE 9
DC/OS at Astronomer
Apache Airflow & Spark on Mesos
Marathon (Kubernetes?) replaces Elastic Beanstalk
Foundation for open source DE platform
SLIDE 10
Apache Airflow
SLIDE 11
Airflow on Mesos
Leverage community-contributed Mesos executor
Up and running quickly
Scales to millions of tasks daily
SLIDE 12
Airflow at Astronomer
Behind the scenes to Managed service
Intelligent Redshift loading
Dependency driven tasks
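
As a rough illustration of what "dependency driven tasks" means, the sketch below orders a small pipeline so that every task runs only after the tasks it depends on. It uses only the Python standard library rather than Airflow's own API, and the task names are hypothetical, not taken from the talk.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
# These names are illustrative assumptions, not Astronomer's actual DAG.
dependencies = {
    "extract_events": set(),
    "stage_files": {"extract_events"},
    "load_redshift": {"stage_files"},
    "build_report": {"load_redshift"},
}


def run_order(deps):
    """Return an order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

A scheduler such as Airflow does much more (retries, scheduling, distributed execution), but the core idea is the same: tasks are released for execution only once their upstream tasks have finished.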
SLIDE 13
Not all AWS tools are created equal
SLIDE 14
Kinesis to Kafka
SLIDE 15
Issues with Kinesis
Buggy Kinesis Client Library
Not available everywhere
Unable to tap into the Kafka ecosystem
SLIDE 16
The road to Kafka
Rewriting API and processors in Go
Improve provisioning, monitoring and testing
Run systems in parallel
SLIDE 17
Kong and the inevitable end of API Gateway
SLIDE 18
Kong
Replaces API Gateway
Auth, rate limiting, Lambda invocations for APIs
Backed by Cassandra
SLIDE 19
CloudFormation + Ansible to Terraform
SLIDE 20
Terraform
Infrastructure as code
100% repeatable installs
Ease of scale
SLIDE 21
Rebuilding CloudWatch
SLIDE 22
Prometheus
All nodes monitored out of the box
Write our own exporters
Ease of scale
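
To give a sense of what writing a custom exporter involves, here is a minimal sketch that renders a gauge in the Prometheus text exposition format, which is what an exporter serves for Prometheus to scrape. It is standard-library Python only (a real exporter would more likely use an official client library), the metric and label names are hypothetical, and label-value escaping is left out for brevity.

```python
def render_gauge(name, help_text, samples):
    """Render one gauge in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    Note: label values are not escaped in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        selector = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric: number of events waiting in a pipeline.
text = render_gauge(
    "pipeline_queue_depth",
    "Events waiting to be processed.",
    [({"pipeline": "clickstream"}, 42)],
)
print(text)
```

Serving this text over HTTP at a path such as `/metrics` is the remaining step; that part is omitted here.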
SLIDE 23
ELK
Centralized logging
Aggregated queries across instances
SLIDE 24
KairosDB
Time series events collected via REST
Extremely durable, backed by Cassandra
Rollups must be handled externally
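
Handling rollups externally means some other process has to aggregate raw points before or after they are stored. The sketch below shows one possible approach under stated assumptions: it sums `[timestamp_ms, value]` pairs into one-minute buckets and wraps the result in a name/datapoints/tags structure modeled on the payload shape of KairosDB's REST datapoints API. The metric name, tag, and bucket width are hypothetical, and no HTTP request is made.

```python
from collections import defaultdict


def rollup_sum(datapoints, bucket_ms=60_000):
    """Sum [timestamp_ms, value] pairs into fixed-width time buckets."""
    buckets = defaultdict(float)
    for ts, value in datapoints:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_ms] += value
    return [[start, total] for start, total in sorted(buckets.items())]


# Hypothetical raw events: two in the first minute, two in the second.
raw = [[0, 1.0], [30_000, 2.0], [60_000, 5.0], [119_999, 1.0]]
rolled = rollup_sum(raw)

# Assumed payload shape for posting the rolled-up metric back via REST.
payload = [
    {
        "name": "events.per_minute",   # hypothetical metric name
        "datapoints": rolled,
        "tags": {"source": "api"},     # hypothetical tag
    }
]
print(payload)
```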
SLIDE 25
R&D
Kafka Connect sources/sinks
Ceph or Minio
Istio, Weave, Kubernetes
Druid
SLIDE 26
Astronomer.io
Greg Neiheisel
Twitter: @schniebot
LinkedIn: greg-neiheisel