SLIDE 1 Building dev tools at the right level of abstraction
Ben Davis CTO
@BenCDavis · ben@gatherdata.co
SLIDE 2
The data engineering industry is very fragmented.
SLIDE 3
Gather is a data integration tool for developers. It makes it really easy to build integration pipelines that push and pull data from various SaaS APIs.
SLIDE 4 Gather is a data integration tool for developers. It makes it really easy to build integration pipelines that push and pull data from various SaaS APIs.
Your App
SLIDE 5
But how did we get here?
SLIDE 6 Initial motivating problem: building data pipelines is
SLIDE 7
Data pipelining is conceptual. It breaks down into many use-cases.
SLIDE 8 Data pipelining is conceptual. It breaks down into many use-cases.
Batch Streaming ETL …
SLIDE 9
People will ask “Can I use it for this?” or “Oh, I can use it like this, right?”
SLIDE 10
No framework to answer those questions
SLIDE 11 –K.K Aggarwal
“Abstraction is amplification of the essential and elimination of the irrelevant.”
SLIDE 12
Building companies and products requires choosing a set of abstractions
SLIDE 13
The question is: which use-cases are you abstracting away in your product? How many are there?
SLIDE 14 Inspired by Cheng Lou (Facebook)
SLIDE 15 EC2
Inspired by Cheng Lou (Facebook)
SLIDE 16 EC2 HEROKU
Inspired by Cheng Lou (Facebook)
SLIDE 17 Concrete use-case
Abstraction level
Multiple use-cases No man's land No man's land
SLIDE 18 DATA PIPELINES DATA COLLECTION AND INTEGRATION BATCH PROCESSING SAAS & API INTEGRATION CUSTOMER SERVICE SYNC PAYMENT DATA ETC RECSYS REPORTS DATA PREP CLICKSTREAM IOT SENSORS
SLIDE 19
- No deployment required from the user
- Not writing API adapters and glue code
- Off-the-shelf connectors
- Pre-built authentication
- Not writing tests and worrying about fragile code
Value prop
SLIDE 20
The product should abstract away the complexities of those specific use-cases while maintaining flexibility and expressiveness.
SLIDE 21
Options for the product
SLIDE 22
- UI for specific use-cases
Options for the product
SLIDE 23
- UI for specific use-cases
- Python SDK
Options for the product
SLIDE 24
- UI for specific use-cases
- Python SDK
- Kubernetes-like declarative data flow?
Options for the product
SLIDE 25
Kubernetes is the right inspiration because it operates at the same level of abstraction
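To make "Kubernetes-like declarative data flow" concrete, here is a minimal sketch of the idea: the user declares the flows they want, and the tool works out what to create, update, or delete to get there. Every name in it (`Flow`, `plan`, the source and destination labels) is hypothetical and chosen for illustration; it is not Gather's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One desired data flow (hypothetical shape, for illustration only)."""
    name: str
    source: str
    destination: str
    schedule: str


def plan(desired, running):
    """Compare the declared flows with the running ones and list the actions
    needed to make the running state match the declared state."""
    desired_by_name = {flow.name: flow for flow in desired}
    running_by_name = {flow.name: flow for flow in running}

    actions = []
    for name, flow in desired_by_name.items():
        if name not in running_by_name:
            actions.append(("create", name))
        elif running_by_name[name] != flow:
            actions.append(("update", name))
    for name in running_by_name:
        if name not in desired_by_name:
            actions.append(("delete", name))
    return actions


# The user only declares what should exist, not how to get there.
desired = [
    Flow("payments", "payments_api", "warehouse", "hourly"),
    Flow("support", "support_api", "warehouse", "daily"),
]
running = [
    Flow("support", "support_api", "warehouse", "hourly"),
    Flow("legacy", "old_crm", "warehouse", "daily"),
]

for action, name in plan(desired, running):
    print(action, name)
```

The point of the sketch is the split of responsibilities: the declaration stays small and use-case specific, while deployment and reconciliation are left to the tool.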
SLIDE 26
SLIDE 27
SLIDE 28
- Starting at too high a level of abstraction
- Building the tree is hard
- Building a product that is misaligned with where you’ve positioned yourself on that tree
Conclusion
SLIDE 29 ben@gatherdata.co
TALK TO ME. PLEASE
THANKS FOR LISTENING