Modernizing Data Estates with Presto | Ken Seier, Chief Architect

SLIDE 1

Modernizing Data Estates with Presto

Ken Seier, Chief Architect | Data & AI ken.seier@insight.com

SLIDE 2

Insight fast facts

  • Founded in 1988: Fortune 500 company with long legacy and knowledge serving clients around the globe
  • Global reach: 19 countries
  • Deep portfolio & relationships: 3,500+ hardware, software and cloud partners
  • Engaged workforce: 11,000+ Insight teammates worldwide
  • Broad expertise: 7,500+ sales and service delivery professionals
  • Financial stability: $9B+ in revenue in 2018

SLIDE 3

Presto today

  • Targeted query federation for line-of-business applications or reporting
  • Ad hoc analytics enablement
  • Tech and Retail verticals, with some FinServ

https://db-engines.com/en/ranking_trend/system/Presto

SLIDE 4

Federated queries and data aggregation

Presto doing what we know it's good at, and a little more.

SLIDE 5

Challenge

  • Global technical services company
  • 500,000+ customers
  • 300,000+ events/second
  • End-user investigation tool with cumbersome Java query tier

SLIDE 6

Federated query solution

  • Detail events in Amazon S3
  • Event aggregates in Elasticsearch
  • Starburst Presto query fabric
  • Simplified Java/SQL services
  • Custom insights UX
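
As a minimal sketch of the federated pattern on this slide, the Python below builds a single Presto SQL statement that joins detail events (in S3, reached through a Hive-style catalog) with event aggregates in Elasticsearch. The catalog, schema, table, and column names (`hive.events.detail`, `elasticsearch.default.event_aggregates`, `customer_id`, `event_date`, and so on) are assumptions for illustration and do not come from the deck; only the `catalog.schema.table` addressing is standard Presto.

```python
from datetime import date


def federated_event_query(customer_id: int, day: str) -> str:
    """Build one Presto SQL statement spanning two catalogs.

    All catalog, schema, table, and column names are hypothetical.
    """
    # Validate inputs before interpolating them into SQL text.
    customer = int(customer_id)
    event_day = date.fromisoformat(day)  # raises ValueError on bad input

    return (
        "SELECT d.event_id, d.event_time, a.event_count\n"
        "FROM hive.events.detail AS d\n"
        "JOIN elasticsearch.default.event_aggregates AS a\n"
        "  ON a.customer_id = d.customer_id\n"
        " AND a.event_date = d.event_date\n"
        f"WHERE d.customer_id = {customer}\n"
        f"  AND d.event_date = DATE '{event_day.isoformat()}'"
    )


query = federated_event_query(42, "2019-06-01")
print(query)
```

The point of the sketch is that a Java service tier only needs to hold one SQL string per use case; which store each table lives in is a catalog configuration detail rather than application code.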

SLIDE 7

Pre-aggregation ETL solution

Amazon Elasticsearch

SLIDE 8

Outcomes

  • Rationalized Java query tier to single Presto SQL source
  • Implemented pre-aggregation ETL in same AWS/Java/Presto toolset
  • Elasticsearch queries through Presto over 1 million documents return in <2 seconds

SLIDE 9

Big Data 2.0

Presto is a lighter replacement for aging big SQL tools.

SLIDE 10

Challenge

  • Global software-as-a-service company
  • 15,000,000+ customers
  • Ad-hoc queries over 100 terabytes of cleansed data
  • Aging on-prem big-data-SQL implementation challenged to scale

SLIDE 11

Hive to Presto

  • Before: HiveQL queries against the data lake
  • After: ANSI SQL queries through Starburst Presto against the data lake

SLIDE 12

Outcomes

  • Data-in-place replacement for Hive
  • Migrate from HiveQL to ANSI SQL
  • Many-X concurrency improvement over Hive
  • 10X performance over Spark benchmarks
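
One concrete difference a HiveQL to ANSI SQL migration runs into is identifier quoting: HiveQL quotes identifiers with backticks, while Presto expects ANSI double quotes. The sketch below is a deliberately naive, regex-based rewrite of that single difference. The function name and sample query are hypothetical, it does not parse SQL (a backtick inside a string literal would be rewritten too), and it is not the migration method used in the engagement described here.

```python
import re

# Matches a backtick-quoted identifier such as `order id`.
_BACKTICK_IDENT = re.compile(r"`([^`]+)`")


def backticks_to_ansi(sql: str) -> str:
    """Rewrite `name` as "name", doubling any embedded double quotes."""

    def _requote(match):
        name = match.group(1).replace('"', '""')
        return f'"{name}"'

    return _BACKTICK_IDENT.sub(_requote, sql)


hive_sql = "SELECT `order id`, `date` FROM sales.orders"
presto_sql = backticks_to_ansi(hive_sql)
print(presto_sql)
```

Other dialect differences (functions, type names, array handling) need case-by-case review and are out of scope for this sketch.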

SLIDE 13

Unified Query Plane

Using Presto to simplify and de-risk legacy data management

SLIDE 14

Challenge

  • Global manufacturer/retailer
  • $20,000,000,000+ globally
  • Rich operational ecosystem
  • Aggressively working toward comprehensive stack rationalization

SLIDE 15

Legacy data estate

SLIDE 16

Presto data fabric

SLIDE 17

Outcomes

  • Many, many Presto sources and consumers
  • Supporting data science and line-of-business on isolated clusters
  • Presto abstraction over legacy systems enables table-by-table migrations
  • Row and column level RBAC enabled in Ranger
  • End-to-end automation for registering and managing data definitions: metadata, stats and security
  • Query-grain costbacks enabled with log listener
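
To illustrate what query-grain costbacks could look like downstream of a log listener, the sketch below assumes the listener writes one JSON line per completed query with hypothetical fields `query_id`, `user`, and `cpu_seconds`, and that cost is allocated at a flat, made-up rate per CPU-hour. The record shape and the rate are assumptions for illustration; the deck does not describe the actual log format or pricing model.

```python
import json
from collections import defaultdict


def costback_by_user(log_lines, rate_per_cpu_hour):
    """Sum CPU seconds per user and convert them to a charge.

    Each line is assumed to be a JSON object with hypothetical
    'user' and 'cpu_seconds' fields.
    """
    cpu_seconds = defaultdict(float)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        cpu_seconds[event["user"]] += float(event["cpu_seconds"])
    return {
        user: seconds / 3600.0 * rate_per_cpu_hour
        for user, seconds in cpu_seconds.items()
    }


# Hypothetical sample of listener output.
SAMPLE_LOG = [
    '{"query_id": "q1", "user": "data_science", "cpu_seconds": 7200}',
    '{"query_id": "q2", "user": "finance_reporting", "cpu_seconds": 1800}',
    '{"query_id": "q3", "user": "data_science", "cpu_seconds": 3600}',
]

charges = costback_by_user(SAMPLE_LOG, rate_per_cpu_hour=1.0)
print(charges)
```

Because every query is logged individually, the same records could be grouped by team, source catalog, or cluster instead of by user.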
SLIDE 18

Trends

Where Presto may be headed

SLIDE 19

Presto today

  • Targeted query federation for applications or reporting
  • Ad hoc analytics enablement
  • Tech and Retail verticals, with some FinServ

https://db-engines.com/en/ranking_trend/system/Presto

SLIDE 20

Presto going forward

  • Adoption driven by data science value
  • Drafting with Kubernetes adoption
  • Awareness in new industries
  • Data estate rationalization
  • Blue/green migration abstraction
  • New data tier and estate patterns
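
One way to picture the blue/green migration abstraction is a stable view that consumers query, whose definition is repointed from the legacy ("blue") table to the migrated ("green") table once the new copy is validated. The sketch below only generates the `CREATE OR REPLACE VIEW` statements. All catalog, schema, and table names are hypothetical, and it assumes the view lives in a catalog that supports views; it is an illustration, not the approach documented in the deck.

```python
import re

# Plain three-part names only: catalog.schema.table
_THREE_PART = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){2}")


def repoint_view_sql(view: str, target: str) -> str:
    """Return SQL that points a stable view at the given backing table."""
    for name in (view, target):
        if not _THREE_PART.fullmatch(name):
            raise ValueError(f"expected catalog.schema.table, got {name!r}")
    return f"CREATE OR REPLACE VIEW {view} AS SELECT * FROM {target}"


STABLE_VIEW = "hive.serving.orders"  # hypothetical name consumers query

# Blue: the view initially fronts the legacy system.
blue = repoint_view_sql(STABLE_VIEW, "legacy_dw.sales.orders")
# Green: after validation, the same view fronts the migrated table.
green = repoint_view_sql(STABLE_VIEW, "hive.lake.orders")

print(blue)
print(green)
```

Consumers keep querying `hive.serving.orders` throughout, so each table can be cut over, or rolled back, independently of the others.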

SLIDE 21

Presto going forward: Core data value cases

Historical Reporting

  • Line of business reporting for defined historical period using defined metrics and performance indicators
  • Data warehouse, data mart

Operational Data Store

  • Line of business reporting for making real-time course corrections in day-to-day operations
  • Small, disk-bound or in-memory store

Analytics Discovery

  • Line of business reporting for making real-time course corrections in day-to-day operations
  • Data lake, lab environment

SLIDE 22

Presto going forward: Conceptual data architecture

Three fully decoupled, horizontally scalable, single-tool tiers

  • ANSI SQL data & insight
  • Data directly from source systems
  • Data staged from source systems
  • Direct event and transactional data
  • Insight enrichment
  • Data lake of choice

SLIDE 23

Questions?

Ken Seier, Chief Architect | Data & AI ken.seier@insight.com