Presto Summit NYC 2019 December 11, 2019 Slack handles: @cheolsoo; - - PowerPoint PPT Presentation



SLIDE 1

Presto Summit NYC 2019

December 11, 2019

Slack handles: @cheolsoo; @abhonsule slack-corp.com

SLIDE 2
SLIDE 3

Mission

Make people’s working lives simpler, more pleasant and more productive.

SLIDE 4

Slack

SLIDE 5

215B +270M 700B 250B Logs Daily Messages Daily Records Messages Table

Data Engineering at Slack

Custodian of all data generated within Slack, the product. We provide the infrastructure and tooling that stakeholders need to reliably access product data for user-facing features and for product and business insights.

SLIDE 6

Databooks AB Testing framework BI portal

Presto

Airflow Analytics .ts Sqooper

  • Slack’s AB testing/Experiments framework
  • Tool used by Analysts, Data scientists, Marketing, Sales, Finance
  • BI tool used by Corp/Biztech
  • Batch ingestion system
  • Slack’s internal analytics portal - Product Managers, Engineers, Analysts, Data scientists, Sales, Marketing, Finance
  • DAGs running on ETL scheduling system

Presto at Slack

clog queries

Query client logs

SLIDE 7

Presto at Slack

  • Past: Presto on EMR, single cluster
  • Present: Starburst on EC2, multiple clusters
  • Future: Federated clusters

SLIDE 8

Query success rate

SLIDE 9

Query count

SLIDE 10

Multiple clusters

  • Static load balancing
  • Per cluster config properties
  • Per cluster capacity planning
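
Static load balancing presumably means that each client or workload is pinned to a fixed cluster. A minimal sketch of that idea, assuming a hypothetical routing table keyed by query source; the source names and coordinator URLs are invented for illustration and are not from the talk:

```python
# Hypothetical static routing table: each query source is pinned to one cluster.
# All source names and URLs here are illustrative assumptions.
STATIC_ROUTES = {
    "airflow": "http://presto-etl.example.internal:8889",
    "bi-portal": "http://presto-bi.example.internal:8889",
}
DEFAULT_CLUSTER = "http://presto-adhoc.example.internal:8889"


def route(source: str) -> str:
    """Return the coordinator URL that a given query source is pinned to."""
    return STATIC_ROUTES.get(source, DEFAULT_CLUSTER)


if __name__ == "__main__":
    print(route("airflow"))
    print(route("notebook-user"))
```

Because the mapping is fixed, a busy cluster cannot shed load to an idle one, which is the limitation that dynamic load balancing is meant to address.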

SLIDE 11

Shadow clusters

  • Read-only shadow cluster in parallel
  • Useful for testing config changes or version upgrades
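
One reading of "read-only" is that only statements without side effects are replayed on the shadow cluster. The sketch below illustrates such a filter under that assumption; the keyword list and function names are hypothetical, and the check is a conservative heuristic rather than a SQL parser:

```python
import re

# Statement keywords treated as read-only. EXPLAIN is deliberately left out
# because EXPLAIN ANALYZE executes the statement it wraps.
READ_ONLY_KEYWORDS = {"SELECT", "WITH", "SHOW", "DESCRIBE", "VALUES"}


def is_read_only(sql: str) -> bool:
    """Guess whether a statement can be replayed without side effects."""
    # Skip leading whitespace and opening parentheses, then take the first word.
    match = re.match(r"[\s(]*([A-Za-z]+)", sql)
    return bool(match) and match.group(1).upper() in READ_ONLY_KEYWORDS


def shadow_candidates(queries):
    """Keep only the statements that look safe to mirror to a shadow cluster."""
    return [q for q in queries if is_read_only(q)]
```

Anything the heuristic does not recognize is excluded, so the shadow cluster may see less traffic than production but should not receive writes.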

SLIDE 12

Terraform module

  • Provision a cluster with 25 lines of code
  • ASG (Auto Scaling group), optionally with spot instances
  • Dedicated HMS (Hive Metastore) per cluster

SLIDE 13

Resource groups

  • Per cluster resource groups config
  • Per group scheduling policies config
  • Fair (ad-hoc) vs weighted_fair (etl)
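
To make the fair versus weighted_fair split concrete, here is a sketch that builds a Presto resource groups configuration as a Python dict and serializes it to JSON. Only the two scheduling policies come from the slide; the group names, limits, weights and selector patterns are assumptions chosen for illustration:

```python
import json


def build_resource_groups() -> dict:
    """Return an illustrative resource groups config (all values are assumptions)."""
    return {
        "rootGroups": [
            {
                # Ad-hoc queries: every queued query is treated equally.
                "name": "adhoc",
                "softMemoryLimit": "40%",
                "hardConcurrencyLimit": 20,
                "maxQueued": 200,
                "schedulingPolicy": "fair",
            },
            {
                # ETL queries: sub-groups share capacity according to their weight.
                "name": "etl",
                "softMemoryLimit": "80%",
                "hardConcurrencyLimit": 50,
                "maxQueued": 1000,
                "schedulingPolicy": "weighted_fair",
                "subGroups": [
                    {"name": "critical", "softMemoryLimit": "60%",
                     "hardConcurrencyLimit": 30, "maxQueued": 500,
                     "schedulingWeight": 3},
                    {"name": "backfill", "softMemoryLimit": "20%",
                     "hardConcurrencyLimit": 20, "maxQueued": 500,
                     "schedulingWeight": 1},
                ],
            },
        ],
        # Selectors are evaluated in order; the last one is a catch-all.
        "selectors": [
            {"source": "airflow-critical.*", "group": "etl.critical"},
            {"source": "airflow.*", "group": "etl.backfill"},
            {"group": "adhoc"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_resource_groups(), indent=2))
```

Keeping one such file per cluster matches the per-cluster configuration described on the slide, since an ETL cluster and an ad-hoc cluster would want different limits.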

SLIDE 14

JMX exporter

JVM:

  -javaagent:/usr/local/jmx_exporter/jmx_exporter.jar=7071:/usr/local/jmx_exporter/exporter.yml

Prometheus:

  self.consul_job('presto', datacenters=[env + '-us-east-1-dw1'], services=['presto'])

SLIDE 15

Grafana dashboard

SLIDE 16

Autoscaling

Graceful decommission:

  curl -XPUT localhost:8889/v1/info/state -d '"SHUTTING_DOWN"' -H "Content-type: application/json"

Chef role:

  "auto_scaling_group": { "prepare_for_termination_cmd": "<cmd>" }
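
The graceful decommission call on this slide can also be issued from Python. The sketch below uses only the standard library and the endpoint and port shown on the slide; the function names are hypothetical. The body is encoded as a JSON string because the request declares a JSON content type:

```python
import json
import urllib.request


def build_shutdown_request(host: str = "localhost", port: int = 8889) -> urllib.request.Request:
    """Build the PUT request that asks a Presto worker to drain and shut down."""
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/info/state",
        data=json.dumps("SHUTTING_DOWN").encode("utf-8"),  # body is "SHUTTING_DOWN" with quotes
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def request_graceful_shutdown(host: str = "localhost", port: int = 8889, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_shutdown_request(host, port), timeout=timeout) as resp:
        return resp.status
```

A termination hook such as the `prepare_for_termination_cmd` placeholder could invoke a script like this so that a worker finishes its running tasks before the instance is removed.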

SLIDE 17

Federated clusters

  • Dynamic load balancing
  • High availability
  • Minimize the impact of rogue queries
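
A rough sketch of what dynamic load balancing with high availability could look like: route each query to the healthy cluster with the fewest running queries. The `Cluster` type, its fields and the selection rule are assumptions for illustration; the talk does not describe the actual routing logic:

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    healthy: bool
    running_queries: int


def pick_cluster(clusters):
    """Return the healthy cluster with the fewest running queries."""
    candidates = [c for c in clusters if c.healthy]
    if not candidates:
        raise RuntimeError("no healthy Presto cluster available")
    return min(candidates, key=lambda c: c.running_queries)


if __name__ == "__main__":
    fleet = [
        Cluster("presto-a", healthy=True, running_queries=12),
        Cluster("presto-b", healthy=False, running_queries=0),
        Cluster("presto-c", healthy=True, running_queries=3),
    ]
    print(pick_cluster(fleet).name)
```

Skipping unhealthy clusters gives failover, and steering new queries away from a loaded cluster limits how much a single rogue query can slow down everyone else.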

SLIDE 18

Q&A

Slack handles: @cheolsoo; @abhonsule slack-corp.com