CLARINET: WAN-Aware Optimization for Analytics Queries



slide-1
SLIDE 1

CLARINET: WAN-Aware Optimization for Analytics Queries

Presented By Robert Claus

slide-2
SLIDE 2

Agenda

1. The Problem
2. Clarinet
3. Optimizing WAN Queries
4. Results

slide-3
SLIDE 3

Agenda

1. The Problem
2. Clarinet
3. Optimizing WAN Queries
4. Results

slide-4
SLIDE 4

Low Application Latency Requires Localized Servers

• Servers must be close to clients to keep latency low.
• Wide Area Networks (WANs) are therefore necessary.
• Collecting data into a central datastore for analytics is costly and slow.

slide-5
SLIDE 5

Geode Focused On Execution

Previous work focused on executing queries smartly:
• Caching and sending deltas
• Choosing efficient distributed join algorithms
• Minimizing bandwidth rather than optimizing performance
• Allowing servers to adjust their sub-query execution plans

slide-6
SLIDE 6

Wide Area Networks Are Heterogeneous

Sites may have different data available. Links vary by 20x in latency. Link properties are relatively constant. Bandwidth is finite.

slide-7
SLIDE 7

Example Query Planned Sub-optimally

[Figure: example query plan. Labels: Hash Join, Results, Select, Results]

slide-8
SLIDE 8

Central Planning Is Necessary

Execution plans limit flexibility during execution. Need to consider the network before the execution plan.

slide-9
SLIDE 9

Agenda

1. The Problem
2. Clarinet
3. Optimizing WAN Queries
4. Results

slide-10
SLIDE 10

Clarinet Focuses on Planning

• Clarinet adds network considerations to logical query plan optimization.
• Allows global optimization across queries.
• Introduces optimizations that are not possible at the execution stage.
• Optimizes execution time rather than resource usage.

slide-11
SLIDE 11

Combining Optimization and Scheduling

slide-12
SLIDE 12

Agenda

1. The Problem
2. Clarinet
3. Optimizing WAN Queries
4. Results

slide-13
SLIDE 13

Optimizing WAN Queries Is Hard

• There are too many options to find an absolute optimum:
    • How to break queries into sub-queries
    • Where each sub-query will run
    • How each sub-query will run
• Network properties are a shared resource across all queries.

slide-14
SLIDE 14

Heuristic Optimization Algorithm

1. Assign where tasks run first:

a. Place tasks with no dependencies (mappers) where their data is.
b. Choose where dependent tasks (reducers) run based on network capacity.
    i. Also consider simply placing all reducers on the node with the most mappers.

2. Estimate how long each DAG should take:

a. Insert “shuffle” nodes into the DAG wherever data is moved over the network.
    i. Network properties
    ii. Currently running tasks
b. Calculate the total time the DAG will take using a linear program (LP).
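Step 1 of the heuristic can be sketched roughly as follows. This is a minimal illustration, not Clarinet's implementation: the function names, the site assignments, the link bandwidths, and the assumption that inbound transfers run in parallel are all hypothetical.

```python
from collections import Counter


def place_mappers(input_site):
    """Step 1a: run each dependency-free task at the site holding its input."""
    return dict(input_site)


def best_reducer_site(inputs, sites, gbps):
    """Step 1b: pick the site whose slowest inbound transfer finishes first.

    inputs: list of (source_site, gigabytes) produced by upstream tasks.
    gbps:   dict mapping (source_site, destination_site) to link bandwidth.
    Assumes inbound transfers run in parallel (a simplification).
    """
    def slowest_transfer(site):
        times = [gb * 8 / gbps[(src, site)] for src, gb in inputs if src != site]
        return max(times, default=0.0)

    return min(sites, key=slowest_transfer)


def site_with_most_mappers(mapper_sites):
    """Step 1b.i: the alternative of co-locating reducers with most mappers."""
    return Counter(mapper_sites.values()).most_common(1)[0][0]


# Hypothetical three-site example; the site assignments and numbers are made up.
mappers = place_mappers({"scan_ss": "DC1", "scan_ws": "DC1", "scan_cs": "DC2"})
links = {("DC1", "DC2"): 40, ("DC2", "DC1"): 80, ("DC1", "DC3"): 10,
         ("DC3", "DC1"): 100, ("DC2", "DC3"): 10, ("DC3", "DC2"): 40}
join_site = best_reducer_site([("DC1", 100), ("DC2", 50)], ["DC1", "DC2", "DC3"], links)
print(join_site, site_with_most_mappers(mappers))  # prints: DC1 DC1
```

Here both strategies agree on DC1; when they differ, the planner would compare the estimated DAG lengths of the two placements.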

slide-15
SLIDE 15

Example Query Planning

[Figure: example query plan tree. Operators: Hash Join, Broadcast Join, three Select A=1 filters, and scans of SS, WS, and CS]

slide-16
SLIDE 16

Assign Mappers

[Figure: the example query plan (Hash Join, Broadcast Join, three Select A=1 filters, scans of SS, WS, and CS) with its mapper tasks assigned to DC1, DC2, and DC3]

slide-17
SLIDE 17

Compress Compute Operators

[Figure: query plan with each site's operators collapsed into DC1 Work, DC2 Work, and DC3 Work, plus the Hash Join and Broadcast Join]

slide-18
SLIDE 18

Compress Compute Operators

On what server do these operators take place?

[Figure: query plan with each site's operators collapsed into DC1 Work, DC2 Work, and DC3 Work, plus the Hash Join and Broadcast Join]

slide-19
SLIDE 19

Compress Compute Operators

On what server do these operators take place?

[Figure: DC1 Work, DC2 Work, and DC3 Work with the Hash Join and Broadcast Join; two alternative placements are compared ("OR"). Annotations: 200 GB, 80 Gbps, 100 Gbps to DC1, 40 Gbps to DC2, 200 GB, 80 Gbps]

slide-20
SLIDE 20

Compress Compute Operators

On what server do these operators take place?

[Figure: DC1 Work, DC2 Work, and DC3 Work with the Hash Join and Broadcast Join; two alternative placements are compared ("OR"). Annotations: 200 GB, 80 Gbps, 100 Gbps to DC1, 40 Gbps to DC2, 200 GB, 80 Gbps]

slide-21
SLIDE 21

Shuffle Operators

[Figure: a Shuffle Operator connecting Data on Server 1 and Data on Server 2 to an Operation on Server 1 and an Operation on Server 2]

This operation’s cost can be estimated from the volume of data and network bandwidth.
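As a rough worked example of that estimate, transfer time is data volume divided by link bandwidth. The helper name below is hypothetical, the sketch assumes decimal gigabytes and a link used by nothing else, and the sample figures are illustrative rather than a reading of any particular plan.

```python
def shuffle_seconds(data_gigabytes, link_gbps):
    """Estimate shuffle time: volume (converted to gigabits) over bandwidth."""
    if link_gbps <= 0:
        raise ValueError("link bandwidth must be positive")
    return data_gigabytes * 8 / link_gbps


print(shuffle_seconds(200, 80))  # 20.0
print(shuffle_seconds(200, 40))  # 40.0
```

Halving the bandwidth doubles the estimate, which is why the choice of destination site matters when links are heterogeneous.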

slide-22
SLIDE 22

Introduce “Shuffle” Operators

[Figure: the compressed plan with shuffle operators inserted. Labels: DC1 Work, DC2 Work, DC3 Work, Hash Join, 80 Gbps, Broadcast Join, 100 Gbps]

slide-23
SLIDE 23

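Compute Cost Estimate

[Figure: the plan with shuffle operators, annotated with time estimates. Labels: DC1 Work, DC2 Work, DC3 Work, Hash Join, 80 Gbps, Broadcast Join, 100 Gbps; times: 120s, 60s, 60s, 180s, 120s, 120s, 60s]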
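Once every compute and shuffle node has a time estimate, the length of the plan can be approximated as the longest path through the DAG. The slides state that Clarinet computes this with an LP; the sketch below swaps in a plain longest-path calculation that ignores contention from other running tasks, and its node names and durations are hypothetical.

```python
from functools import lru_cache


def dag_length(durations, deps):
    """Longest path: each node finishes after its slowest dependency plus itself.

    durations: dict mapping node name to estimated seconds.
    deps:      dict mapping node name to the nodes it waits for.
    """
    @lru_cache(maxsize=None)
    def finish(node):
        upstream = max((finish(d) for d in deps.get(node, ())), default=0.0)
        return upstream + durations[node]

    return max(finish(node) for node in durations)


# Hypothetical plan: DC2's output is shuffled to DC1, where the join runs.
durations = {"dc1_work": 60, "dc2_work": 120, "shuffle_dc2_to_dc1": 60, "hash_join": 120}
deps = {"shuffle_dc2_to_dc1": ["dc2_work"], "hash_join": ["dc1_work", "shuffle_dc2_to_dc1"]}
print(dag_length(durations, deps))  # 300.0
```

Comparing this figure across candidate placements gives a simple way to rank plans before execution.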

slide-24
SLIDE 24

Dynamically Scheduling Resources

• Allow scheduling tasks from any of the next k queries if resources are available.
• This uses available resources efficiently.
• k must be tuned to avoid over-scheduling tasks with no dependencies.
• Queries are selected based on relative deadline proximity.
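A minimal sketch of this k-query window, assuming "deadline proximity" means nearest deadline first and that each free resource is a generic slot. The function name, the dictionary layout, and the sample queries are hypothetical.

```python
def pick_runnable_tasks(queries, k, free_slots):
    """Fill free slots with ready tasks from the k queries closest to deadline.

    queries: list of dicts with 'name', 'deadline', and 'ready_tasks' keys.
    Returns (query_name, task) pairs in the order they would be launched.
    """
    window = sorted(queries, key=lambda q: q["deadline"])[:k]
    chosen = []
    for query in window:
        for task in query["ready_tasks"]:
            if len(chosen) == free_slots:
                return chosen
            chosen.append((query["name"], task))
    return chosen


queries = [
    {"name": "q1", "deadline": 30, "ready_tasks": ["a", "b"]},
    {"name": "q2", "deadline": 10, "ready_tasks": ["c"]},
    {"name": "q3", "deadline": 20, "ready_tasks": ["d", "e"]},
]
print(pick_runnable_tasks(queries, k=2, free_slots=2))  # [('q2', 'c'), ('q3', 'd')]
```

A larger k keeps more slots busy but lets dependency-free tasks from later queries crowd out earlier ones, which is the tuning trade-off noted above.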

slide-25
SLIDE 25

Agenda

1. The Problem
2. Clarinet
3. Optimizing WAN Queries
4. Results

slide-26
SLIDE 26

Running Time Improved

slide-27
SLIDE 27

Network Usage Improved

slide-28
SLIDE 28

Other Performance Features

• Multi-Query Optimization: 60% of queries run in batches ended up with different plans.
• Resource Fragmentation: network links sit idle less than 3% of the time.
• Optimization Time: approximately 10 seconds.

slide-29
SLIDE 29

Questions?