Benchmarking Elastic Cloud Big Data Services under SLA Constraints - PowerPoint PPT Presentation



SLIDE 1

Benchmarking Elastic Cloud Big Data Services under SLA Constraints

Nicolas Poggi, Victor Cuevas-Vicenttín, David Carrera, Josep Lluis Berral, Thomas Fenech, Gonzalo Gomez, Davide Brini, Alejandro Montero, Umar Farooq Minhas, Jose A. Blakeley, Donald Kossmann, Raghu Ramakrishnan and Clemens Szyperski.

TPCTC - August 2019

SLIDE 2

Outline

  • 1. Intro to TPCx-BB
  • a. Limitations for cloud systems
  • b. Contributions
  • 2. Realistic workload generation
  • a. Production datasets
  • b. Job arrival rates
  • 3. Elasticity Test
  • a. Current metric
  • b. SLA-based addition
  • 4. Experimental evaluation
  • a. Elasticity Test
  • b. Load, Power, Throughput tests
  • c. Metric evaluation
  • 5. Conclusions
  • a. Future directions


SLIDE 3

Benchmarking and TPCx-BB

  • Benchmarks capture the solution to a problem and guide decisions.
  • Widely used in development, configuration, and testing.
  • TPCx-BB (BigBench) is the first standardized big data benchmark
  • Collaboration between industry and academia
  • Follows the retailer model of TPC-DS
  • Adds:
  • Semi and unstructured data
  • SQL, UDF, ML, and NLP queries

Retailer data model

SLIDE 4

TPCx-BB benchmark workflow

  • Similar to previous TPC database benchmarks:
  • Load Test (TLD):
  • Generates the DB
  • imports raw data, metastore, stats, columnar
  • Power Test (TPT)
  • Runs queries sequentially
  • Throughput Test (TTT)
  • Runs queries concurrently
  • Includes a data refresh stage
  • Produces a final performance metric
  • BB queries per minute

[Workflow diagram: Load data → DB @ SF; Power Test: sequential run of q1 … q30; Throughput Test: User1 (q15 q21 … q16), User2 (q12 q18 … q2), …, UserN run concurrently → final Metric]

SLIDE 5

Limitations of the concurrency test

Drawback 1:

  • Constant concurrency: all workloads run at the same scale

Drawback 2:

  • Does not consider QoS (isolation)
  • Query time degradation is not obvious from the final metric
  • We found poor scalability under concurrency in BB [1]

[Diagram: concurrent streams Stream1 (q15 q21 … q16), Stream2 (q12 q18 … q2), Stream3 (q16 q30 … q19)]

[1] Characterizing BigBench queries, Hive, and Spark in multi-cloud environments. TPCTC'17. [Chart: Q4 from 10 to 100GB, over 15X slower]
SLIDE 6

Proposal and contributions

  • 1. Build a realistic big data workload generator
  • Based on production workloads
  • 2. Measure QoS in the form of per-query SLAs
  • Apply the results in a new metric
  • With minimal parameters
  • 3. Extend TPCx-BB with a new concurrency test and metric
  • Implement a driver and evaluate differences
SLIDE 7

Realistic workload generation

SLIDE 8

Analyzing production big data workloads

  • Cosmos cluster operated within Microsoft
  • Sample of 350,000 job submissions
  • Over a month of data in 2017
  • Objectives:
  • 1. Model job submission patterns
  • 2. Workload characterization

[Chart: job submissions over time, with peaks and valleys]

SLIDE 9-10

Modeling arrival rates

  • Use a Hidden Markov Model (HMM) to model the temporal pattern in the workload
  • Transition probabilities between a finite number of states
  • The HMM allows scaling the workload
  • Fluctuations are captured by 4 states and the transitions between them

[Chart: job submissions over time, with peaks and valleys]
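A minimal sketch of such an arrival-rate model follows. The transition matrix and per-state rates below are illustrative placeholders, not the parameters fitted to the Cosmos trace:

```python
import random

# Illustrative 4-state model of job arrival rates (placeholder values,
# NOT the parameters fitted to the production trace).
# States can be read as: valley, low, high, peak.
TRANSITIONS = {
    0: [0.70, 0.20, 0.08, 0.02],
    1: [0.20, 0.60, 0.15, 0.05],
    2: [0.05, 0.20, 0.60, 0.15],
    3: [0.02, 0.08, 0.30, 0.60],
}
RATES = [1, 4, 10, 25]  # mean job submissions per interval in each state

def simulate_arrivals(n_intervals, seed=42, scale=1.0):
    """Sample per-interval submission counts; `scale` scales the workload
    up or down, which is what the HMM approach allows."""
    rng = random.Random(seed)
    state = 0
    counts = []
    for _ in range(n_intervals):
        # move to the next hidden state, then emit a job count
        state = rng.choices(range(4), weights=TRANSITIONS[state])[0]
        lam = RATES[state] * scale
        # Poisson draw via Knuth's multiplication method
        threshold, k, p = 2.718281828 ** -lam, 0, 1.0
        while p > threshold:
            k += 1
            p *= rng.random()
        counts.append(k - 1)
    return counts

print(simulate_arrivals(10))
```

Peaks and valleys emerge from dwelling in the high- and low-rate states, and a single `scale` knob resizes the whole workload.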

SLIDE 11

Job input data size

  • No general temporal pattern was found
  • A cumulative distribution is sufficient for modeling SF
  • The CDF is used to generate random variates mapped to SF
  • 1, 10, 100, 1000 GB
  • Studied further in [2]
  • Findings:
  • 55% < 1GB
  • 90% < 1TB

[Chart: CDF of the job's input data size]

[2] Big Data Management Systems performance analysis using Aloja and BigBench. Master thesis
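The CDF-to-SF mapping can be sketched as inverse-CDF sampling. The 55% and 90% breakpoints come from the findings above; the intermediate split at 10 GB is an illustrative assumption:

```python
import random

# Empirical CDF mapping a uniform variate to a TPCx-BB scale factor.
# The 55% (< 1 GB) and 90% (< 1 TB) breakpoints are from the trace
# analysis; the intermediate point at 10 GB is an assumption.
SF_CDF = [
    (0.55, 1),     # 55% of jobs: input under ~1 GB  -> SF 1
    (0.75, 10),    # assumed intermediate point
    (0.90, 100),   # 90% of jobs: input under ~1 TB
    (1.00, 1000),
]

def sample_scale_factor(rng):
    """Inverse-CDF sampling: draw a random variate and map it to an SF."""
    u = rng.random()
    for cum_p, sf in SF_CDF:
        if u <= cum_p:
            return sf
    return SF_CDF[-1][1]

rng = random.Random(7)
print([sample_scale_factor(rng) for _ in range(8)])
```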

SLIDE 12

Elasticity Test

SLIDE 13

Methodology for generating workloads

  • 1. Set scale (max concurrent submissions)
  • Defaults to n
  • Total queries = n * number of benchmark queries
  • 2. Generate model (queries per interval)
  • Assign queries to each batch randomly
  • Query repetition avoided within a batch
  • Multiple scale factors can be set
  • Include all standard smaller SF
  • 3. Define granularity
  • Set time between batches
  • Defaults to 60s.

SLIDE 14

Methodology for generating workloads (example)

Elasticity Test sequence (time intervals, # queries / batch):

  t1: q17
  t2: q7
  t3: q15 q21
  t4: q6 q9 q14
  t5: q9 q14
  t6: q11 q22 q21
  t7: q16 q15
  t8: q24
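The three steps can be sketched as follows. This is an illustrative driver, not the authors' implementation; the per-interval arrival counts would come from the HMM model:

```python
import random

QUERIES = tuple(f"q{i}" for i in range(1, 31))  # the 30 TPCx-BB queries

def generate_elasticity_workload(arrivals, queries=QUERIES,
                                 interval_s=60, seed=1):
    """Build an Elasticity Test sequence: one batch per time interval,
    batch sizes taken from the arrival model, queries assigned randomly,
    with no query repeated within a batch (random.sample draws without
    replacement). `interval_s` is the granularity, defaulting to 60s."""
    rng = random.Random(seed)
    schedule = []
    for i, n_jobs in enumerate(arrivals):
        batch = rng.sample(queries, min(n_jobs, len(queries)))
        schedule.append({"t": i * interval_s, "queries": batch})
    return schedule

for batch in generate_elasticity_workload([1, 2, 3]):
    print(batch)
```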

SLIDE 15-17

New SLA-aware benchmark metric

  • Query-specific SLAs on concurrency
  • Sets a limit for query completion time
  • Measures:
  • Number of misses
  • Distance to SLA
  • Indirectly, isolation and dependencies
  • Currently defined ad-hoc:
  • Uses Power Test times for the SUT(s)
  • Adds a 25% margin tolerance
  • Benefits:
  • Works on all SF and is future-proof to new technology

Example: q1 took 38s. in isolation, so the SLA for q1 = 47.5s.

Elasticity Test sequence (# queries / batch over time, with SLA distance):

  t1: q17
  t2: q7
  t3: q15 q21
  t4: q6 q9 q14
  t5: q9 q14
  t6: q11 q22 q21
  t7: q16 q15
  t8: q24
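The ad-hoc SLA definition above (Power Test time in isolation plus a 25% margin) can be expressed directly:

```python
def derive_slas(power_test_times, margin=0.25):
    """Per-query SLA: the query's Power Test (isolation) time plus a 25%
    tolerance margin, matching the example above (q1: 38s -> 47.5s)."""
    return {q: t * (1.0 + margin) for q, t in power_test_times.items()}

slas = derive_slas({"q1": 38.0, "q2": 120.0})
print(slas["q1"])  # 47.5
```

Because the SLAs are derived from the SUT's own Power Test run, the definition works at any scale factor and adapts to future systems.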

SLIDE 18-22

Current TPCx-BB performance metric

[Equation walkthrough of the BBQpm@SF metric, annotating the scale factor and the total number of queries]
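The metric equation itself was an image on these slides and did not survive extraction. As defined in the TPCx-BB specification, the current metric has the form (sketched here for reference; consult the specification for the normative definition):

```latex
% BBQpm@SF per the TPCx-BB specification:
%   M    = total number of queries (30)
%   n    = number of concurrent streams in the Throughput Test
%   Q(i) = elapsed time of query i in the Power Test
\[
  BBQpm@SF = \frac{SF \cdot 60 \cdot M}{T_{LD} + \sqrt{T_{PT} \cdot T_{TT}}}
\]
\[
  T_{LD} = 0.1 \cdot T_{Load}, \qquad
  T_{PT} = M \cdot \sqrt[M]{\prod_{i=1}^{M} Q(i)}, \qquad
  T_{TT} = \frac{T_{Tput}}{n}
\]
```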

SLIDE 23-28

New SLA-aware benchmark metric: BB++Qpm

[Equation walkthrough of the BB++Qpm metric, annotating: the interval between each batch of queries, the SLA distance, the SLA factor, and the total execution time of the Elasticity Test]

SLIDE 29-32

SLA distance

  • Distance between the actual execution time and the specified SLA
  • Queries that complete within their SLA do not contribute to the sum
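A minimal sketch of the SLA-distance term as described: only queries that exceed their SLA contribute (any normalization applied in the full BB++ formula is not reproduced here):

```python
def sla_distance(exec_times, slas):
    """Total distance past the SLA: queries that finish within their SLA
    contribute zero to the sum, per the definition above."""
    return sum(max(0.0, exec_times[q] - slas[q]) for q in exec_times)

# q1 meets its SLA (contributes 0); q2 misses its 50s SLA by 10s
print(sla_distance({"q1": 40.0, "q2": 60.0},
                   {"q1": 47.5, "q2": 50.0}))  # 10.0
```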
SLIDE 33-35

SLA factor

  • < 1 when less than 25% of the queries fail their SLA; > 1 when more than 25% of the queries fail their SLA
  • Based on the number of queries that fail to meet their SLA
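One plausible reading of the SLA factor is the miss ratio normalized by the 25% tolerance, which crosses 1 exactly at the stated threshold. The exact functional form on the slide was an image, so this is an illustrative assumption:

```python
def sla_factor(exec_times, slas, tolerance=0.25):
    """Miss ratio normalized by the 25% tolerance: below 1 when fewer
    than 25% of queries miss their SLA, above 1 when more than 25% do.
    (Illustrative form; the slide's exact formula was lost.)"""
    misses = sum(1 for q in exec_times if exec_times[q] > slas[q])
    return (misses / len(exec_times)) / tolerance

times = {"q1": 40.0, "q2": 60.0, "q3": 30.0, "q4": 20.0}
slas = {"q1": 47.5, "q2": 50.0, "q3": 35.0, "q4": 25.0}
print(sla_factor(times, slas))  # exactly 25% miss -> 1.0
```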

SLIDE 36

Experimental evaluation

SLIDE 37

Experimental evaluation

  • Experiments performed on Apache Hive (2.2/2.3) and Spark (2.1/2.2)
  • Benchmark runs limited to the 14 SQL queries of TPCx-BB
  • Due to errors and scalability limitations
  • Using a fixed scale factor
  • Total 512 cores and 2TB of RAM
  • 32 workers: 16 vcpus and 64GB RAM
  • Ran on 3 major cloud providers using block storage
  • Results anonymized
  • (Only results for Provider 1 at 10TB presented)

SLIDE 38

Elasticity Test at 10TB and 2 streams

[Chart: Provider A: Hive]

SLIDE 39

Elasticity Test at 10TB and 2 streams

[Charts: Provider A: Hive and Provider A: Spark]

SLIDE 40

Complete TPCx-BB test times at 10TB

                         Provider A: Hive    Provider A: Spark
  Load time (s)                 5,124                5,124
  Power Time (s)                5,036                5,520
  Throughput Time (s)          12,878                6,496
  Elasticity Time (s)           7,084                6,603
  Total Time (s)               30,122               23,743

SLIDE 41-42

Comparison of the two scores at 10TB

                       BBQpm (old)    BB++Qpm (new)
  Provider A: Hive         1,352              295
  Provider A: Spark        1,767            1,286

  • Hive gets a 4.3x lower score with the new metric
  • Spark also gets a lower score (30% difference)

SLIDE 43

Summary and future directions

SLIDE 44

Summary

  • The throughput test in TPC DB benchmarks provides limited signal
  • Closed-loop system (constant load)
  • Does not consider temporal patterns
  • Limited test of load balancers and schedulers (no queueing)
  • Modeling a real-world big data cluster we have produced:
  • A workload generator with job arrival rates
  • Multi-data-scales test
  • Extended TPCx-BB with the Elasticity Test
  • Incorporating SLAs and proposing a new metric
  • Evaluated its applicability to cloud big data systems
  • And how scores differ from the current metric


SLIDE 45

Conclusions and future work

  • The Elasticity Test considers aspects crucial for the cloud
  • Dynamic workloads in accordance with real-world behavior
  • QoS at the query level, or isolation
  • The ET can improve the development of elastic cloud systems
  • By rewarding systems that can keep QoS under concurrency
  • While saving costs in periods of low intensity

Future directions

  • Test elastic DBaaS / QaaS under concurrency
  • Specification of SLAs needs to be studied further
  • Work with this community and gather feedback and next steps
SLIDE 46

Thanks, questions?

Follow up / feedback: Npoggi@ac.upc.edu

Benchmarking Elastic Cloud Big Data Services under SLA Constraints

TPCTC - August 2019

SLIDE 47

Extra slides

SLIDE 48

Elasticity Test at 1TB Hive: Prov A and B

SLA tester (sample)

SLIDE 49

Sample total queries and arrivals

Workload parameters:

  • 10 TB scale factor
  • 2 streams of 14 SQL queries
  • total of 28 queries
  • λbatch = 240 sec (4 min)
SLIDE 50

Experiments at 100GB with 8-streams (112 total queries)

[Charts: a fast system vs. a slow system, the latter showing queueing and degraded performance]