Semantics-Aware Prediction for Analytic Queries in MapReduce Environment
Weikuan Yu, Zhuo Liu, Xiaoning Ding
Florida State University, Auburn University, New Jersey Institute of Technology
P2S2: S2
Background
■ MapReduce is a popular data-centric programming model.
■ Hive and Pig are popular data warehouse systems.
  ➢ More than 40% of Hadoop jobs at Yahoo! were Pig programs back in 2009, more with Hive now.
  ➢ At Facebook, 95% of MR jobs are generated by Hive.
■ In Hive, each SQL query is compiled and translated into a DAG (Directed Acyclic Graph) of MapReduce jobs with inter-job dependencies.
Source: ICDE’10 by Facebook
[Figure: example query DAG of four MapReduce jobs: AGG (J1), Join (J2), Join (J3), AGG (J4)]
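A query DAG of this kind can be represented as a mapping from each job to the jobs it depends on. The sketch below is only illustrative: the job labels follow the figure, but the edges are an assumption (the slide text does not preserve them), and `graphlib` is Python's standard-library topological sorter, not part of Hive or Hadoop.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: job -> set of jobs that must finish first.
# Only the job labels come from the slide; the edges are assumed.
query_dag = {
    "J1_AGG": set(),
    "J2_Join": set(),
    "J3_Join": {"J1_AGG", "J2_Join"},
    "J4_AGG": {"J3_Join"},
}


def runnable_order(dag):
    """Return one execution order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


order = runnable_order(query_dag)
print(order)
```

A scheduler that only sees four independent jobs has no way to know that J4 cannot start before J3, which is the gap the following slides describe.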
Motivation
■ Semantic gap between Hive and Hadoop
  ➢ Hadoop is unaware of such dependencies and inter-job relationships, and treats all jobs the same.
  ➢ Without such awareness, it is difficult for Hadoop to schedule the jobs that belong to a query efficiently.
■ Problems:
  ➢ Suboptimal query response time
  ➢ Unfairness among queries
Efficiency issue for queries with varied sizes
■ In this test, QA, QB and QC are issued in sequence, where QB is a large query.
■ Under HCS (Hadoop Capacity Scheduler), the jobs of different queries execute in an interleaved fashion.
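[Figure: job DAGs of the three queries. QA: AGG (J1), Sort (J2). QB: AGG (J1), Join (J2), Join (J3), AGG (J4). QC: AGG (J1), Sort (J2). A "Resource Allocation" panel shows the jobs QA-J1, QA-J2, QB-J1, QB-J2, QB-J3, QB-J4, QC-J1 and QC-J2 interleaved.]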
Query Delays Shown by Gantt Chart
■ QA arrives first with its J1 job, followed by QB and QC with their jobs listed accordingly.
■ QA-J2 gets delayed by QB-J1. QC-J2 gets delayed by QB-J2.
■ Query response time can be improved if the scheduler is aware of the query semantics, and therefore of the relationships among the jobs.
[Figure: Gantt chart of jobs J1 to J4 for QA, QB and QC. X-axis: Time Elapsed (seconds), 100 to 700. Legend: Execution, Map Stall, Reduce Stall.]
Semantics-Aware Query Prediction
■ Three main techniques
  ➢ Semantics extraction (DAG, operator type, predicates, etc.)
  ➢ Selectivity estimation
  ➢ Query prediction
[Figure: system architecture spanning Hive and Hadoop. Labeled components: HiveQL Queries, Parser, Semantics Analyzer, Execution Engine, Semantics Extraction, Selectivity Estimation, Query prediction, JobListener, JobTracker, TaskTracker. Labeled flows: Job & Semantics, Results.]
Selectivity estimation for a query’s jobs
■ Selectivity estimation
  ➢ Predict each job node’s intermediate (Med) and output (Out) data sizes recursively along the DAG (from bottom to top).
  ➢ For different job types, e.g., Groupby, Join, Select, we rely on certain formulas and offline-built histograms to estimate their selectivities.
■ Logic: selectivity estimation => job/query resource estimation and time modeling => efficient query scheduling
[Figure: query DAG of Join, Join, AGG and SORT jobs over tables T1, T2 and T3, annotating a job's input (In), its intermediate data (Med) between Map and Reduce, and its output (Out).]
Selectivity Estimation
■ Intermediate Selectivity (IS) is used for estimating MOF (map output file) size
  ➢ $IS = D_{Med} / D_{In}$
■ Final Selectivity (FS) is defined as:
  ➢ $FS = |Out| \cdot W_{Out} / D_{In}$
■ Predicate selectivity (ratio of selected rows to input rows)
  ➢ $S_{pred} = |Med| / |In|$
■ Projection selectivity (ratio of selected cols to tuple width)
  ➢ $S_{proj} = \frac{\sum_{col_i\,\text{selected}} Width_{col_i}}{TupleWidth}$
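As a rough illustration of the four ratios above, the following sketch computes them for a made-up table. It assumes that D denotes a data size in bytes, |·| a row count and W_Out the output tuple width in bytes; the slide does not spell these out, and all function names and numbers are hypothetical.

```python
def intermediate_selectivity(d_med, d_in):
    """IS = D_Med / D_In (intermediate bytes per input byte)."""
    return d_med / d_in


def final_selectivity(out_rows, w_out, d_in):
    """FS = |Out| * W_Out / D_In (output bytes per input byte)."""
    return out_rows * w_out / d_in


def predicate_selectivity(med_rows, in_rows):
    """S_pred = |Med| / |In| (fraction of rows that pass the predicate)."""
    return med_rows / in_rows


def projection_selectivity(selected_widths, tuple_width):
    """S_proj = sum of selected column widths / full tuple width."""
    return sum(selected_widths) / tuple_width


# Hypothetical input: 1,000,000 rows of 100-byte tuples.
in_rows, tuple_width = 1_000_000, 100
d_in = in_rows * tuple_width

# Assume the predicate keeps 250,000 rows and two columns (8 and 12 bytes) are projected.
med_rows, kept_widths = 250_000, [8, 12]
d_med = med_rows * sum(kept_widths)

s_pred = predicate_selectivity(med_rows, in_rows)
s_proj = projection_selectivity(kept_widths, tuple_width)
i_sel = intermediate_selectivity(d_med, d_in)
f_sel = final_selectivity(out_rows=med_rows, w_out=sum(kept_widths), d_in=d_in)
print(s_pred, s_proj, i_sel, f_sel)
```

In this toy case the intermediate selectivity equals the product of the predicate and projection selectivities, because the only data reduction comes from filtering rows and dropping columns.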
Intermediate Selectivity - IS
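➢ For an extract job such as select and order by:
   $IS = S_{pred} \cdot S_{proj}$
➢ For a join job:
   $IS = S_{pred1} \cdot S_{proj1} \cdot r_1 + S_{pred2} \cdot S_{proj2} \cdot (1 - r_1)$
➢ Groupby can involve local combine: $IS = S_{comb} \cdot S_{proj}$
➢ For clustered keys,
   $S_{comb} = \min\left(1, \frac{T.d_{xy}}{|T| \cdot S_{pred}}\right) \cdot S_{pred} = \min\left(S_{pred}, \frac{T.d_{xy}}{|T|}\right)$
➢ For randomly distributed keys,
   $S_{comb} = \min\left(S_{pred}, \frac{T.d_{xy}}{|T| / N_{maps}}\right)$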
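The per-operator IS rules can be sketched as below. This is one reading of the slide, not the authors' code: the combine formulas are taken as S_comb = min(S_pred, T.dxy/|T|) for clustered keys and S_comb = min(S_pred, T.dxy/(|T|/N_maps)) for randomly distributed keys, r1 is assumed to be the first table's share of the join input, T.dxy the number of distinct grouping keys, and N_maps the number of map tasks. The function names are invented for illustration.

```python
def is_extract(s_pred, s_proj):
    """Extract jobs (select, order by): IS = S_pred * S_proj."""
    return s_pred * s_proj


def is_join(s_pred1, s_proj1, s_pred2, s_proj2, r1):
    """Join jobs: selectivities of both inputs, weighted by r1 and 1 - r1."""
    return s_pred1 * s_proj1 * r1 + s_pred2 * s_proj2 * (1 - r1)


def s_comb_clustered(s_pred, distinct_keys, rows):
    """Clustered keys: the local combine emits at most one row per distinct key."""
    return min(s_pred, distinct_keys / rows)


def s_comb_random(s_pred, distinct_keys, rows, n_maps):
    """Random keys: each map task sees rows / n_maps rows and may meet every key."""
    return min(s_pred, distinct_keys / (rows / n_maps))


def is_groupby(s_comb, s_proj):
    """Groupby jobs with local combine: IS = S_comb * S_proj."""
    return s_comb * s_proj


# Hypothetical table: 1,000,000 rows, 1,000 distinct keys, 100 map tasks.
clustered = s_comb_clustered(0.5, distinct_keys=1_000, rows=1_000_000)
scattered = s_comb_random(0.5, distinct_keys=1_000, rows=1_000_000, n_maps=100)
print(clustered, scattered)
```

With these numbers the combiner is far less effective when keys are scattered, since every map task can encounter each of the 1,000 keys.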
Final Selectivity – Output
■ For an extract job,
  ➢ For a “top k” job, $|Out| = \min(|In|, k)$
  ➢ For an “order by” job, $|Out| = |In|$
■ For a groupby job,
  ➢ $|Out| = \min(|T| \cdot S_{pred}, T.d_{xy})$
■ For a join job,
  ➢ Equi-join with uniform keys:
     $|Out| = |T_1 \bowtie T_2| = |T_1| \cdot |T_2| \cdot \frac{1}{\max(T_1.d_x, T_2.d_x)}$
  ➢ Chained joins:
     $|Out| = |T_1.pred_1 \bowtie T_2.pred_2 \bowtie T_3.pred_3| = S_{pred1} \cdot S_{pred2} \cdot S_{pred3} \cdot \max(|T_1|, |T_2|, |T_3|)$
An example for selectivity estimation
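■ Predict jobs’ selectivity recursively in a query.
[Figure: three-job query plan over tables n, s and ps. Pred keeps 24 of the 25 rows of n (Spred = 0.96); |s| = 10000 and |ps| = 800000. Job 1 joins n and s, Job 2 joins the result with ps, and Job 3 applies Group by. MED1 and MED2 mark the jobs' intermediate data.]
  ➢ Job 1 (Join, n⋈s): |Out| = 0.96 * 25 * 10000 * 1/max(25, 25) = 9600
  ➢ Job 2 (Join, n⋈s⋈ps): |Out| = 0.96 * max(25, 10000, 800000) = 768000
  ➢ Job 3 (Group by): |Out| = min(768000, 200000) = 200000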
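The three estimates in this example can be replayed with a few lines of code. The sketch below follows the arithmetic shown for Job 1 to Job 3, feeding each job's estimated output into the next one; the helper names are hypothetical, and tables without a predicate are assumed to have a predicate selectivity of 1.

```python
def equi_join_rows(rows1, rows2, distinct1, distinct2, s_pred=1.0):
    """Equi-join with uniform keys: |T1| * |T2| / max(distinct keys), scaled by S_pred."""
    return s_pred * rows1 * rows2 / max(distinct1, distinct2)


def chained_join_rows(s_preds, table_rows):
    """Chained joins: product of predicate selectivities times the largest table."""
    selectivity = 1.0
    for s in s_preds:
        selectivity *= s
    return selectivity * max(table_rows)


def groupby_rows(in_rows, distinct_keys):
    """Groupby: output cannot exceed the number of distinct grouping keys."""
    return min(in_rows, distinct_keys)


# Table sizes and predicate selectivity taken from the example above.
n_rows, s_rows, ps_rows = 25, 10_000, 800_000
s_pred = 24 / 25

job1 = equi_join_rows(n_rows, s_rows, 25, 25, s_pred)          # n joined with s
job2 = chained_join_rows([s_pred], [n_rows, s_rows, ps_rows])  # joined with ps
job3 = groupby_rows(job2, 200_000)                             # group by

print(round(job1), round(job2), round(job3))
```

Rounding is applied only for display, because 24/25 is not exactly representable in floating point.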
Multivariate Time Prediction
■ List of Considered Input Features
  ➢ Operators
  ➢ Input Data
  ➢ Output Data
  ➢ Data Growth
Job Time Prediction Model
➢ Model job execution time based on selectivity estimation
➢ Trained on over 5647 MR jobs (about 1000 queries) from TPC-DS and TPC-H at different scales.
➢ The model parameters are trained separately for extract, groupby and join jobs.
Task Time Prediction Model
➢ Data size: $TD_{In_i}$ and $TD_{Out_i}$
➢ The predicted time for the i-th task: $ET_i$
➢ $ET_i = k_0 + k_1 \cdot TD_{In_i} + k_2 \cdot TD_{Out_i} + k_3 \cdot P\,(1-P) \cdot TD_{In_i}$
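The task-time formula is a linear model in the task's input and output sizes plus one interaction term. A minimal sketch follows, with loud caveats: the slide does not define P, so it is treated here simply as a number between 0 and 1, the coefficients are invented rather than fitted values, and the data sizes are assumed to be in MB.

```python
def predict_task_time(td_in, td_out, p, coeffs):
    """ET_i = k0 + k1*TD_In_i + k2*TD_Out_i + k3*P*(1-P)*TD_In_i."""
    k0, k1, k2, k3 = coeffs
    return k0 + k1 * td_in + k2 * td_out + k3 * p * (1 - p) * td_in


# Hypothetical coefficients (k0..k3); real values would come from training.
coeffs = (2.0, 0.05, 0.02, 0.1)

# Hypothetical task: 128 MB in, 32 MB out, P = 0.5.
et = predict_task_time(td_in=128.0, td_out=32.0, p=0.5, coeffs=coeffs)
print(f"predicted task time: {et:.2f} s")
```

The P(1-P) term peaks at P = 0.5 and vanishes at 0 and 1, so whatever P measures, its contribution is largest in the middle of its range.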
Scheduling with Semantics Awareness
■ Semantics-Aware Resource Demand
  ➢ Weighted Resource Demand (WRD): aggregate the demand from all map tasks (MTi) and reduce tasks (RTi)
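  ➢ $WRD = \sum_i (MT_i \cdot X_{M_i}) + \sum_i (RT_i \cdot X_{R_i})$, where $X$ stands for a per-task factor whose symbol is unreadable in the source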
■ Experimented with a simple greedy scheduling policy
  ➢ Prioritizing the smallest queries for fast turnaround
  ➢ Smallest WRD First (SWRD) query scheduling
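The greedy policy amounts to ordering pending queries by their WRD. The sketch below assumes each query's WRD has already been computed by the prediction step; the `Query` class, the query names and the WRD values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    wrd: float  # weighted resource demand, assumed to be precomputed


def swrd_order(pending):
    """Smallest WRD First: serve the query with the least demand first."""
    return sorted(pending, key=lambda q: q.wrd)


# Hypothetical queue in arrival order; QB is the large query.
pending = [Query("QA", 120.0), Query("QB", 900.0), Query("QC", 150.0)]
order = [q.name for q in swrd_order(pending)]
print(order)
```

Because `sorted` is stable, queries with equal WRD keep their arrival order, which is one simple way to break ties.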
Evaluation setup
■ Benchmarks
  ➢ Built with TPC-H and TPC-DS queries and Terasort/Grep/Wordcount MapReduce jobs.
  ➢ Submitted at intervals following a Poisson process
■ Metrics
  ➢ Accuracy of the prediction via semantics awareness
  ➢ Efficiency: query execution time
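Poisson submission is commonly simulated by drawing exponentially distributed gaps between consecutive arrivals. The sketch below shows that standard technique; the mean interval, the number of queries and the seed are assumptions, since the slides do not state the arrival rate used in the experiments.

```python
import random


def poisson_arrival_times(n_queries, mean_interval, seed=0):
    """Return cumulative submission times with exponential inter-arrival gaps."""
    rng = random.Random(seed)  # seeded so a workload can be replayed
    now, times = 0.0, []
    for _ in range(n_queries):
        # expovariate takes the rate (1 / mean gap), not the mean itself.
        now += rng.expovariate(1.0 / mean_interval)
        times.append(now)
    return times


# Hypothetical workload: 5 queries, 30 seconds apart on average.
arrivals = poisson_arrival_times(5, mean_interval=30.0)
print([round(t, 1) for t in arrivals])
```

Fixing the seed lets different schedulers be compared against the same arrival sequence.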
Estimation of Job Execution Time
■ Accuracy
  ➢ On average, a 13.98% error rate for the test set of jobs.
[Figure: Job Time Estimation]
Estimation of Task Execution
■ Map Task Execution Time
  ➢ Join operators lead to lower accuracy
■ Reduce Task Execution Time
Validation of job and query time estimation
■ Predicted time accuracy for queries
  ➢ Error rate is 8.3% on average for the 22 TPC-H queries at the 100 GB scale.
[Figure: Query Time Estimation. Actual vs. estimated (Estm) query response time (sec) for Q1 to Q22; y-axis from 200 to 1400.]
Benefits of Semantics-Aware Scheduling
■ Execution time of queries
  ➢ Compared to HFS (Hadoop Fair Scheduler), SWRD improves the execution time of the Bing and Facebook workloads by 44% and 40%, respectively.
  ➢ Compared to HCS, SWRD improves it by 27.4% and 72.8%, respectively.
[Figure: Execution Time (seconds) of the Bing and Facebook workloads under HFS, HCS and SWRD; y-axis from 0 to 2000.]
Conclusion and Future Work
■ Introduced cross-layer semantics extraction and percolation to increase the semantics awareness of the Hadoop job scheduler
■ Formalized the estimation of selectivity for intermediate data and final output
■ Developed a multivariate prediction model for job and task execution time and validated the accuracy
  ➢ Leveraged semantics awareness for efficient query scheduling in Hive
■ Plan to pursue further integration of semantics awareness in complex query scheduling and other data analytics systems.