Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing
Nesime Tatbul, Uğur Çetintemel, Stan Zdonik
Talk Outline
- Problem Introduction
- Approach Overview
- Advance Planning with an LP Solver
- Advance Planning with FIT
- Performance Results
- Related Work
- Conclusions and Future Work
VLDB 2007, Vienna 2 Nesime Tatbul, ETH Zurich
Distributed Stream Processing
The Aurora/Borealis System
Bursty Workload
- Data can arrive fast, in unpredictable bursts.
- Example: Network traffic data
Source: Internet Traffic Archive, http://ita.ee.lbl.gov/
Bursts may create resource bottlenecks: query processing slows down and results get delayed!
Models and Assumptions
- We focus on CPU as the limited resource.
- Load shedding is achieved by inserting probabilistic drop operators into query plans.
  - Random Drop [VLDB'03], Window Drop [VLDB'06]
- The approximate result is a subset of the original result.
- The goal is to maximize the total weighted query throughput (e.g., [Ayad et al, SIGMOD'04], [Amini et al, ICDCS'06]).
- Servers are arranged in a tree-like topology.
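As a rough illustration of a probabilistic drop operator, the sketch below keeps each tuple with a fixed probability. The names `random_drop` and `keep_prob` are assumptions made for this example and are not part of the Borealis API; the point is only that the output is a subset of the input, in the original order.

```python
import random

def random_drop(tuples, keep_prob, rng=None):
    """Yield each tuple with probability keep_prob.

    Hypothetical helper for illustration, not Borealis code.
    """
    rng = rng if rng is not None else random.Random()
    for t in tuples:
        # rng.random() is in [0, 1), so keep_prob = 1.0 keeps everything
        # and keep_prob = 0.0 drops everything.
        if rng.random() < keep_prob:
            yield t

stream = list(range(1000))
kept = list(random_drop(stream, keep_prob=0.25, rng=random.Random(7)))
```

The kept tuples form a random sample of the stream, which is why the approximate result stays a subset of the original result.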
Distributed Load Shedding
Key Observation: Load Dependency
[Figure: node A runs two operators (Cost = 1 and Cost = 2) and feeds node B, which runs two operators (Cost = 3 and Cost = 1); all have Selectivity = 1.0. Each query receives 1 tuple/sec and, without load shedding, delivers 1/4 tuple/sec.]
Plan  Rates at A  A.load  A.throughput  B.load  B.throughput
0     1, 1        3       1/3, 1/3      4/3     1/4, 1/4
1     1, 0        1       1, 0          3       1/3, 0
2     0, 1/2      1       0, 1/2        1/2     0, 1/2
3     1/5, 2/5    1       1/5, 2/5      1       1/5, 2/5
Constraints: A.load ≤ 1 and B.load ≤ 1; maximize B.throughput.
- Plan 1: optimal for A
- Plan 2: feasible for both
- Plan 3: optimal for both
Server nodes must coordinate in their load shedding decisions to achieve high-quality results.
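The rows of the load dependency table can be recomputed with a few lines of arithmetic. This is a minimal sketch that assumes an overloaded node scales all of its outputs down proportionally (which matches the table values); the helper names `run_node` and `evaluate` are made up for this example.

```python
from fractions import Fraction as F

COSTS_A = (F(1), F(2))  # per-tuple cost of query 1 and query 2 at node A
COSTS_B = (F(3), F(1))  # per-tuple cost of query 1 and query 2 at node B

def run_node(rates, costs, capacity=F(1)):
    """Return (load, output rates) for one node.

    Assumption: when load exceeds capacity, every output rate is
    scaled down by the same factor capacity / load.
    """
    load = sum(r * c for r, c in zip(rates, costs))
    scale = min(F(1), capacity / load) if load > 0 else F(1)
    return load, tuple(r * scale for r in rates)

def evaluate(rates_at_a):
    """Push the given input rates through node A and then node B."""
    a_load, a_out = run_node(rates_at_a, COSTS_A)
    b_load, b_out = run_node(a_out, COSTS_B)
    return a_load, a_out, b_load, b_out

plans = [(F(1), F(1)), (F(1), F(0)), (F(0), F(1, 2)), (F(1, 5), F(2, 5))]
for plan_id, rates in enumerate(plans):
    a_load, a_out, b_load, b_out = evaluate(rates)
    print(plan_id, a_load, a_out, b_load, b_out, "total:", sum(b_out))
```

Comparing the totals shows why local decisions are not enough: the plan that maximizes throughput at A alone leaves B overloaded and ends with a lower final throughput than the coordinated plan.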
Distributed Load Shedding as a Linear Optimization Problem
[Figure: D query inputs flowing through N nodes, labeled with input rates r_1..r_D, drop selectivities x_1..x_D, per-node operator costs c_{i,j} and selectivities s_{i,j}, partial selectivities s_j^i, total selectivities s_1..s_D, output weights p_1..p_D, and node capacities ζ_1..ζ_N.]
Find $x_j$ such that for all nodes $0 < i \le N$:
\[ \sum_{j=1}^{D} r_j \times x_j \times s_j^i \times c_{i,j} \le \zeta_i \]
\[ 0 \le x_j \le 1 \]
\[ \sum_{j=1}^{D} r_j \times x_j \times s_j \times p_j \ \text{is maximized.} \]
Problem formulation for non-linear query plans (i.e., with operator splits and merges) is in the paper.
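To make the formulation concrete, the sketch below solves the two-input instance from the Load Dependency slide (node A costs 1 and 2, node B costs 3 and 1, all selectivities and weights equal to 1). It is a brute-force vertex enumeration written for this example only, limited to D = 2, and stands in for a real LP solver; the function name `solve_2d_lp` is an assumption.

```python
from fractions import Fraction as F
from itertools import combinations

def solve_2d_lp(rates, costs, capacities, weights):
    """Maximize sum_j weights[j] * rates[j] * x_j subject to
    sum_j rates[j] * x_j * costs[i][j] <= capacities[i] and 0 <= x_j <= 1.

    Illustrative only: selectivities are assumed to be 1 and D must be 2.
    Returns (objective value, (x1, x2)).
    """
    rates = [F(r) for r in rates]
    # Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
    cons = [(rates[0] * F(c[0]), rates[1] * F(c[1]), F(cap))
            for c, cap in zip(costs, capacities)]
    cons += [(F(1), F(0), F(1)), (F(0), F(1), F(1)),    # x1 <= 1, x2 <= 1
             (F(-1), F(0), F(0)), (F(0), F(-1), F(0))]  # x1 >= 0, x2 >= 0
    best = None
    # The optimum of a bounded LP lies at a vertex, i.e. at the
    # intersection of two constraint boundaries.
    for (a1, a2, b), (c1, c2, d) in combinations(cons, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue  # parallel boundaries never intersect
        x1 = (b * c2 - a2 * d) / det
        x2 = (a1 * d - b * c1) / det
        if all(p * x1 + q * x2 <= r for p, q, r in cons):
            value = F(weights[0]) * rates[0] * x1 + F(weights[1]) * rates[1] * x2
            if best is None or value > best[0]:
                best = (value, (x1, x2))
    return best

value, x = solve_2d_lp(rates=(1, 1), costs=[(1, 2), (3, 1)],
                       capacities=(1, 1), weights=(1, 1))
print(value, x)
```

For this instance the enumeration returns the drop selectivities (1/5, 2/5), the same rates as Plan 3 in the load dependency table.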
Architectural Overview
Centralized vs. Distributed
Phase                Centralized   Distributed
Advance Planning     Coordinator   All
Load Monitoring      Coordinator   All
Plan Selection       Coordinator   All
Plan Implementation  All           All
Architectural Overview
Centralized Approach
[Figure: centralized architecture, with the labels "Statistics", "global plan", "local plan", and "Plan-id".]
Feasible Input Table: (r1, .., rn, [local plan], quality)
Architectural Overview
Distributed Approach
[Figure: distributed architecture, with repeated "FIT" labels.]
Advance Planning with an LP Solver
Approximate Load Shedding Plans
- Given an infeasible point, the Solver generates an optimal plan.
- We don't want to call the Solver for each infeasible point.
- Key observation:
  - Quality_r ≤ Quality_q
- Assume an error threshold "ε" in quality. Given any infeasible point s such that r < s < q:
  - If (Quality_q − Quality_r) / Quality_q ≤ ε, then s can use the load shedding plan for r, with a minor modification.
[Figure: input rate space with axes r1 and r2 (each up to 100), showing r = (50, 50), s = (60, 75), and q = (100, 100).]
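The reuse rule above can be written as a small check. This is a sketch under stated assumptions: "r < s < q" is read as a strict comparison in every dimension, and the quality values passed in the example (57 and 60) are invented for illustration rather than taken from the talk.

```python
def strictly_below(a, b):
    """True if point a is smaller than point b in every dimension."""
    return all(x < y for x, y in zip(a, b))

def can_reuse_plan(r, s, q, quality_r, quality_q, eps):
    """True if s lies between r and q and the relative quality gap
    between q and r is within the error threshold eps."""
    if not (strictly_below(r, s) and strictly_below(s, q)):
        return False
    return (quality_q - quality_r) / quality_q <= eps

# Points from the slide; the quality values are hypothetical.
ok = can_reuse_plan((50, 50), (60, 75), (100, 100),
                    quality_r=57, quality_q=60, eps=0.1)
```

When the check succeeds, the Solver does not need to be called for s; the plan already computed for r is used instead.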
Advance Planning with an LP Solver
QuadTree-based Plan Index
- Use a Region QuadTree to divide and index the input rate space.
[Figure: the input rate space (axes r1 and r2, each up to 100) recursively divided into regions labeled A through M, shown next to the corresponding QuadTree; r = (50, 50) and q = (100, 100) are marked.]
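One way to picture the region-based index is a recursive split that stops once the quality gap across a region is within ε. The sketch below is an assumption-heavy illustration: `toy_quality` is a made-up monotone function standing in for the Solver's optimal quality, the region is split at its midpoint, and `max_depth` is an extra safety bound that the talk does not mention.

```python
def toy_quality(point):
    # Hypothetical stand-in for the Solver's optimal quality at this rate point.
    return point[0] + point[1]

def build_regions(lo, hi, quality, eps, max_depth=8):
    """Split the rectangle [lo, hi] into four quadrants until the relative
    quality gap between its corners is at most eps; return the leaf regions.

    Assumes quality is lowest at corner lo and highest at corner hi.
    """
    q_min, q_max = quality(lo), quality(hi)
    if max_depth == 0 or q_max <= 0 or (q_max - q_min) / q_max <= eps:
        return [(lo, hi)]
    mx, my = (lo[0] + hi[0]) / 2, (lo[1] + hi[1]) / 2
    quadrants = [
        (lo, (mx, my)),               # lower-left
        ((mx, lo[1]), (hi[0], my)),   # lower-right
        ((lo[0], my), (mx, hi[1])),   # upper-left
        ((mx, my), hi),               # upper-right
    ]
    leaves = []
    for q_lo, q_hi in quadrants:
        leaves.extend(build_regions(q_lo, q_hi, quality, eps, max_depth - 1))
    return leaves

leaves = build_regions((50, 50), (100, 100), toy_quality, eps=0.35)
```

Each leaf region then needs only one stored plan, so the number of Solver calls depends on ε rather than on the number of distinct rate points.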
Advance Planning with an LP Solver
Exploiting Non-uniform Input Workload
- Infeasible points may be observed with different probabilities, i.e., some regions may have higher expected probability.
- Given a region with expected probability p, the expected maximum error for this region is:
  - E[Error_max] = p * (Quality_max − Quality_min) / Quality_max
- For all regions, we must make sure that:
  - Total(E[Error_max]) ≤ ε
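The workload-aware error budget is simple arithmetic, sketched below. The region probabilities and quality bounds in the example are invented for illustration; only the two formulas come from the slide.

```python
def expected_max_error(p, quality_min, quality_max):
    """E[Error_max] for one region observed with probability p."""
    return p * (quality_max - quality_min) / quality_max

def within_budget(regions, eps):
    """regions: iterable of (p, quality_min, quality_max) tuples."""
    return sum(expected_max_error(*region) for region in regions) <= eps

# Hypothetical regions: a frequent region with a tight quality range
# and two rarer regions with looser ranges.
regions = [(0.7, 95, 100), (0.2, 80, 100), (0.1, 50, 100)]
total = sum(expected_max_error(*region) for region in regions)
```

Because each region's error is weighted by its probability, rarely observed regions can tolerate a wider quality range, and so fewer plans, than frequently observed ones.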
Advance Planning with FIT
FIT Basics
- We store feasible points in FIT:
  - (r1, r2, [local plan], quality)
- FIT-based load shedding:
  - Given an infeasible point p, p must be mapped to the highest quality feasible point q in FIT, such that q ≤ p.
- We store a reduced number of FIT points by:
  - exploiting the "ε" error tolerance threshold, and
  - only including the FIT points that are "close" to the feasibility boundary.
[Figure: input rate space with axes r1 and r2, showing an infeasible point p and the feasible point q it is mapped to.]
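The FIT mapping rule can be sketched as a linear scan over the stored entries. The `FitEntry` type, the `lookup` function, and the example entries are assumptions made for this illustration; a real FIT would be indexed rather than scanned.

```python
from typing import NamedTuple, Optional, Tuple

class FitEntry(NamedTuple):
    rates: Tuple[float, ...]     # feasible input rates (r1, r2, ...)
    local_plan: Optional[str]    # placeholder for the optional local plan
    quality: float

def lookup(fit, p):
    """Return the highest-quality entry q with q <= p in every
    dimension, or None if no stored entry qualifies."""
    candidates = [e for e in fit
                  if all(q <= x for q, x in zip(e.rates, p))]
    return max(candidates, key=lambda e: e.quality, default=None)

# Hypothetical FIT with three feasible points.
fit = [
    FitEntry((0.2, 0.4), None, 0.6),
    FitEntry((0.0, 0.5), None, 0.5),
    FitEntry((0.3, 0.0), None, 0.3),
]
chosen = lookup(fit, (1.0, 1.0))
```

Shedding load then amounts to reducing the observed rates p down to the rates of the chosen entry.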
Advance Planning with FIT
Complementary Local Plans
- Complementary local load shedding plans may be needed for nodes with operator splits.
- Example: [Figure: query plan with an operator split, labeled with the input rate r, the values 1, 2, and 5, and the annotation "Shed here first!".]
- Local plans are not propagated upstream.
Advance Planning with FIT
QuadTree-based Plan Index
- Use a Point QuadTree to divide and index the input rate space.
[Figure: FIT points A through H in the input rate space (axes r1 and r2), shown next to the corresponding Point QuadTree.]
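A point quadtree stores one FIT point per tree node and uses that point to split the space into four quadrants. The sketch below is one possible two-dimensional version, written for illustration; the class and function names, and the pruning rule in `best_dominated`, are assumptions rather than the data structure described in the paper.

```python
class Node:
    def __init__(self, point, quality):
        self.point, self.quality = point, quality
        self.children = {}  # quadrant key (east, north) -> Node

def quadrant(center, point):
    """Quadrant of point relative to center, as an (east, north) pair."""
    return (point[0] >= center[0], point[1] >= center[1])

def insert(root, point, quality):
    """Insert a point and return the (possibly new) root."""
    if root is None:
        return Node(point, quality)
    node = root
    while True:
        key = quadrant(node.point, point)
        if key not in node.children:
            node.children[key] = Node(point, quality)
            return root
        node = node.children[key]

def best_dominated(node, p, best=None):
    """Highest-quality stored point q with q <= p in both dimensions."""
    if node is None:
        return best
    x, y = node.point
    if x <= p[0] and y <= p[1] and (best is None or node.quality > best.quality):
        best = node
    for (east, north), child in node.children.items():
        # Eastern quadrants only hold points with x' >= x, so they cannot
        # contain a point below p when x > p[0]; likewise for north and y.
        if (east and x > p[0]) or (north and y > p[1]):
            continue
        best = best_dominated(child, p, best)
    return best

points = [((50, 50), 5), ((20, 70), 4), ((70, 20), 6),
          ((80, 80), 9), ((10, 10), 1), ((60, 55), 7)]
root = None
for pt, quality in points:
    root = insert(root, pt, quality)
```

The pruning step skips whole subtrees whose points all exceed the observed rates, which is what makes the index cheaper than scanning every FIT entry.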
Experimental Setup
- Implemented on Borealis
- Query networks with Delay(cost, selectivity) operators
- Two input workloads:
  - Synthetic: Exponential distribution
  - Network traffic traces from the Internet Traffic Archive
- Goals:
  - Analyze plan generation efficiency for
    - Solver, Solver-W, C-FIT
  - Analyze communication overhead for
    - D-FIT
Plan Generation Performance
Solver vs. C-FIT
[Figure: 2x2 query networks with different cost distributions (Maximum Error ε = 0.05)]
Plan Generation Performance
Solver vs. Solver-W
[Figure: 2x2 query networks with different cost distributions (Expected Maximum Error E[ε] ~ 0.015)]
D-FIT Communication Overhead
Effect of Query Cost and Error Tolerance
[Figure: 2x2 query networks with different query costs]
D-FIT Communication Overhead
Sensitivity to Selectivity Change
[Figure: 2x2 query networks with different operator selectivity (Initial selectivity = 1.0)]
Related Work
- Load shedding for the single server case. Examples: [Tatbul et al, VLDB'03 / VLDB'06], [Babcock et al, ICDE'04], [Ayad et al, SIGMOD'04], [Reiss et al, ICDE'05], ...
- Control-based load management in System S [Amini et al, ICDCS'06]
- Aggregate congestion control against DoS attacks [Mahajan et al, SIGCOMM CCR'02]
- Parametric query optimization [Ioannidis et al, VLDB'92], [Ganguly, VLDB'98], [Hulgeri et al, VLDB'02]
Conclusions
- Distributed load shedding requires coordination among the servers.
- We provide centralized and distributed alternatives.
- We propose efficient techniques for advance generation of load shedding plans:
  - Approximate load shedding plans
  - QuadTree-based plan indexing
  - Exploiting input workload distribution
- Distributed FIT is better for dynamic environments.
Future Work
- Performance on larger scale networks
- Bandwidth bottlenecks
- Non-tree server topologies
- Hybrid approaches
  - Centralized + Distributed
- Local plan refinement
- Other quality metrics
Questions?
More information:
http://www.cs.brown.edu/research/borealis/
http://www.inf.ethz.ch/~tatbul
Advance Planning with FIT
Upstream FIT Propagation
- Each leaf node generates its FIT from scratch, and propagates it to its upstream parent.
- Each non-leaf node, upon receiving FITs from its children:
  1. Maps the FIT rates from its outputs to its own inputs (Note: Mapping across splits may result in local plans).
  2. Merges multiple FITs into a single FIT.
  3. Removes the FIT entries that are infeasible for itself.
  4. Propagates the resulting FIT further upstream.
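Steps 1 and 3 of the propagation can be sketched for the simplest case: a node with a single child and no operator splits, so the merge in step 2 is trivial and no local plans arise. The entry layout `(rates, quality)`, the helper names, and the example values (which reuse nodes A and B from the Load Dependency slide) are assumptions for illustration.

```python
from fractions import Fraction as F

def map_to_inputs(child_fit, selectivity):
    """Step 1: translate rates at this node's outputs into rates at its
    inputs by dividing by the local selectivity (assumed to be > 0)."""
    return [(tuple(r / s for r, s in zip(rates, selectivity)), quality)
            for rates, quality in child_fit]

def remove_infeasible(fit, cost, capacity=F(1)):
    """Step 3: keep only entries whose load on this node fits its capacity."""
    return [(rates, quality) for rates, quality in fit
            if sum(r * c for r, c in zip(rates, cost)) <= capacity]

# Hypothetical FIT of leaf node B (costs 3 and 1), in terms of B's inputs.
fit_b = [
    ((F(1, 5), F(2, 5)), F(3, 5)),
    ((F(0), F(1, 2)), F(1, 2)),
    ((F(1, 3), F(0)), F(1, 3)),
    ((F(0), F(1)), F(1)),
]

# Parent node A: selectivity 1.0 on both paths, costs 1 and 2.
fit_a = remove_infeasible(map_to_inputs(fit_b, (F(1), F(1))), (F(1), F(2)))
```

In this example the entry (0, 1) is feasible for B but would load A to 2, so A removes it before propagating the remaining three entries upstream (step 4).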
Plan Generation Performance
Effect of Input Dimensionality (C-FIT)