Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing

SLIDE 1

Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing

Nesime Tatbul, Uğur Çetintemel, Stan Zdonik

SLIDE 2

Talk Outline

  • Problem Introduction
  • Approach Overview
  • Advance Planning with an LP Solver
  • Advance Planning with FIT
  • Performance Results
  • Related Work
  • Conclusions and Future Work

VLDB 2007, Vienna. Nesime Tatbul, ETH Zurich

SLIDE 3

Distributed Stream Processing

The Aurora/Borealis System

SLIDE 4

Bursty Workload

  • Data can arrive fast, in unpredictable bursts.
  • Example: Network traffic data

Source: Internet Traffic Archive, http://ita.ee.lbl.gov/

Bursts may create resource bottlenecks: query processing slows down and results get delayed!

SLIDE 5

Models and Assumptions

  • We focus on CPU as the limited resource.
  • Load shedding is achieved by inserting probabilistic drop operators into query plans.
  • Random Drop [VLDB’03], Window Drop [VLDB’06]
  • Approximate result is a subset of the original result.
  • The goal is to maximize the total weighted query throughput (e.g., [Ayad et al, SIGMOD’04, Amini et al, ICDCS’06]).
  • Servers are arranged in a tree-like topology.
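To make the idea of a probabilistic drop operator concrete, here is a minimal Python sketch. It is not the Borealis implementation; the function name `random_drop` and its parameters are assumptions made for illustration. It keeps each tuple independently with a given probability, so the output is always a subset of the input.

```python
import random

def random_drop(tuples, keep_probability, rng):
    """Yield each tuple with probability keep_probability; drop it otherwise."""
    for t in tuples:
        if rng.random() < keep_probability:
            yield t

rng = random.Random(42)  # fixed seed so the run is repeatable
kept = list(random_drop(range(10_000), 0.25, rng))
print(len(kept), "of 10000 tuples kept")
```

Because tuples are only removed, never altered or reordered, the approximate result stays a subset of the original result, as the slide requires.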

SLIDE 6

Distributed Load Shedding

Key Observation: Load Dependency

[Figure: two queries, each with an input rate of 1 tuple/sec, flow through node A and then node B. Operator costs are 1 and 2 at A, and 3 and 1 at B; every selectivity is 1.0. Without shedding, each query outputs 1/4 tuple/sec.]

Plan   Rates at A   A.load   A.throughput   B.load   B.throughput   Note
-      1, 1         3        1/3, 1/3       4/3      1/4, 1/4       no shedding
1      1, 0         1        1, 0           3        1/3, 0         optimal for A
2      0, 1/2       1        0, 1/2         1/2      0, 1/2         feasible for both
3      1/5, 2/5     1        1/5, 2/5       1        1/5, 2/5       optimal for both

Constraints: A.load ≤ 1 and B.load ≤ 1. Objective: maximize B.throughput.

Server nodes must coordinate in their load shedding decisions to achieve high-quality results.
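The rows of the load-dependency table on this slide can be recomputed with a few lines of Python. This is a sketch under two assumptions that match the numbers shown: each node has a CPU capacity of 1, and an overloaded node's output rates shrink by the factor 1/load. The names `COST_A`, `COST_B`, `node`, and `evaluate` are illustrative.

```python
from fractions import Fraction as F

COST_A = (1, 2)   # per-tuple cost of query 1 and query 2 at node A
COST_B = (3, 1)   # per-tuple cost of query 1 and query 2 at node B

def node(rates, costs):
    """Return (load, output rates) for a node with unit CPU capacity."""
    load = sum(r * c for r, c in zip(rates, costs))
    scale = 1 if load <= 1 else 1 / load   # assumed: overload slows both queries equally
    return load, tuple(r * scale for r in rates)

def evaluate(rates_at_a):
    """Push the given rates through A, then feed A's output into B."""
    a_load, a_out = node(rates_at_a, COST_A)
    b_load, b_out = node(a_out, COST_B)
    return a_load, a_out, b_load, b_out

plans = {
    "no shedding": (F(1), F(1)),
    "plan 1": (F(1), F(0)),
    "plan 2": (F(0), F(1, 2)),
    "plan 3": (F(1, 5), F(2, 5)),
}
for name, rates in plans.items():
    print(name, evaluate(rates))
```

Plan 1 removes the overload at A but leaves B with a load of 3, which is why a decision that looks best locally at A is not best for the system as a whole.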

SLIDE 7

Distributed Load Shedding as a Linear Optimization Problem

[Figure: a chain of N nodes processing D query inputs. Labels: input rates $r_1, \dots, r_D$; drop selectivities $x_1, \dots, x_D$; cost $c_{i,j}$ and selectivity $s_{i,j}$ of the operator at node $i$ for input $j$; partial selectivities $s_j^2, \dots, s_j^N$ up to each node; total selectivities $s_1, \dots, s_D$; output weights $p_1, \dots, p_D$; node capacities $\zeta_1, \dots, \zeta_N$.]

Find $x_j$ such that, for all nodes $0 < i \le N$:

\[
\sum_{j=1}^{D} r_j \times x_j \times s_j^i \times c_{i,j} \le \zeta_i,
\qquad 0 \le x_j \le 1,
\]

and

\[
\sum_{j=1}^{D} r_j \times x_j \times s_j \times p_j
\]

is maximized.

Problem formulation for non-linear query plans (i.e., with operator splits and merges) is in the paper.
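As a worked instance of this formulation, the two-node example from slide 6 has D = 2 inputs, unit rates, unit selectivities, unit weights, and capacities of 1, so the problem reduces to maximizing x1 + x2 subject to x1 + 2·x2 ≤ 1 (node A) and 3·x1 + x2 ≤ 1 (node B). The sketch below solves it by enumerating the vertices of the feasible polygon. That technique is chosen here only because it fits in a few lines for two variables; it is not the LP solver used in the talk, and `solve_2d_lp` is a hypothetical helper.

```python
from fractions import Fraction as F
from itertools import combinations

def solve_2d_lp(objective, constraints):
    """Maximize objective . x subject to a . x <= b for every (a, b) pair.

    Works for two variables only: an optimum of a bounded LP lies at a vertex,
    so every intersection of two constraint lines is tried.
    """
    best_point, best_value = None, None
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue  # parallel lines do not meet in a vertex
        x = F(b1 * a2[1] - a1[1] * b2, det)   # Cramer's rule
        y = F(a1[0] * b2 - b1 * a2[0], det)
        if all(a[0] * x + a[1] * y <= b for a, b in constraints):
            value = objective[0] * x + objective[1] * y
            if best_value is None or value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value

constraints = [
    ((1, 2), 1),                  # node A: 1*x1 + 2*x2 <= 1
    ((3, 1), 1),                  # node B: 3*x1 + 1*x2 <= 1
    ((1, 0), 1), ((0, 1), 1),     # x1 <= 1, x2 <= 1
    ((-1, 0), 0), ((0, -1), 0),   # x1 >= 0, x2 >= 0
]
point, value = solve_2d_lp((1, 1), constraints)
print(point, value)
```

The optimum is x = (1/5, 2/5) with total throughput 3/5, which agrees with plan 3 in the slide 6 table.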

SLIDE 8

Talk Outline

  • Problem Introduction
  • Approach Overview
  • Advance Planning with an LP Solver
  • Advance Planning with FIT
  • Performance Results
  • Related Work
  • Conclusions and Future Work

SLIDE 9

Architectural Overview

Centralized vs. Distributed

Phase                 Centralized   Distributed
Advance Planning      Coordinator   All
Load Monitoring       Coordinator   All
Plan Selection        Coordinator   All
Plan Implementation   All           All

SLIDE 10

Architectural Overview

Centralized Approach

[Figure: centralized architecture, with arrows labeled Statistics and Plan-id, one global plan, and a local plan at each node.]

SLIDE 11

Architectural Overview

Distributed Approach

Feasible Input Table: (r1, .., rn, [local plan], quality)

[Figure: distributed architecture with FIT labels.]

SLIDE 12

Talk Outline

  • Problem Introduction
  • Approach Overview
  • Advance Planning with an LP Solver
  • Advance Planning with FIT
  • Performance Results
  • Related Work
  • Conclusions and Future Work

SLIDE 13

Advance Planning with an LP Solver

Approximate Load Shedding Plans

  • Given an infeasible point, the Solver generates an optimal plan.
  • We don’t want to call the Solver for each infeasible point.
  • Key observation: $\mathrm{Quality}_r \le \mathrm{Quality}_q$
  • Assume an error threshold “ε” in quality. Given any infeasible point s such that r < s < q:
      • If $(\mathrm{Quality}_q - \mathrm{Quality}_r) / \mathrm{Quality}_q \le \varepsilon$, then s can use the load shedding plan for r, with a minor modification.

[Figure: input rate space with axes r1 and r2 (each up to 100), showing r = (50, 50), s = (60, 75), and q = (100, 100).]
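The reuse test on this slide can be written as a small predicate. This is a sketch: the function names are illustrative, "r < s < q" is read as a strict comparison in every dimension, and the quality values in the example are invented numbers, not results from the talk.

```python
def strictly_below(a, b):
    """True if a < b in every dimension."""
    return all(x < y for x, y in zip(a, b))

def can_reuse_plan(r, s, q, quality_r, quality_q, epsilon):
    """True if point s, lying between r and q, may reuse the plan generated for r."""
    between = strictly_below(r, s) and strictly_below(s, q)
    relative_error = (quality_q - quality_r) / quality_q
    return between and relative_error <= epsilon

# Points from the slide's figure; the qualities 96 and 100 are hypothetical.
print(can_reuse_plan((50, 50), (60, 75), (100, 100), 96, 100, epsilon=0.05))
```

With these numbers the relative error is 0.04, so s = (60, 75) can borrow the plan for r without another Solver call.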

SLIDE 14

Advance Planning with an LP Solver

QuadTree-based Plan Index

  • Use a Region QuadTree to divide and index the input rate space.

[Figure: input rate space with axes r1 and r2 (each up to 100), divided into regions labeled A to M with corner points r = (50, 50) and q = (100, 100), shown next to the corresponding quadtree.]
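One way to picture the region quadtree is as a recursive split of the rate space that stops once a single plan is good enough for a whole region. The sketch below assumes the stopping rule is the ε test from slide 13 applied to a region's bottom-left and top-right corners; the slide itself does not spell this out. The `toy_quality` function is a hypothetical stand-in for a call to the Solver.

```python
def build_region_quadtree(lo, hi, quality, epsilon, max_depth=8):
    """Split the rectangle [lo, hi] until its corner qualities differ by at most epsilon."""
    q_lo, q_hi = quality(lo), quality(hi)
    if max_depth == 0 or (q_hi - q_lo) / q_hi <= epsilon:
        return {"lo": lo, "hi": hi, "children": []}   # leaf: one plan serves the region
    mid = ((lo[0] + hi[0]) / 2, (lo[1] + hi[1]) / 2)
    quadrants = [
        (lo, mid),                                    # bottom-left
        ((mid[0], lo[1]), (hi[0], mid[1])),           # bottom-right
        ((lo[0], mid[1]), (mid[0], hi[1])),           # top-left
        (mid, hi),                                    # top-right
    ]
    children = [build_region_quadtree(a, b, quality, epsilon, max_depth - 1)
                for a, b in quadrants]
    return {"lo": lo, "hi": hi, "children": children}

def count_leaves(node):
    """Number of regions, i.e. the number of plans that would be stored."""
    if not node["children"]:
        return 1
    return sum(count_leaves(c) for c in node["children"])

def toy_quality(point):
    """Hypothetical quality function; a real one would query the LP Solver."""
    return point[0] + point[1]

tree = build_region_quadtree((50, 50), (100, 100), toy_quality, epsilon=0.35)
print(count_leaves(tree))
```

A looser ε produces fewer, larger regions and therefore fewer Solver calls, at the price of a larger quality error inside each region.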

SLIDE 15

Advance Planning with an LP Solver

Exploiting Non-uniform Input Workload

  • Infeasible points may be observed with different probabilities, i.e., some regions may have higher expected probability.
  • Given a region with expected probability p, the expected maximum error for this region is:
      • $E[\mathrm{Error}_{max}] = p \cdot (\mathrm{Quality}_{max} - \mathrm{Quality}_{min}) / \mathrm{Quality}_{max}$
  • For all regions, we must make sure that:
      • $\mathrm{Total}(E[\mathrm{Error}_{max}]) \le \varepsilon$
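The two formulas on this slide translate directly into code. In this sketch the region probabilities and quality bounds are invented numbers chosen to show the arithmetic, and the function names are illustrative.

```python
def expected_max_error(p, quality_min, quality_max):
    """E[Error_max] = p * (Quality_max - Quality_min) / Quality_max for one region."""
    return p * (quality_max - quality_min) / quality_max

def within_budget(regions, epsilon):
    """regions: iterable of (probability, quality_min, quality_max) triples."""
    total = sum(expected_max_error(p, lo, hi) for p, lo, hi in regions)
    return total <= epsilon

# Hypothetical regions: the likely region is tight, the unlikely ones are loose.
regions = [(0.5, 98, 100), (0.25, 90, 100), (0.25, 80, 100)]
print(within_budget(regions, epsilon=0.1))
```

Here the per-region terms are 0.01, 0.025, and 0.05, for a total of 0.085. Rarely visited regions can tolerate a wide quality spread because their low probability shrinks their contribution to the total.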

SLIDE 16

Talk Outline

  • Problem Introduction
  • Approach Overview
  • Advance Planning with an LP Solver
  • Advance Planning with FIT
  • Performance Results
  • Related Work
  • Conclusions and Future Work

SLIDE 17

Advance Planning with FIT

FIT Basics

  • We store feasible points in FIT:
      • (r1, r2, [local plan], quality)
  • FIT-based load shedding:
      • Given an infeasible point p, p must be mapped to the highest-quality feasible point q in FIT, such that $q \le p$.
  • We store a reduced number of FIT points by:
      • exploiting the “ε” error tolerance threshold, and
      • only including the FIT points that are “close” to the feasibility boundary.

[Figure: input rate space with axes r1 and r2, showing an infeasible point p mapped to a feasible point q.]
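FIT-based load shedding amounts to a dominance lookup: among the stored feasible points that are no larger than p in every dimension, pick the one with the highest quality. The sketch below uses a plain linear scan for clarity; the entries and field names are hypothetical, and the talk indexes the table with a quadtree rather than scanning it.

```python
def fit_lookup(fit, p):
    """Return the highest-quality FIT entry whose rates are <= p in every dimension."""
    candidates = [e for e in fit
                  if all(r <= x for r, x in zip(e["rates"], p))]
    return max(candidates, key=lambda e: e["quality"], default=None)

# Hypothetical FIT: (r1, r2) rates, an optional local plan, and a quality score.
fit = [
    {"rates": (0.2, 0.4), "plan": None, "quality": 0.6},
    {"rates": (0.0, 0.5), "plan": None, "quality": 0.5},
    {"rates": (0.3, 0.1), "plan": None, "quality": 0.4},
]
best = fit_lookup(fit, (1.0, 0.45))
print(best)
```

For p = (1.0, 0.45) the second entry is excluded because its r2 exceeds 0.45, and the first entry wins on quality. The node then sheds load until its input rates match that entry.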

SLIDE 18

Advance Planning with FIT

Complementary Local Plans

  • Complementary local load shedding plans may be needed for nodes with operator splits.
  • Example:

[Figure: example query plan with input rate r and an operator split; numeric labels 1, 2, 5; annotation: “Shed here first!”]

  • Local plans are not propagated upstream.

SLIDE 19

Advance Planning with FIT

QuadTree-based Plan Index

  • Use a Point QuadTree to divide and index the input rate space.

[Figure: input rate space with axes r1 and r2 containing FIT points labeled A to H, shown next to the corresponding point quadtree.]

SLIDE 20

Talk Outline

  • Problem Introduction
  • Approach Overview
  • Advance Planning with an LP Solver
  • Advance Planning with FIT
  • Performance Results
  • Related Work
  • Conclusions and Future Work

SLIDE 21

Experimental Setup

  • Implemented on Borealis
  • Query networks with Delay(cost, selectivity) operators
  • Two input workloads:
      • Synthetic: Exponential distribution
      • Network traffic traces from the Internet Traffic Archive
  • Goals:
      • Analyze plan generation efficiency for Solver, Solver-W, C-FIT
      • Analyze communication overhead for D-FIT

SLIDE 22

Plan Generation Performance

Solver vs. C-FIT

[Figure: 2x2 query networks with different cost distributions (Maximum Error ε = 0.05)]

SLIDE 23

Plan Generation Performance

Solver vs. Solver-W

[Figure: 2x2 query networks with different cost distributions (Expected Maximum Error E[ε] ~ 0.015)]

SLIDE 24

D-FIT Communication Overhead

Effect of Query Cost and Error Tolerance

[Figure: 2x2 query networks with different query costs]

SLIDE 25

D-FIT Communication Overhead

Sensitivity to Selectivity Change

[Figure: 2x2 query networks with different operator selectivity (Initial selectivity = 1.0)]

SLIDE 26

Talk Outline

  • Problem Introduction
  • Approach Overview
  • Advance Planning with an LP Solver
  • Advance Planning with FIT
  • Performance Results
  • Related Work
  • Conclusions and Future Work

SLIDE 27

Related Work

  • Load shedding for the single server case. Examples: [Tatbul et al, VLDB’03/VLDB’06], [Babcock et al, ICDE’04], [Ayad et al, SIGMOD’04], [Reiss et al, ICDE’05], …
  • Control-based load management in System S [Amini et al, ICDCS’06]
  • Aggregate congestion control against DoS attacks [Mahajan et al, SIGCOMM CCR’02]
  • Parametric query optimization [Ioannidis et al, VLDB’92], [Ganguly, VLDB’98], [Hulgeri et al, VLDB’02]

SLIDE 28

Conclusions

  • Distributed load shedding requires coordination among the servers.
  • We provide centralized and distributed alternatives.
  • We propose efficient techniques for advance generation of load shedding plans:
      • Approximate load shedding plans
      • QuadTree-based plan indexing
      • Exploiting input workload distribution
  • Distributed FIT is better for dynamic environments.

SLIDE 29

Future Work

  • Performance on larger scale networks
  • Bandwidth bottlenecks
  • Non-tree server topologies
  • Hybrid approaches
      • Centralized + Distributed
  • Local plan refinement
  • Other quality metrics

SLIDE 30

Questions?

More information:

http://www.cs.brown.edu/research/borealis/
http://www.inf.ethz.ch/~tatbul

SLIDE 31

Advance Planning with FIT

Upstream FIT Propagation

  • Each leaf node generates its FIT from scratch, and propagates it to its upstream parent.
  • Each non-leaf node, upon receiving FITs from its children:

1. Maps the FIT rates from its outputs to its own inputs (Note: Mapping across splits may result in local plans).
2. Merges multiple FITs into a single FIT.
3. Removes the FIT entries that are infeasible for itself.
4. Propagates the resulting FIT further upstream.
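The four steps performed by a non-leaf node can be sketched in Python for the simplest case. The assumptions here go beyond the slide: each child consumes a disjoint set of this node's outputs, mapping divides an output rate by the operator's selectivity, merging pairs every entry of one child with every entry of the other and adds their qualities, the node has a capacity of 1, and local plans are left out. All names and numbers are hypothetical.

```python
from itertools import product

def map_to_inputs(entry, selectivities):
    """Step 1: convert output rates into the input rates that produce them."""
    rates = tuple(r / s for r, s in zip(entry["rates"], selectivities))
    return {"rates": rates, "quality": entry["quality"]}

def merge(fit_a, fit_b):
    """Step 2 (assumed rule): combine entries of children with disjoint inputs."""
    return [{"rates": a["rates"] + b["rates"], "quality": a["quality"] + b["quality"]}
            for a, b in product(fit_a, fit_b)]

def feasible(entry, costs, capacity=1.0):
    """Step 3: an entry survives only if this node's own load fits its capacity."""
    return sum(r * c for r, c in zip(entry["rates"], costs)) <= capacity

def propagate(children, costs):
    """children: list of (fit, selectivities) pairs, one per downstream child."""
    mapped = [[map_to_inputs(e, sel) for e in fit] for fit, sel in children]
    merged = mapped[0]
    for fit in mapped[1:]:
        merged = merge(merged, fit)
    return [e for e in merged if feasible(e, costs)]   # step 4: send this upstream

child_a = ([{"rates": (0.5,), "quality": 0.5}, {"rates": (0.25,), "quality": 0.25}], (0.5,))
child_b = ([{"rates": (0.75,), "quality": 0.75}], (1.0,))
parent_fit = propagate([child_a, child_b], costs=(0.5, 1.0))
print(parent_fit)
```

In this example the combination with input rates (1.0, 0.75) would load the node to 1.25 and is removed, so only the entry with rates (0.5, 0.75) travels further upstream.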

SLIDE 32

Plan Generation Performance

Effect of Input Dimensionality (C-FIT)

[Figure: 2x2, 4x2, 8x2 query networks with the same total cost]