SLIDE 63 THE LONG ROAD TOWARDS ELASTIC DISTRIBUTED STREAM PROCESSING - AUTODASP 2018
SENSITIVITY TO LOAD IMBALANCE
Solution: [Rivetti et al., DEBS 2015]
▪ Ad-hoc mapping of heavy hitters (HH) AND groups of sparse items (SI).
– 991 key values make up roughly 62% of the stream.
– Non-frequent key values (sparse items) do not cause imbalance: handle sparse items with the standard solution.
[Figure: Skewed key value distribution. Axes: key values and probability, both log scale (ticks 1.E-04 to 1000). Series: probability distribution. The heavy hitters are labelled HH.]
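The split between heavy hitters and grouped sparse items could be sketched as below. This is a minimal illustration, not the implementation of Rivetti et al.: the function name `split_keys`, the `threshold` and `num_groups` parameters, exact counting with `Counter`, and the CRC32 hash are all assumptions made for the example.

```python
import zlib
from collections import Counter


def split_keys(stream, threshold, num_groups):
    """Separate heavy hitters from sparse items and bucket the sparse items.

    threshold and num_groups are illustrative parameters (assumptions),
    not values taken from the slide.
    """
    counts = Counter(stream)
    total = sum(counts.values())
    heavy = {}                    # key -> relative frequency
    groups = [0.0] * num_groups   # group index -> summed relative frequency
    for key, count in counts.items():
        freq = count / total
        if freq >= threshold:
            # Frequent key: tracked individually as a heavy hitter.
            heavy[key] = freq
        else:
            # Sparse key: a deterministic hash keeps it in the same group.
            index = zlib.crc32(str(key).encode("utf-8")) % num_groups
            groups[index] += freq
    return heavy, groups
```

Each heavy hitter keeps its own frequency, while a sparse key only contributes to the total of its group, so the later mapping step handles a short list of items instead of every key value.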
[Figure: Worst-case partitioning over the skewed key value distribution. Labels: heavy hitters (HH); sparse item groups SI1 to SI4, 248 key values each; Instance 1 to Instance 4.]
– Taken alone, none of these 991 key values causes imbalance.
[Figure: Mapping over the skewed key value distribution. Labels: heavy hitters HH1 to HH9; sparse item groups SI1 to SI8, 124 key values each (worst-case partitioning). The four instances (Instance 1 to Instance 4) receive 124, 497, 374 and 5 key values.]
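The ad-hoc mapping of heavy hitters and sparse item groups onto instances might be sketched as a greedy placement: take the items from heaviest to lightest and give each one to the instance that currently has the least load. This is an assumption about how such a mapping can be built, not necessarily the exact procedure of Rivetti et al.; the function name `greedy_map` and the load values in the usage note are hypothetical.

```python
import heapq


def greedy_map(loads, num_instances):
    """Assign each item (heavy hitter or sparse group) to the least-loaded instance.

    loads: dict mapping an item label to its share of the stream.
    Returns (assignment, instance_loads).
    """
    # Min-heap of (current load, instance id).
    heap = [(0.0, i) for i in range(num_instances)]
    heapq.heapify(heap)
    assignment = {}
    # Place the heaviest items first, each on the currently lightest instance.
    for label, load in sorted(loads.items(), key=lambda kv: kv[1], reverse=True):
        current, instance = heapq.heappop(heap)
        assignment[label] = instance
        heapq.heappush(heap, (current + load, instance))
    # Recompute the per-instance load from the final assignment.
    instance_loads = [0.0] * num_instances
    for label, instance in assignment.items():
        instance_loads[instance] += loads[label]
    return assignment, instance_loads
```

For example, `greedy_map({"HH1": 0.3, "HH2": 0.2, "SI1": 0.2, "SI2": 0.15, "SI3": 0.15}, 2)` places the two heavy hitters on different instances. An instance may then hold very different numbers of key values while carrying a comparable share of the stream.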