Synergy: Quality of Service Support for Distributed Stream Processing Systems - PowerPoint PPT Presentation




SLIDE 1

Synergy: Quality of Service Support for Distributed Stream Processing Systems

Thomas Repantis

Department of Computer Science & Engineering, University of California, Riverside
trep@cs.ucr.edu
http://www.cs.ucr.edu/~trep/

SLIDE 2

Thomas Repantis 2/45

Research Contributions

Distributed Stream Processing Systems

  • Sharing-Aware Component Composition [Middleware’06, TPDS’08 (rev.)]
  • Load Prediction and Hot-Spot Alleviation [DSN’08, DBISP2P’07]
  • Replica Placement for High Availability [DEBS’08]

Management of Large-Scale, Distributed, Real-Time Applications

  • Adaptation to Resource Availability [IPDPS’05]
  • Fair Resource Allocation [ISORC’06, WPDRTS’05]

Peer-to-Peer Systems

  • Adaptive Data Dissemination and Routing [MDM’05]
  • Decentralized Trust Management [MPAC’06]

Software Distributed Shared Memory Systems

  • Data Migration [Cluster’05, Cluster’04]

  • Replication in Distributed Multi-Tier Architectures [IBM’07]
  • Collaborative Spam Filtering [Intel’06]
  • Distributed Logging for Asynchronous Replication [HP’05]

SLIDE 3

On-Line Data Stream Processing

  • Network traffic monitoring for intrusion detection
  • Customization of multimedia or news feeds
  • Analysis of readings coming from sensors or mobile robots
  • Click stream analysis for purchase recommendations or advertisements
SLIDE 4

Distributed Stream Processing System

  • High-volume data streams (sensor data, financial data, media data)
  • Extracted result streams
  • Real-time online processing functions / continuous query operators: Filter, Aggregation, Correlation, Clustering

SLIDE 5

Stream Processing Environment

  • Streams are processed online by components distributed across hosts
  • Data arrive in large volumes and at high rates, while workload spikes are not known in advance
  • Stream processing applications have QoS requirements, e.g., end-to-end delay

[Diagram] Example operator pipeline: Split → Select → Join → Select

SLIDE 6

QoS for Distributed Stream Processing Applications

Our goal: How to run stream processing applications with QoS requirements, while efficiently managing system resources

  • Share existing result streams
  • Share existing stream processing components
  • Predict QoS violations
  • Alleviate hot-spots
  • Maximize availability

Benefits

  • Enhanced QoS provision
  • Reduced resource load

Challenges

  • Concurrent component sharing
  • Highly dynamic environment
  • On-demand stream application requests
  • Scale that dictates decentralization

SLIDE 7

Roadmap

  • Motivation and Background
  • Synergy Architecture
  • Design and Algorithms
      - Component Composition: composition protocol; component and stream sharing
      - Load Balancing: hot-spot prediction; hot-spot alleviation
      - High Availability: replica placement
  • Conclusion
  • Demo

SLIDE 8

Synergy Middleware

A middleware managing the mappings:

  • From the application layer to the stream processing overlay layer
  • From the stream processing overlay layer to the physical resource layer

SLIDE 9

Metadata Layer Over a DHT

  • Decouples stream and component placement from their discovery
  • Stream and component names are hashed in a DHT
  • The DHT maps the hashed names to nodes currently offering the specified stream or component
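
As a concrete illustration, the name-to-node mapping can be sketched with a toy consistent-hashing ring. This is a hedged sketch: the slide does not name the DHT substrate, and the identifier space, node IDs, and stream name below are invented.

```python
import hashlib
from bisect import bisect_right

def dht_key(name):
    """Hash a stream or component name into a small identifier space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** 16)

class MetadataLayer:
    """Toy DHT lookup: a name is owned by the first node whose
    identifier follows the name's key on the ring (wrapping around)."""

    def __init__(self, node_ids):
        self.ring = sorted(node_ids)

    def owner(self, name):
        i = bisect_right(self.ring, dht_key(name))
        return self.ring[i % len(self.ring)]

# The owner node would store which peers currently offer this stream:
dht = MetadataLayer([100, 20000, 45000, 60000])
print(dht.owner("filtered-traffic-stream"))
```

Because placement and discovery are decoupled, any peer can locate the current providers of a stream by hashing its name, without knowing where it is produced.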

SLIDE 10

Synergy Node Architecture

  • Application Composition and QoS Projection instantiate applications
  • Replica Placement places components
  • Load Balancing and Load Prediction detect hot-spots
  • Migration Engine alleviates hot-spots
  • Monitor measures processor and bandwidth usage
  • Discovery locates streams and components
  • Routing transfers streaming data

SLIDE 11

Component Composition

[Diagram] A query plan of operators (O1–O6) with QoS requirements, combined with the Synergy middleware, yields an application component graph (C1–C6) from source to destination.

SLIDE 12

Composition Probes

Probes carry the query plan and its resource and QoS requirements, and collect information about:

  • Resource availability
  • End-to-end QoS
  • QoS impact on existing applications

[Diagram] Probes travel from source to destination across candidate components (C1–C4) hosting operators O1 and O2.

SLIDE 13

Composition Protocol

Input: Query Plan

  • Stream application template
  • QoS requirements
  • Resource requirements

Output: Application Component Graph

  • Satisfy QoS and resource requirements
  • Reuse streams and components without QoS violations
  • Achieve load balancing

[Diagram] Probes explore candidate components for operators O1–O6 hop-by-hop from source to sink.

SLIDE 14

Composition Selection

All successful probes returning to the source have been checked against constraints on:

  • Operator functions
  • Processing capacity
  • Bandwidth
  • QoS

Among all qualified compositions, the most load-balanced one is selected by minimizing an aggregate load metric over the hosts in the composition.

SLIDE 15

Component Sharing

QoS Impact Projection Algorithm

  • Neither the existing applications nor the new one should exceed their requested execution times
  • The impact is estimated using a queueing model for the execution time
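
The slide does not reproduce the formula. A minimal sketch of such a projection, assuming an M/M/1 queue (expected execution time 1/(mu − lambda)) and invented rates and deadlines, could look like:

```python
def mm1_exec_time(service_rate, arrival_rate):
    """Expected execution time in an M/M/1 queue: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        return float("inf")  # the component would be saturated
    return 1.0 / (service_rate - arrival_rate)

def projection_admits(service_rate, current_rate, new_rate, deadlines):
    """Admit the new stream only if the projected execution time still
    meets the requested execution time of every sharing application."""
    projected = mm1_exec_time(service_rate, current_rate + new_rate)
    return all(projected <= d for d in deadlines)

# Example: mu = 100/s, existing aggregate rate 60/s, new stream adds 20/s
print(projection_admits(100.0, 60.0, 20.0, [0.06, 0.10]))  # True: 1/(100-80) = 0.05 s
```

The point of the projection is the admission decision: a component is shared only when the extra input rate leaves every co-hosted application's projected execution time within its requested bound.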

SLIDE 16

Stream Sharing

Maximum Sharing Discovery Algorithm

  • Breadth-first search on the query plan identifies the latest possible existing output streams
  • Backtracking proceeds hop-by-hop, querying the metadata layer
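
The search can be sketched as follows. This is an illustration only: `query_plan` maps each operator to its upstream operators, and the `existing` set stands in for a metadata-layer lookup of streams already being produced.

```python
from collections import deque

def max_sharing_discovery(query_plan, sinks, existing):
    """Breadth-first search from the sink side toward the sources,
    returning the most-downstream operators whose output streams
    already exist; upstream of a reused stream nothing is explored."""
    reuse, frontier, seen = [], deque(sinks), set(sinks)
    while frontier:
        op = frontier.popleft()
        if op in existing:       # stream already produced somewhere: reuse it
            reuse.append(op)
            continue
        for upstream in query_plan.get(op, []):   # backtrack hop-by-hop
            if upstream not in seen:
                seen.add(upstream)
                frontier.append(upstream)
    return reuse

plan = {"join": ["sel1", "sel2"], "sel1": ["src"], "sel2": ["src"]}
print(max_sharing_discovery(plan, ["join"], {"sel1"}))  # ['sel1']
```

Stopping at the most-downstream existing stream is what maximizes sharing: everything upstream of a reused stream never needs to be instantiated again.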

SLIDE 17

Experimental Setup

  • Multi-threaded prototype of about 35,000 lines of Java running on 88 PlanetLab nodes
  • Simulator of about 8,500 lines of C++ for 500 random nodes of a GT-ITM topology of 1,500 routers
  • 5 replicas of each component
  • Synergy vs. Random, Greedy, and Composition algorithms

SLIDE 18

Composition Performance

Stream reuse improves end-to-end delay by saving processing time and increases system capacity

SLIDE 19

Composition Overhead

Stream reuse decreases probing overhead and setup time

SLIDE 20

Performance on Simulator

End-to-end delay scales well, thanks to stream reuse and QoS impact projection

SLIDE 21

Sensitivity on Simulator

Synergy performs consistently better, regardless of QoS strictness or query popularity

SLIDE 22

Projection Accuracy

Pessimistic projections for low-rate segments may cause conservative compositions, but no QoS violations

SLIDE 23

Roadmap

  • Motivation and Background
  • Synergy Architecture
  • Design and Algorithms
      - Component Composition: composition protocol; component and stream sharing
      - Load Balancing: hot-spot prediction; hot-spot alleviation
      - High Availability: replica placement
  • Conclusion
  • Demo

SLIDE 24

Application-Oriented Load Management

  • System hot-spots: overloaded nodes
  • Application hot-spots: QoS violations
  • Sensitive hot-spot detection: triggered even when a node is underloaded, if QoS is stringent
  • Fine-grained hot-spot alleviation: only the suffering applications migrate
  • Proactively prevent QoS degradation

SLIDE 25

Predicting QoS Violations

Calculate the slack time ts on every component, based on the execution time te and the communication time tc
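
The slide omits the formula. One plausible form, assuming the slack is whatever remains of the requested end-to-end delay after the projected execution and communication times, is:

```latex
% Hypothetical reconstruction (not shown on the slide):
% slack = requested end-to-end delay minus the projected execution
% and communication times along the application's components
t_s = t_{req} - \sum_{i}\,\left( t_{e,i} + t_{c,i} \right)
```

A QoS violation is then predicted when the slack approaches zero.
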
SLIDE 26

Execution Time Prediction

Linear regression relates the execution time te to the total input rate rt
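
A minimal sketch of such a fit, using the closed-form ordinary-least-squares solution for one predictor (the sample points are invented):

```python
def fit_linear(rates, times):
    """Ordinary least squares for t_e = a * r_t + b."""
    n = len(rates)
    mean_r = sum(rates) / n
    mean_t = sum(times) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(rates, times))
    var = sum((r - mean_r) ** 2 for r in rates)
    a = cov / var                 # slope: extra execution time per unit rate
    b = mean_t - a * mean_r       # intercept: fixed per-tuple overhead
    return a, b

# Perfectly linear sample: t_e = 2 * r_t + 1
a, b = fit_linear([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(round(a, 6), round(b, 6))  # 2.0 1.0
```

Once fitted, predicting a future input rate immediately yields a predicted execution time, which feeds the slack-time check.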

SLIDE 27

Rate Prediction

  • Auto-correlation
  • Cross-correlation (Pearson product-moment coefficient)
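
Both statistics can be sketched in a few lines (illustrative only; the slide gives no implementation details):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def autocorr(xs, lag):
    """Auto-correlation: Pearson coefficient of a series against a
    lagged copy of itself."""
    return pearson(xs[:-lag], xs[lag:])

print(pearson([1, 2, 3], [2, 4, 6]))   # 1.0 (perfectly correlated)
```

A high auto-correlation means a stream's own history predicts its future rate; a high cross-correlation means one stream's rate predicts another's.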

SLIDE 28

Decentralized Load Monitoring

  • Load updates are pushed only when a host's load crosses into a different interval
  • Overlapping intervals absorb frequent changes
  • The DHT maps component names to the loads of the peers hosting them
  • Peers detect overloads and imbalances between all hosts of a component
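
The interval mechanism can be sketched as follows; the interval width and overlap margin are invented parameters, since the slides give no numbers.

```python
class LoadReporter:
    """Push a load update only when the load leaves the current
    reporting interval; intervals overlap by `margin` so that small
    oscillations around a boundary do not trigger repeated updates."""

    def __init__(self, width=0.2, margin=0.05):
        self.width, self.margin = width, margin
        self.low, self.high = 0.0, width

    def update(self, load):
        """Return True if this sample must be pushed to the DHT."""
        if self.low - self.margin <= load <= self.high + self.margin:
            return False                      # still inside the padded interval
        k = int(load / self.width)            # jump to the new interval
        self.low, self.high = k * self.width, (k + 1) * self.width
        return True

r = LoadReporter()
print([r.update(x) for x in [0.10, 0.18, 0.22, 0.45, 0.48]])
# [False, False, False, True, False]
```

Only the jump from ~0.2 to 0.45 triggers a push; the small fluctuations around the interval boundary are absorbed by the overlap.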

SLIDE 29

Alleviating Hot-Spots via Migration

SLIDE 30

Hot-Spot Prediction and Alleviation

  • Average prediction error: 3.7016%
  • Average prediction overhead: 0.5984 ms

SLIDE 31

Hot-Spot Prediction and Alleviation

  • On average, one migration every three applications
  • Average migration time: 1144 ms

SLIDE 32

QoS Improvement

As load increases, the benefits of hot-spot elimination become evident

SLIDE 33

Roadmap

  • Motivation and Background
  • Synergy Architecture
  • Design and Algorithms
      - Component Composition: composition protocol; component and stream sharing
      - Load Balancing: hot-spot prediction; hot-spot alleviation
      - High Availability: replica placement
  • Conclusion
  • Demo

SLIDE 34

Component Replication

[Diagram] Each operator runs as primary and backup replicas (c11/c12 through c41/c42) processing streams s1–s6 from source to destination.

SLIDE 35

Component Replica Placement

Maximize availability of composite applications

  • Optimal: place the complete graph on each node

Respect node resource availability

  • Processing capacity
  • Network bandwidth

Maximize application performance

  • Inter-operator communication cost (between primaries)
  • Intra-operator communication cost (between primaries and backups)

SLIDE 36

Placement for High Availability

Availability decreases with larger graphs and increases with higher concentration
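
The trend can be illustrated with a simple independence model (an assumption; the slide does not give the availability formula): an application of m operators with k replicas each is available when every operator has at least one live replica.

```python
def app_availability(p_node, replicas, operators):
    """Availability of a composite application assuming independent
    node failures: each of `operators` stages needs at least one of
    its `replicas` copies alive (sketch; parameters are invented)."""
    stage_up = 1.0 - (1.0 - p_node) ** replicas
    return stage_up ** operators

# Larger graphs lower availability; more replicas raise it.
print(round(app_availability(0.9, 2, 4), 4))   # 0.9606
print(round(app_availability(0.9, 2, 8), 4))   # 0.9227 -- larger graph, lower availability
```

Under this model the availability is exponential in the number of operators, which matches the slide's observation that larger graphs are harder to keep available.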

SLIDE 37

Distributed Placement Protocol

[Diagram] Replica placement over primaries and backups (c11/c12 through c41/c42) and streams s1–s6; the closest candidates among the nodes already in use are selected.

SLIDE 38

Replica Placement

  • Increased availability and performance
  • 5539 ms to gather latencies for 30 nodes

SLIDE 39

Related Work

  • System S: IBM stream processing middleware
  • SBON, SAND, IFLOW: component placement
  • Borealis, Flux, PeerCQ: load balancing
  • Borealis, TelegraphCQ: load shedding
  • Borealis, Flux: fault tolerance
  • SpiderNet, sFlow: component composition

SLIDE 40

Synergy: QoS-Enabled Distributed Stream Processing System

Component Composition

  • Fully distributed composition protocol
  • Reuse of existing streams and components

Load Balancing

  • Predict QoS violations
  • Alleviate hot-spots using migration

High Availability

  • Place component replicas

Future Work

  • Efficient and consistent replication
  • Adaptive topology management
  • Secure composite applications

SLIDE 41

Demo

  • TCP traffic trace, LBL, 2 hours, 1.8 million packets [Internet Traffic Archive]
  • Monitor source-destination pairs in the top 5% of total traffic over the last 20 minutes [Stream Query Repository]
SLIDE 42

GUI Settings

SLIDE 43

GUI Application

SLIDE 44

GUI Execution

SLIDE 45

Acknowledgements

  • Prof. Vana Kalogeraki, UC Riverside
  • Prof. Xiaohui Gu, NCSU (formerly IBM Research)

  • Yannis Drougas, UC Riverside
  • Bilson Campana, UC Riverside

http://synergy.cs.ucr.edu/

SLIDE 46

Synergy: Quality of Service Support for Distributed Stream Processing Systems

Thomas Repantis

trep@cs.ucr.edu http://www.cs.ucr.edu/~trep/ http://synergy.cs.ucr.edu/