Real-Time Edge Computing Chenyang Lu Industrial Internet of Things - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Real-Time Edge Computing

Chenyang Lu

slide-2
SLIDE 2

Industrial Internet of Things (IIoT)

• Synergizing sensing, analytics, and control
  • Cloud computing for high capacity
  • Edge computing for timely performance

[Figure: IIoT architecture. A wireless sensor network (e.g., in a wind farm), IIoT services on edge hosts (Edge 1, Edge 2, ..., Edge N), and a private cloud for training and storage (machine learning training, database, applications).]

Applications: condition monitoring, emergency response, predictive maintenance, …

slide-3
SLIDE 3

Research challenge #1: timeliness

• Timing constraints:
  • IIoT applications have latency requirements
  • Events carrying physical data have temporal semantics

Application example: condition monitoring

Image source: https://www.maintwiz.com/what-is-condition-monitoring/

slide-4
SLIDE 4

Research challenge #1: timeliness

Contribution #1: Cyber-Physical Event Processing Architecture

  • latency differentiation
  • time consistency enforcement
slide-5
SLIDE 5

Research challenge #2: loss-tolerance

• An IIoT service must deliver messages reliably, but
  • fault-tolerant systems can be slow or costly
  • heterogeneous traffic and platforms can increase pessimism

[Figure: IIoT devices, edge applications, and cloud applications connected through a Primary service and a Backup service.]

slide-6
SLIDE 6

Research challenge #2: loss-tolerance

Contribution #2: Fault-Tolerant Real-Time Messaging Architecture

  • co-scheduling fault-tolerant real-time activities
  • traffic/platform-aware service configuration
slide-7
SLIDE 7

Research challenge #3: efficiency

• Efficiency atop loss-tolerance and timeliness:
  • costly to back up many small in-band computations
  • costly to recompute for fault recovery

Example of in-band computations: AWS Lambda function for IIoT inference

Image source: https://aws.amazon.com/lambda/

slide-8
SLIDE 8

Research challenge #3: efficiency

Contribution #3: Adaptive Real-Time Reliable Edge Computing

  • selective lazy data replication
  • proactive cleanup of obsolete data
slide-9
SLIDE 9

Contributions

• Three new IIoT middleware designs and implementations:
  • Real-time cyber-physical event processing (CPEP)
  • Fault-tolerant real-time messaging (FRAME)
  • Adaptive real-time reliable edge computing (ARREC)

All have been implemented and validated within the TAO real-time event service [1].

[1] Harrison, T.H., Levine, D.L. and Schmidt, D.C., 1997. The design and performance of a real-time CORBA event service. ACM SIGPLAN Notices, 32(10), pp.184-200.

[Figure: CPEP, FRAME, and ARREC each placed on a diagram of the three goals (efficiency, timeliness, loss-tolerance), next to the original TAO components: Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies.]
slide-10
SLIDE 10

Outline

• CPEP: real-time cyber-physical event processing
• FRAME: fault-tolerant real-time messaging
• ARREC: adaptive real-time reliable edge computing

[Figure: diagram of efficiency, timeliness, and loss-tolerance highlighting CPEP; original TAO components (Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies) shown next to TAO with CPEP (Supplier Proxies, Consumer Proxies).]
slide-11
SLIDE 11

Cyber-physical event processing model

• Temporal semantics
  • Absolute time consistency
    • A bound on an event's elapsed time since its creation
  • Relative time consistency
    • A bound on the difference between events' creation times

Oi: operations (filtering, transformation, encryption, …)

[Figure: IIoT devices (suppliers s1 to s5) feed an IIoT event service made of operations O1 to O7, grouped into high, middle, and low priority, which delivers to IIoT applications (consumers c1 to c4).]

slide-12
SLIDE 12

Real-time event processing

• Processing in the order of priorities propagated from applications
• Temporal semantics enforcement and shedding:
  • Absolute time consistency
  • Relative time consistency
    • Track both the earliest and the latest event creation times, per operator

[Figure: operations O1 to O7 between suppliers s1 to s5 and consumers c1 to c4, grouped into high, middle, and low priority, with a timeline (t1 to t7) for S1, S2, S3, and C2.]
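
Priority-ordered processing can be illustrated with a small queue. This is a minimal sketch, not the CPEP implementation: the names `PendingEvent` and `PriorityProcessor` are hypothetical, priorities are assumed to be encoded as 0 (high), 1 (middle), 2 (low), and events of equal priority are assumed to be served in arrival order.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingEvent:
    priority: int                      # assumed encoding: 0 = high, 1 = middle, 2 = low
    seq: int                           # arrival order, breaks ties within one priority level
    name: str = field(compare=False)   # payload identifier, ignored when ordering


class PriorityProcessor:
    """Hypothetical processor that always serves the highest-priority pending event."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def submit(self, name, priority):
        heapq.heappush(self._heap, PendingEvent(priority, next(self._seq), name))

    def drain(self):
        """Process everything that is pending and return the processing order."""
        order = []
        while self._heap:
            order.append(heapq.heappop(self._heap).name)
        return order


proc = PriorityProcessor()
proc.submit("e_low", 2)
proc.submit("e_high_a", 0)
proc.submit("e_mid", 1)
proc.submit("e_high_b", 0)
processing_order = proc.drain()
print(processing_order)
```

The low-priority event arrives first but is processed last, which is the latency differentiation the slide describes.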

slide-13
SLIDE 13

The CPEP processing architecture

Both workers and movers are further prioritized, enabling an appropriate activity ordering.

[Figure: the CPEP processing architecture, with operations O1 to O7 between suppliers s1 to s5 and consumers c1 to c4, grouped into high, middle, and low priority.]

slide-14
SLIDE 14

Enforcing Absolute Time Consistency

• Tracking the earliest end time of the validity interval
• Responses to a consistency violation
  • Marking: deferring the handling to consumers
  • Shedding: cancelling all subsequent processing (improving efficiency)

[Figure: operations O1 to O7 between suppliers s1 to s5 and consumers c1 to c4, with events labeled es1, es2, es3.]
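
The marking and shedding responses can be sketched as follows. This is an illustration under stated assumptions, not the CPEP code: `Event`, `Policy`, `combine`, and `enforce_absolute` are hypothetical names, times are integer milliseconds, and a derived event is assumed to inherit the earliest expiry among its inputs (one reading of "tracking the earliest end time of the validity interval").

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    MARK = "mark"   # flag the event and let consumers decide
    SHED = "shed"   # drop the event and skip all subsequent processing


@dataclass
class Event:
    payload: str
    expires_at_ms: int     # earliest end of validity interval among contributing events
    stale: bool = False    # set when the event has been marked as inconsistent


def combine(payload, inputs):
    """Build a derived event that inherits the earliest expiry of its inputs."""
    return Event(payload, min(e.expires_at_ms for e in inputs))


def enforce_absolute(event, now_ms, policy):
    """Return the event to pass downstream, or None if it was shed."""
    if now_ms <= event.expires_at_ms:
        return event            # still within its validity interval
    if policy is Policy.SHED:
        return None
    event.stale = True          # marking: defer the handling to consumers
    return event


# Created at t = 10000 ms and valid for 50 ms; created at t = 10020 ms and valid for 100 ms.
vibration = Event("vibration", expires_at_ms=10000 + 50)
temperature = Event("temperature", expires_at_ms=10020 + 100)
fused = combine("fused", [vibration, temperature])
```

With these values `fused` expires at 10050 ms, so a check at 10060 ms either drops it (shedding) or forwards it with `stale` set (marking).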

slide-15
SLIDE 15

Enforcing Relative Time Consistency

• Maintaining an ordered list of events' timestamps
  • One timestamp per event type
  • Comparing the maximum time difference with the validity interval
• Responses to a consistency violation
  • Marking: deferring the handling to consumers
  • Shedding: cancelling all subsequent processing (improving efficiency)

[Figure: operations O1 to O7 between suppliers s1 to s5 and consumers c1 to c4, with events labeled es1, es2, es3, and es1'.]
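
The ordered-timestamp check can be sketched like this. It is a minimal illustration, not the CPEP data structure: `RelativeConsistencyTracker` is a hypothetical name, timestamps are integer milliseconds, and a newer event of a given type is assumed to replace the older timestamp of that type.

```python
import bisect


class RelativeConsistencyTracker:
    """Hypothetical per-operator tracker holding one creation timestamp per event type."""

    def __init__(self, validity_interval_ms):
        self.validity_interval_ms = validity_interval_ms
        self._by_type = {}   # event type -> most recent creation timestamp (ms)
        self._sorted = []    # the same timestamps, kept in ascending order

    def update(self, event_type, created_ms):
        old = self._by_type.get(event_type)
        if old is not None:
            # Remove the previous timestamp of this type from the ordered list.
            self._sorted.pop(bisect.bisect_left(self._sorted, old))
        self._by_type[event_type] = created_ms
        bisect.insort(self._sorted, created_ms)

    def is_consistent(self):
        """Compare the maximum creation-time difference with the validity interval."""
        if len(self._sorted) < 2:
            return True
        return self._sorted[-1] - self._sorted[0] <= self.validity_interval_ms


tracker = RelativeConsistencyTracker(validity_interval_ms=20)
tracker.update("s1", 1000)
tracker.update("s2", 1015)
ok_after_two = tracker.is_consistent()      # spread is 15 ms
tracker.update("s3", 1030)
ok_after_three = tracker.is_consistent()    # spread is 30 ms
tracker.update("s1", 1025)                  # a fresher s1 replaces the old one
ok_after_refresh = tracker.is_consistent()  # spread is 15 ms again
```

Keeping the list sorted means the maximum difference is always the last element minus the first, so the check itself is constant time.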

slide-16
SLIDE 16

Experiment design

• IIoT workload:
  • Filtering
  • Data transform
  • Encryption
• Test-bed configuration:
  • Machine 1: Suppliers
  • Machine 2: CPEP
  • Machine 3: Consumers
• Comparison baseline:
  • Apache Flink stream processing framework [1]

[Figure: workload topology. Suppliers s1 to s8 feed EKF, FFT, CAT, and AES operations (EKF1 to EKF4, FFT1 to FFT3, CAT1, AES1 to AES3) at high, middle, and low priority, delivering to consumers c1 to c3; rates shown: 50 Hz, 200 Hz, 100 Hz.]

[1] https://flink.apache.org

slide-17
SLIDE 17

Latency performance

[Figure: 99th percentile latency (unit: ms) for high, middle, and low priority levels.]

• CPEP maintained high-priority latency performance as workload increased.
• CPEP differentiated latency according to priority level.

slide-18
SLIDE 18

Benefits of shedding inconsistent events

• Improve the throughput of consistent events.
• Save CPU utilization.

slide-19
SLIDE 19

Effectiveness of CPEP Sharing

• Experiment setup
• Results of sharing vs. non-sharing

[Figure: experiment topology. Suppliers s1 to s8 feed EKF, FFT, CAT, and AES operations (EKF1 to EKF4, FFT1 to FFT4, CAT1 to CAT3, AES1 to AES4) at high, middle, and low priority, delivering to consumers c1 to c3; rates shown: 100 Hz. Plot: latency of low-priority processing.]

CPEP sharing helped reduce latency.

slide-20
SLIDE 20

Effectiveness of Sharing

• Results of sharing vs. non-sharing

[Figure: two plots comparing sharing and non-sharing: latency of low-priority processing, and CPU utilization.]

slide-21
SLIDE 21

Outline

• CPEP: Real-time cyber-physical event processing
• FRAME: Fault-tolerant real-time messaging
• ARREC: Adaptive real-time reliable edge computing

[Figure: diagram of efficiency, timeliness, and loss-tolerance highlighting FRAME; original TAO components (Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies) shown next to FRAME (Supplier Proxies, Consumer Proxies).]

slide-22
SLIDE 22

Message loss-tolerance requirement

• Application-specific requirements for an IIoT service
  • Li: the tolerable number of consecutive losses for topic i

Value of Li and application examples:
  • Li = 0: emergency response; predictive maintenance
  • Li = k > 0: condition monitoring

(Within the tolerable number, applications may use estimates for the missing data.)

Image source: https://www.originlab.com/doc/Origin-Help/Math-Inter-Extrapoltate-YfromX
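
A subscriber-side check of this requirement might look like the sketch below. The function names are hypothetical, messages are assumed to carry consecutive integer sequence numbers, and linear interpolation is only one assumed way to estimate missing data; the slide does not prescribe an estimator.

```python
def max_consecutive_losses(received_seq_nums):
    """Longest run of missing sequence numbers between consecutively received messages."""
    longest = 0
    for prev, cur in zip(received_seq_nums, received_seq_nums[1:]):
        longest = max(longest, cur - prev - 1)
    return longest


def meets_loss_tolerance(received_seq_nums, tolerable_losses):
    """True if no gap exceeds the tolerable number of consecutive losses for the topic."""
    return max_consecutive_losses(received_seq_nums) <= tolerable_losses


def fill_by_linear_interpolation(samples):
    """samples maps sequence number -> value; estimate the values inside each gap."""
    seqs = sorted(samples)
    filled = dict(samples)
    for lo, hi in zip(seqs, seqs[1:]):
        for s in range(lo + 1, hi):
            frac = (s - lo) / (hi - lo)
            filled[s] = samples[lo] + frac * (samples[hi] - samples[lo])
    return filled


# Messages 2, 3, and 4 were lost on this topic.
samples = {1: 10.0, 5: 18.0, 6: 20.0}
received = sorted(samples)
losses = max_consecutive_losses(received)
estimates = fill_by_linear_interpolation(samples)
```

A condition-monitoring topic that tolerates three consecutive losses would accept this trace and use the estimates; a topic that tolerates none would not.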

slide-23
SLIDE 23

Fault-tolerance model

• A crash failure may happen to an IIoT service host (fail-stop)
• Lost messages may be recovered
  1. via retransmissions from message publishers
  2. via a backup service [1]

[Figure: IIoT devices, edge applications, and cloud applications connected through a Primary service and a Backup service.]

[1] Budhiraja, N., Marzullo, K., Schneider, F.B. and Toueg, S., 1993. The primary-backup approach. Distributed systems, 2, pp.199-216.

slide-24
SLIDE 24

Fault-tolerant real-time processing

• Specify provable deadlines for message replication and dispatch
• Co-schedule replication and dispatch using, e.g., earliest-deadline-first (EDF)

[Figure: IIoT devices, edge applications, and cloud applications connected through a Primary service and a Backup service, with dispatch and replication paths.]
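
Co-scheduling the two activities under EDF can be sketched with a single deadline-ordered queue. This is an illustration only: `EdfScheduler` and its methods are hypothetical names, times are integer milliseconds, and each arriving message is assumed to release one dispatch job and one replication job with their own relative deadlines.

```python
import heapq


class EdfScheduler:
    """Hypothetical EDF queue shared by dispatch and replication jobs."""

    def __init__(self):
        self._queue = []
        self._count = 0   # insertion counter, breaks ties between equal deadlines

    def release(self, kind, msg_id, release_ms, relative_deadline_ms):
        absolute_deadline = release_ms + relative_deadline_ms
        heapq.heappush(self._queue, (absolute_deadline, self._count, kind, msg_id))
        self._count += 1

    def next_job(self):
        """Pop the job with the earliest absolute deadline, or None if idle."""
        if not self._queue:
            return None
        absolute_deadline, _, kind, msg_id = heapq.heappop(self._queue)
        return kind, msg_id, absolute_deadline

    def drain(self):
        jobs = []
        job = self.next_job()
        while job is not None:
            jobs.append(job)
            job = self.next_job()
        return jobs


sched = EdfScheduler()
# Message m1 arrives at t = 0 ms, message m2 at t = 10 ms (made-up deadlines).
sched.release("dispatch", "m1", release_ms=0, relative_deadline_ms=30)
sched.release("replicate", "m1", release_ms=0, relative_deadline_ms=80)
sched.release("dispatch", "m2", release_ms=10, relative_deadline_ms=100)
sched.release("replicate", "m2", release_ms=10, relative_deadline_ms=40)
execution_order = sched.drain()
```

Because both job types share one queue, the replication of m2 (deadline 50 ms) runs before the replication of m1 (deadline 80 ms) and before the dispatch of m2 (deadline 110 ms).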

slide-25
SLIDE 25

Necessary condition for a message loss

• A message may be lost only if both
  1. the publisher has deleted its copy, and
  2. a copy of the message has not been replicated to the Backup

[Figure: events between a message's creation and its delivery.]

slide-26
SLIDE 26

Deadlines for dispatch and replication

• Deadline for dispatch:
• Deadline for replication:

Notation:
  • Ni: number of most-recent messages a publisher can retransmit
  • Li: loss-tolerance requirement
  • Ti: the topic's sending period
  • [symbol missing]: latency requirement

[Figure: two message timelines. Publisher, Primary Broker, Subscriber (segments PB and BS); Publisher, Primary Broker, Backup Broker, Subscriber (segments PB and BB) with a crash marked, annotated with x and (Ni + Li) Ti.]

The deadline specifications help in configuring IIoT traffic/platform parameters.
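
To show how such deadlines could feed configuration, the sketch below assumes one plausible reading of the figure labels: the dispatch budget is the latency requirement minus the PB and BS delays, and the replication budget is (Ni + Li) Ti minus the PB and BB delays and x. These formulas, the interpretation of x as a fail-over time, and all names (`Topic`, `Platform`, and the two functions) are assumptions for illustration, not the deadlines as stated on the slide.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    period_ms: int          # Ti: sending period
    latency_req_ms: int     # latency requirement
    loss_tolerance: int     # Li: tolerable consecutive losses
    retransmit_window: int  # Ni: most-recent messages the publisher can retransmit


@dataclass
class Platform:
    pub_to_primary_ms: int     # assumed meaning of "PB"
    primary_to_sub_ms: int     # assumed meaning of "BS"
    primary_to_backup_ms: int  # assumed meaning of "BB"
    failover_ms: int           # assumed meaning of "x"


def dispatch_deadline(topic, platform):
    """Assumed form: time left for dispatch after the two network hops."""
    return topic.latency_req_ms - platform.pub_to_primary_ms - platform.primary_to_sub_ms


def replication_deadline(topic, platform):
    """Assumed form: time before the publisher's retransmit window stops covering a message."""
    window_ms = (topic.retransmit_window + topic.loss_tolerance) * topic.period_ms
    return (window_ms - platform.pub_to_primary_ms
            - platform.primary_to_backup_ms - platform.failover_ms)


platform = Platform(pub_to_primary_ms=1, primary_to_sub_ms=1,
                    primary_to_backup_ms=1, failover_ms=60)
topic = Topic(period_ms=100, latency_req_ms=50, loss_tolerance=0, retransmit_window=2)
d_dispatch = dispatch_deadline(topic, platform)
d_replicate = replication_deadline(topic, platform)

# Raising Ni by one adds a full period to the assumed replication budget.
wider = Topic(period_ms=100, latency_req_ms=50, loss_tolerance=0, retransmit_window=3)
d_replicate_wider = replication_deadline(wider, platform)
```

Under these assumptions, a configuration tool could search for the smallest Ni that keeps the replication budget positive for every topic.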

slide-27
SLIDE 27

The FRAME messaging architecture

• EDF scheduling to dispatch/replicate a message
• Suppress replication if the dispatch deadline is smaller than the replication deadline
• Prune dispatched messages

[Figure: FRAME architecture. Data publishers and data subscribers connect to a Primary Broker and a Backup Broker; each broker contains a Message Proxy, Replicators, Dispatchers, and Message Delivery.]
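
The suppression rule and the pruning step can be sketched as below. This is a hedged illustration rather than FRAME's code: `needs_replication` and `BackupBuffer` are hypothetical names, and pruning is assumed to mean that the Backup discards its copy of a message once the Primary has dispatched it.

```python
def needs_replication(dispatch_deadline_ms, replication_deadline_ms):
    """Suppress replication when the dispatch deadline is the smaller of the two."""
    return not (dispatch_deadline_ms < replication_deadline_ms)


class BackupBuffer:
    """Hypothetical store of replicated copies held by the Backup Broker."""

    def __init__(self):
        self._copies = {}

    def store(self, msg_id, payload):
        self._copies[msg_id] = payload

    def prune(self, msg_id):
        # Assumed behavior: drop the copy once the Primary has dispatched the message.
        self._copies.pop(msg_id, None)

    def pending(self):
        """Message ids the Backup would still have to send after a Primary crash."""
        return sorted(self._copies)


backup = BackupBuffer()
backup.store(1, "reading-1")
backup.store(2, "reading-2")
backup.prune(1)                  # message 1 was dispatched by the Primary
to_resend = backup.pending()

skip = needs_replication(dispatch_deadline_ms=48, replication_deadline_ms=138)
keep = needs_replication(dispatch_deadline_ms=48, replication_deadline_ms=20)
```

Without pruning, the Backup would re-send message 1 after a crash even though subscribers already have it.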

slide-28
SLIDE 28

Experiment design

• IIoT topic configuration:
• Test-bed deployment:
• Service configurations:
  • FRAME+; FRAME; FCFS; FCFS-

[Figure: test-bed deployment. Edge publishers EP1 and EP2; Primary Broker and Backup Broker (B1, B2); edge subscribers ES1 and ES2; cloud subscriber CS1.]

slide-29
SLIDE 29

Loss tolerance performance

[Figure: success rate of meeting loss-tolerance requirements.]

• FRAME succeeded in assuring loss-tolerance.
• A small increase in Ni can greatly improve performance (FRAME+).

slide-30
SLIDE 30

Latency performance for recovery

[Figure: latency performance when the Primary host crashed.]

• FRAME can mitigate the latency penalty.
• Without suppressing replication, pruning could overload the system.
• No pruning, however, could cause a latency penalty by re-sending obsolete messages.

slide-31
SLIDE 31

Latency during fault-free operations

[Figure: success rate of meeting soft latency requirements.]

• All configurations gave similar latency performance.
• Observation: both replication and pruning could delay message dispatching…
• How to improve efficiency?

slide-32
SLIDE 32

Outline

• CPEP: Real-time cyber-physical event processing
• FRAME: Fault-tolerant real-time messaging
• ARREC: Adaptive real-time reliable edge computing

[Figure: diagram of efficiency, timeliness, and loss-tolerance highlighting ARREC; original TAO components (Supplier Proxies, Subscription & Filtering, Event Correlation, Dispatching, Consumer Proxies) shown next to TAO with ARREC (Supplier Proxies, Consumer Proxies).]

slide-33
SLIDE 33

Edge Computing for IIoT

• Timely, reliable, and efficient IIoT edge computing
  • CPEP: Real-time cyber-physical event processing
  • FRAME: Fault-tolerant real-time messaging
  • ARREC: Adaptive real-time reliable edge computing

[Figure: CPEP, FRAME, and ARREC on the diagram of efficiency, timeliness, and loss-tolerance, next to the IIoT architecture: a wireless sensor network (e.g., in a wind farm), IIoT services on edge hosts (Edge 1, Edge 2, ..., Edge N), and a private cloud for training and storage (machine learning training, database, applications).]

slide-34
SLIDE 34

References

• C. Wang, C. Gill and C. Lu, Real-Time Middleware for Cyber-Physical Event Processing, ACM Transactions on Cyber-Physical Systems, Special Issue on Real-Time aspects in Cyber-Physical Systems, 3(3), Article 29, August 2019.
• C. Wang, C. Gill and C. Lu, FRAME: Fault Tolerant and Real-Time Messaging for Edge Computing, IEEE International Conference on Distributed Computing Systems (ICDCS'19), July 2019.
• C. Wang, C. Gill and C. Lu, Adaptive Data Replication in Real-Time Reliable Edge Computing for Internet of Things, ACM/IEEE International Conference on Internet of Things Design and Implementation (IoTDI'20), April 2020.