<Title> Yiqun Hu, SP Group Agenda Condition monitoring - - PowerPoint PPT Presentation



slide-1
SLIDE 1

<Title>

Autoencoder Forest for Anomaly Detection from IoT Time Series

Yiqun Hu, SP Group

Data Council Singapore 2019

slide-2
SLIDE 2

Agenda

  • Condition monitoring & anomaly detection
  • Autoencoder for anomaly detection
  • Autoencoder Forest
  • End-to-end workflow
  • Experiment results
slide-3
SLIDE 3

Condition Monitoring & Anomaly Detection

slide-4
SLIDE 4

Condition monitoring

slide-5
SLIDE 5
  • Manual monitoring

– Huge human effort
– Boring task with low quality

  • Rule-based method

– Cannot differentiate between different environments
– Cannot adapt to different conditions of the equipment

  • Data-driven method

– Model the common behavior of the equipment

Time-series anomaly detection

slide-6
SLIDE 6

Autoencoder for Anomaly Detection

slide-7
SLIDE 7

Autoencoder

  • What is an autoencoder

– An encoder-decoder type of neural network architecture that is used for self-learning from unlabeled data

  • The idea of an autoencoder

– Learn how to compress data into a concise representation that allows reconstruction with minimum error

  • Different variants of autoencoder

– Variational Autoencoder
– LSTM Autoencoder
– Etc.

Autoencoder Neural Network

slide-8
SLIDE 8

Autoencoder for anomaly detection

[Figure labels: Offline Training, Reconstruction errors, Online Detection, Anomaly score]
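The offline-training and online-detection split can be sketched as follows. This is a minimal illustration, not the speaker's implementation: a linear autoencoder fitted with an SVD (equivalent to PCA) stands in for the neural network, and the sine-wave data, `code_size` and all function names are assumptions made for the example.

```python
import numpy as np

def fit_linear_autoencoder(train_windows, code_size):
    """Offline training: learn a linear encoder/decoder from normal windows."""
    mean = train_windows.mean(axis=0)
    # Rows of vt are principal directions; the top code_size rows form the encoder.
    _, _, vt = np.linalg.svd(train_windows - mean, full_matrices=False)
    return mean, vt[:code_size]

def anomaly_score(window, mean, components):
    """Online detection: mean squared reconstruction error of one window."""
    code = (window - mean) @ components.T        # encode to code_size numbers
    reconstruction = code @ components + mean    # decode back to window length
    return float(np.mean((window - reconstruction) ** 2))

# Hypothetical normal behaviour: a daily sine with varying amplitude plus small noise.
rng = np.random.default_rng(0)
t = np.arange(24)
amplitudes = rng.uniform(0.8, 1.2, size=(200, 1))
train = amplitudes * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.01, size=(200, 24))
mean, components = fit_linear_autoencoder(train, code_size=2)

normal = np.sin(2 * np.pi * t / 24)
spiky = normal.copy()
spiky[10] += 3.0  # inject an anomaly the model has never seen

normal_score = anomaly_score(normal, mean, components)
spiky_score = anomaly_score(spiky, mean, components)
```

A window that resembles the training data reconstructs almost perfectly, while the injected spike cannot be represented in the compressed code, so its score is much larger.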

slide-9
SLIDE 9

Autoencoder Forest

slide-10
SLIDE 10

A key challenge of autoencoder

Single Autoencoder

slide-11
SLIDE 11

The idea of autoencoder forest

[Figure: scatter of points in three groups, marked x, o and +]

slide-12
SLIDE 12

Clustering subsequences is meaningless

[1] Eamonn Keogh, Jessica Lin, "Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research"

slide-13
SLIDE 13

Autoencoder forest based on time

0:00 1:00 1:30 22:00 23:30
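One way to read the time-based forest is that each time-of-day slot owns the windows that end at that time, and one autoencoder is trained per slot. The sketch below only shows the grouping step; the half-hourly slot width, the window size and the function names are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def slot_of(timestamp, step_minutes):
    """Index of the time-of-day slot that a timestamp falls into."""
    return (timestamp.hour * 60 + timestamp.minute) // step_minutes

def group_windows_by_slot(timestamps, values, window_size, step_minutes):
    """Collect, per slot, every window that ends at that time of day."""
    groups = defaultdict(list)
    for end in range(window_size - 1, len(values)):
        window = values[end - window_size + 1 : end + 1]
        groups[slot_of(timestamps[end], step_minutes)].append(window)
    return groups

# Hypothetical data: 3 days of half-hourly readings.
start = datetime(2018, 9, 1)
timestamps = [start + timedelta(minutes=30 * i) for i in range(48 * 3)]
values = [float(i % 48) for i in range(len(timestamps))]
groups = group_windows_by_slot(timestamps, values, window_size=4, step_minutes=30)
```

Each entry of `groups` would then become the training set of one autoencoder in the forest.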

slide-14
SLIDE 14

Training autoencoder forest

Input layer: (window_size, 1)

Encoder layer 1: (window_size/2, 1)

Encoder layer 2: (window_size/4, 1)

Decoder layer 1: (window_size/2, 1)

Decoder layer 2: (window_size, 1)

  • The structure is fixed for every autoencoder (try to make it as generic as possible).

  • Each autoencoder within the forest is independent, so the training is naturally parallelizable.

  • Using an early stopping mechanism, the training of each individual autoencoder can be stopped at similar accuracy.
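The layer shapes and the early-stopping idea can be sketched without a deep learning framework. The stopping rule below (stop once the loss reaches a shared `target_loss`) is an assumption, since the slide only says that early stopping lets each autoencoder stop at similar accuracy; `fake_trainer` is a hypothetical stand-in for one epoch of real training.

```python
def layer_shapes(window_size):
    """Output shape of each layer, following the halving scheme on the slide."""
    return [
        ("input", (window_size, 1)),
        ("encoder_1", (window_size // 2, 1)),
        ("encoder_2", (window_size // 4, 1)),
        ("decoder_1", (window_size // 2, 1)),
        ("decoder_2", (window_size, 1)),
    ]

def train_with_early_stopping(run_epoch, target_loss, max_epochs):
    """Run epochs until the loss reaches target_loss or max_epochs is used up."""
    epochs_run, loss = 0, float("inf")
    while epochs_run < max_epochs and loss > target_loss:
        loss = run_epoch()
        epochs_run += 1
    return epochs_run, loss

def fake_trainer(decay):
    """Hypothetical epoch function whose loss shrinks by `decay` every call."""
    state = {"loss": 1.0}
    def run_epoch():
        state["loss"] *= decay
        return state["loss"]
    return run_epoch

shapes = layer_shapes(48)
# Two autoencoders that converge at different speeds stop at a similar loss.
fast = train_with_early_stopping(fake_trainer(0.5), target_loss=0.05, max_epochs=100)
slow = train_with_early_stopping(fake_trainer(0.9), target_loss=0.05, max_epochs=100)
```

Because each call is independent of the others, the per-autoencoder training runs could be dispatched to separate workers.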

slide-15
SLIDE 15

Autoencoder Forest

[Figure panels: Single Autoencoder; Autoencoder Forest]

slide-16
SLIDE 16

End-to-end Workflow

slide-17
SLIDE 17

Automatic end-to-end workflow

Training: Time series analysis → Train data preprocessing → Train window extraction → Autoencoder forest training

Anomaly detection: Test data preprocessing → Test window extraction → Anomaly scoring

slide-18
SLIDE 18

Periodic pattern analysis

  • Automatically determine the repeating period in the time series

– Calculate autocorrelations at different lags
– Find the strong local maxima of the autocorrelation
– Calculate the interval between any two local maxima
– Find the mode of the intervals
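The period-detection steps can be sketched in plain Python. Several details are assumptions, not taken from the slides: a "strong" maximum is taken to mean an autocorrelation above `threshold`, intervals are measured between consecutive maxima (with lag 0 counted as the first one), and the function names are illustrative.

```python
import math
from collections import Counter

def autocorrelation(values, lag):
    """Sample autocorrelation of a series at a given lag."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    covariance = sum((values[i] - mean) * (values[i + lag] - mean)
                     for i in range(n - lag))
    return covariance / variance

def estimate_period(values, max_lag, threshold=0.5):
    """Mode of the intervals between strong local maxima of the autocorrelation."""
    acf = [autocorrelation(values, lag) for lag in range(max_lag + 1)]
    # Lag 0 always has autocorrelation 1, so it is treated as the first maximum.
    peaks = [0] + [lag for lag in range(1, max_lag)
                   if acf[lag] > threshold
                   and acf[lag] > acf[lag - 1]
                   and acf[lag] > acf[lag + 1]]
    intervals = [b - a for a, b in zip(peaks, peaks[1:])]
    if not intervals:
        return None  # no repeating pattern found within max_lag
    return Counter(intervals).most_common(1)[0][0]

# Hypothetical series: 10 repetitions of a 24-sample cycle.
series = [math.sin(2 * math.pi * i / 24) for i in range(240)]
period = estimate_period(series, max_lag=100)
```

Taking the mode of the intervals, not the first peak alone, keeps the estimate stable when a single peak is shifted or missed.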

slide-19
SLIDE 19

Missing data handling

Misalignment: 3:05 3:10 3:15 3:20 … 16:15 16:21 16:24 16:30

Missing: 3:05 3:10 3:15 3:20 … 16:15 (16:20 – 16:40) 16:45

  • No need to impute
  • If the missing gap is small, impute with neighbouring points;
  • If the missing gap is large, impute with the same time of other periods.
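The two imputation rules can be sketched as below, assuming the series has already been aligned to a regular grid with `None` marking missing points. The threshold `max_small_gap`, the use of linear interpolation for small gaps and the averaging over other periods for large gaps are illustrative assumptions.

```python
def impute(values, period, max_small_gap=2):
    """Fill None entries: interpolate small gaps, reuse other periods for large ones."""
    filled = list(values)
    n = len(values)
    i = 0
    while i < n:
        if values[i] is not None:
            i += 1
            continue
        j = i
        while j < n and values[j] is None:
            j += 1  # j ends up just past the gap
        gap = j - i
        left = values[i - 1] if i > 0 else None
        right = values[j] if j < n else None
        for k in range(i, j):
            if gap <= max_small_gap and left is not None and right is not None:
                # Small gap: linear interpolation between the neighbouring points.
                fraction = (k - i + 1) / (gap + 1)
                filled[k] = left + fraction * (right - left)
            else:
                # Large gap: average of observed values at the same time of other periods.
                same_time = [values[m] for m in range(k % period, n, period)
                             if values[m] is not None]
                if same_time:
                    filled[k] = sum(same_time) / len(same_time)
        i = j
    return filled

# Hypothetical series with period 4: one single-point gap and one three-point gap.
observed = [0, 10, 20, 30] * 3
observed[5] = None
observed[8] = observed[9] = observed[10] = None
filled = impute(observed, period=4)
```

A point stays `None` only when no other period has an observation at the same position.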

slide-20
SLIDE 20

Anomaly scoring

Extract the sequence window ending at time t

The corresponding autoencoder reconstructs the sequence window at time t

Compute the reconstruction error as the anomaly score

[Figure labels: Median profile; Learned autoencoder forest]
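The scoring step can be sketched as a lookup followed by an error computation. In this illustration the forest is a plain dictionary from time slot to a reconstruction function, and `make_constant_model` is a hypothetical stand-in that always returns a fixed window in place of a trained autoencoder; mean squared error is assumed as the reconstruction error.

```python
def anomaly_score_at(values, slots, t, window_size, forest):
    """Reconstruction error of the window ending at index t."""
    if t < window_size - 1:
        raise ValueError("not enough history for a full window")
    window = values[t - window_size + 1 : t + 1]
    reconstruct = forest[slots[t]]  # pick the autoencoder for this time slot
    reconstruction = reconstruct(window)
    return sum((x - y) ** 2 for x, y in zip(window, reconstruction)) / window_size

def make_constant_model(profile):
    """Stand-in for a trained autoencoder: always returns the same window."""
    return lambda window: profile

# Hypothetical series with period 4 and one abnormal reading at index 6.
values = [0, 10, 20, 30, 0, 10, 50, 30, 0, 10, 20, 30]
slots = [i % 4 for i in range(len(values))]
forest = {
    0: make_constant_model([30, 0]),
    1: make_constant_model([0, 10]),
    2: make_constant_model([10, 20]),
    3: make_constant_model([20, 30]),
}
scores = [anomaly_score_at(values, slots, t, 2, forest) for t in range(1, len(values))]
```

Every window that contains the abnormal reading gets a high score, so an anomaly affects as many consecutive scores as the window is long.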

slide-21
SLIDE 21

Experiment Results

slide-22
SLIDE 22

Cooling tower – return water temperature

slide-23
SLIDE 23

Chiller – chilled water return temperature

slide-24
SLIDE 24

Smart meter – half hour consumption

Normal data: 2018-12-03 22:00:00

Top 3 detected anomalies: 2018-09-27 14:30:00, 2018-10-06 22:30:00, 2018-09-07 15:30:00

slide-25
SLIDE 25
slide-26
SLIDE 26

A common platform for time series data, with built-in AI capabilities

slide-27
SLIDE 27

powering the nation