STAD-HD: Spatial Temporal Anomaly Detection for Heterogeneous Data



SLIDE 1

STAD-HD: Spatial Temporal Anomaly Detection for Heterogeneous Data through Visual Analytics

Solution for 2016 VAST Challenge MC2 & MC3

Yu Zhang¹, Guozheng Li¹, Chufan Lai¹, Qiangqiang Liu¹, Shuai Chen¹, Lu Feng¹, Tangzhi Ye¹, Siming Chen¹, Ren Zuo¹, Zhuo Zhang², Zhanyi Wang², Xin Huang², Fengchao Xu², Li Yu², Shunlong Zhang², Qiusheng Li², Xiaoru Yuan¹

¹Peking University and ²Qihoo 360 Co. Ltd.

SLIDE 2

Outline

  • Data description
  • System introduction
  • Cases
  • Conclusion

SLIDE 3

Data description

  • GAStech company moved to a three-storey building
  • 125 employees
  • 41 energy zones vs. 23 prox zones
  • 14 days’ static data + 60 hours’ streaming data

SLIDE 4

Data description

  • Building data
    • Energy zone (≤12 attributes)
    • Floor (≤11 attributes)
    • Building (16 attributes)
  • Prox data
    • Mobile sensor
    • Fixed sensors

SLIDE 5

Data description

  • Prox data

[Figure labels: Prox card, Prox zone, Robot]

SLIDE 6

Data description

  • Prox data
    • Prox-zone detection: time accurate, position inaccurate
    • Robot detection: time inaccurate, position accurate

SLIDE 7

Task

  • Typical patterns
  • Notable anomalies
  • Relationships between two types of data

SLIDE 8

Task

  • Typical patterns
  • Notable anomalies
  • Relationships between two types of data
  • Design requirement: spatial and temporal filters

SLIDE 9

Task

  • “Pattern”
  • “Anomaly”
  • Neither term is well-defined

SLIDE 10

Work flow

SLIDE 11

Work flow

  • Two systems
    • System A (Labelling System)
      • Basic visualizations + labelling
      • Stores insights about the data
    • System B (Analysis System)
      • Exploits the insights from System A to reduce the ambiguity of the tasks
      • Anomaly detection as the entry point

SLIDE 12

System A: Time Series Interface

[Figure labels: Spatial filter, Attribute filter]

SLIDE 13

Time Series Interface

System A
  • Insight 1: Curves on weekdays differ from those on weekends
  • Insight 2: Periodic pattern on weekdays

System B
  • Anomaly detection metric: template of a weekday (show anomaly)
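The slide names the metric but not how it is computed. A minimal sketch of one way a weekday template could work, assuming readings arrive as `(datetime, value)` pairs for a single zone and attribute: average the weekday values per time-of-day slot, then flag weekday readings that stray too far from that average. The function names, the threshold `k`, and the floor `min_std` are assumptions made for illustration and do not come from the presentation.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean, pstdev


def build_weekday_template(readings):
    """Average weekday readings per time-of-day slot.

    readings: iterable of (datetime, float) pairs for one zone and attribute.
    Returns {(hour, minute): (mean, std)} computed from Monday-Friday only.
    """
    slots = defaultdict(list)
    for ts, value in readings:
        if ts.weekday() < 5:  # 0-4 are Monday to Friday
            slots[(ts.hour, ts.minute)].append(value)
    return {slot: (mean(vals), pstdev(vals)) for slot, vals in slots.items()}


def find_anomalies(readings, template, k=3.0, min_std=0.5):
    """Return weekday readings more than k deviations away from the template.

    k and min_std are assumed tuning parameters; min_std keeps a perfectly
    flat template from flagging every tiny fluctuation.
    """
    anomalies = []
    for ts, value in readings:
        slot = (ts.hour, ts.minute)
        if ts.weekday() >= 5 or slot not in template:
            continue  # weekends follow a different curve, so skip them here
        mu, sigma = template[slot]
        if abs(value - mu) > k * max(sigma, min_std):
            anomalies.append((ts, value))
    return anomalies


# Demo on synthetic data (values are made up for illustration).
history = []
first_monday = datetime(2016, 5, 30)
for d in range(5):  # Monday to Friday
    for hour in range(24):
        value = 25.0 if 8 <= hour < 18 else 20.0
        history.append((first_monday + timedelta(days=d, hours=hour), value))

template = build_weekday_template(history)
new_readings = [
    (datetime(2016, 6, 6, 12, 0), 25.2),  # close to the template
    (datetime(2016, 6, 6, 13, 0), 40.0),  # far above the template
]
flagged = find_anomalies(new_readings, template)
```

Only the second synthetic reading ends up in `flagged`. Weekend readings are skipped on purpose, since Insight 1 says they follow a different curve and would need their own template.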

SLIDE 14

System B: Time Series Interface

[Figure labels: Spatial filter, Attribute filter, Temporal filter, Anomaly detection]

SLIDE 15

System B: Warning Stack

  • Stores all the anomalies with the metadata <time, position, attribute>
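The `<time, position, attribute>` metadata suggests a simple record type plus a container that can be filtered along each of the three dimensions. The sketch below is one possible shape for such a structure; the class names, the `query` method, and the zone and attribute strings are hypothetical and not taken from the system itself.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class AnomalyRecord:
    time: datetime
    position: str   # e.g. a zone identifier (hypothetical format)
    attribute: str  # e.g. a sensor attribute name (hypothetical)


class WarningStack:
    """Keeps every reported anomaly and supports simple filtering."""

    def __init__(self) -> None:
        self._records: List[AnomalyRecord] = []

    def push(self, record: AnomalyRecord) -> None:
        self._records.append(record)

    def query(
        self,
        start: Optional[datetime] = None,
        end: Optional[datetime] = None,
        position: Optional[str] = None,
        attribute: Optional[str] = None,
    ) -> List[AnomalyRecord]:
        """Return matching records, most recently pushed first."""
        result = []
        for record in reversed(self._records):
            if start is not None and record.time < start:
                continue
            if end is not None and record.time > end:
                continue
            if position is not None and record.position != position:
                continue
            if attribute is not None and record.attribute != attribute:
                continue
            result.append(record)
        return result


# Demo with made-up zones and attributes.
stack = WarningStack()
stack.push(AnomalyRecord(datetime(2016, 6, 7, 9, 0), "F2Z1", "air temperature"))
stack.push(AnomalyRecord(datetime(2016, 6, 7, 9, 5), "F3Z9", "equipment power"))
stack.push(AnomalyRecord(datetime(2016, 6, 8, 14, 0), "F2Z1", "equipment power"))

zone_hits = stack.query(position="F2Z1")
day_hits = stack.query(start=datetime(2016, 6, 7), end=datetime(2016, 6, 7, 23, 59))
```

Filtering on the same three keys that the spatial, temporal, and attribute filters use would let a warning selected in the stack be located directly in the other views.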

SLIDE 16

System A: Trajectory Interface

[Figure labels: Labels of one trajectory, Cross filters, All the defined labels]

SLIDE 17

Trajectory Interface

System A
  • Insight 1: Physically impossible events / suspicious events
  • Insight 2: Meeting events

System B
  • Design requirement 1: Give warnings (show anomaly)
  • Design requirement 2: Directly visualize the meeting (show pattern)

SLIDE 18

Trajectory Interface

System A
  • Insight 1: Physically impossible events / suspicious events
  • Insight 2: Meeting events

System B
  • Design requirement 1: Give warnings (show anomaly)
  • Design requirement 2: Directly visualize the meeting (show pattern)

Anomaly detection metric:
  1. Moving between zones that are not adjacent - Strong
  2. Conflict between the two trajectory data sources - Strong
  3. Staying in a zone that contains neither the employee's office nor a public area - Weak
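The three rules above can be sketched as checks over a time-ordered list of zone visits. This is an illustrative reading rather than the system's actual implementation: it assumes visits are `(timestamp, zone)` pairs sorted by time, that the second data source has already been mapped to zones, and that rule 3 ignores how long the person stays. All names and zone identifiers are hypothetical.

```python
from bisect import bisect_right


def zone_at(visits, t):
    """Zone of the latest visit at or before time t, or None if there is none."""
    times = [ts for ts, _ in visits]
    i = bisect_right(times, t) - 1
    return visits[i][1] if i >= 0 else None


def check_trajectory(visits, adjacency, office_zone, public_zones, other_source=()):
    """Return (timestamp, rule, strength) warnings for one employee.

    visits: time-sorted (timestamp, zone) pairs from the fixed prox sensors.
    adjacency: {zone: set of zones reachable in one move} (assumed input).
    other_source: (timestamp, zone) pairs from the second data source.
    """
    warnings = []

    # Rule 1 (strong): a move between zones that are not adjacent.
    for (_, prev_zone), (ts, zone) in zip(visits, visits[1:]):
        if zone != prev_zone and zone not in adjacency.get(prev_zone, set()):
            warnings.append((ts, "non-adjacent move", "strong"))

    # Rule 2 (strong): the second source disagrees with the prox trajectory.
    for ts, zone in other_source:
        expected = zone_at(visits, ts)
        if expected is not None and expected != zone:
            warnings.append((ts, "source conflict", "strong"))

    # Rule 3 (weak): the zone holds neither the employee's office nor a public area.
    for ts, zone in visits:
        if zone != office_zone and zone not in public_zones:
            warnings.append((ts, "unexpected zone", "weak"))

    return warnings


# Demo with three made-up zones in a row: Z1 - Z2 - Z3.
adjacency = {"Z1": {"Z2"}, "Z2": {"Z1", "Z3"}, "Z3": {"Z2"}}
visits = [(0, "Z1"), (100, "Z3")]
robot = [(50, "Z2")]
result = check_trajectory(
    visits, adjacency, office_zone="Z1", public_zones={"Z2"}, other_source=robot
)
```

In the demo the jump from `Z1` to `Z3` triggers rule 1, the robot sighting in `Z2` while the prox data still says `Z1` triggers rule 2, and the visit to `Z3` triggers the weak rule 3.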

SLIDE 19

System B: Trajectory Interface

[Figure labels: Spatial filter, Trajectory simulation, Gantt chart]

SLIDE 20

System B: Trajectory Interface

  • Data source
    • Fixed prox sensor - spatial uncertainty
    • Mobile prox sensor - temporal uncertainty
  • Uncertainty reduction
    • Robot detection
    • Stay in office
    • Stay in public areas (e.g. meeting room)
    • Stay in the corridor
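The slide does not say how the four uncertainty-reduction options are combined. One possible reading, sketched below purely as an assumption, is a priority cascade for placing an employee inside a prox zone: trust a robot detection if one exists, otherwise assume the employee's own office if the zone contains it, otherwise a public room in the zone, otherwise the corridor. The function name, the data layout, and the zone identifiers are hypothetical.

```python
def estimate_location(zone, zone_rooms, office, public_rooms, robot_position=None):
    """Guess where in a prox zone an employee is, and say which rule decided it.

    zone: prox zone reported by the fixed sensor.
    zone_rooms: {zone: set of room ids inside that zone} (assumed input).
    office: room id of the employee's office.
    public_rooms: set of room ids that count as public areas.
    robot_position: position from a robot detection, if one is available.
    """
    # 1. A robot detection gives an accurate position, so it takes priority.
    if robot_position is not None:
        return robot_position, "robot detection"

    rooms = zone_rooms.get(zone, set())

    # 2. If the zone contains the employee's office, assume they are in it.
    if office in rooms:
        return office, "office"

    # 3. Otherwise fall back to a public room in the zone, if there is one.
    shared = sorted(rooms & public_rooms)
    if shared:
        return shared[0], "public area"

    # 4. Nothing else applies: place the employee in the zone's corridor.
    return f"{zone} corridor", "corridor"


# Demo with made-up zones; room 2700 appears as a meeting room in Case 2.
zone_rooms = {"ZA": {"2700", "2710"}, "ZB": {"2600"}, "ZC": set()}
public_rooms = {"2700"}

in_office = estimate_location("ZB", zone_rooms, office="2600", public_rooms=public_rooms)
in_public = estimate_location("ZA", zone_rooms, office="2600", public_rooms=public_rooms)
in_corridor = estimate_location("ZC", zone_rooms, office="2600", public_rooms=public_rooms)
by_robot = estimate_location(
    "ZA", zone_rooms, office="2600", public_rooms=public_rooms, robot_position=(120, 45)
)
```

Returning the deciding rule alongside the position keeps the remaining uncertainty visible, so a view could draw a corridor guess differently from a robot-confirmed position.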

SLIDE 21

System B: Streaming Data Interface

SLIDE 22

System B: Streaming Data Interface

[Figure labels: Position-centric glyphs, Attribute-centric glyphs, Gantt chart]

SLIDE 23

Case 1: Shifts

SLIDE 24

Case 2: Meetings

  • Information technology and engineering meet at Mtg/Training (2700)
  • Facilities meet at Conf (2365)
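Meetings like these show up in trajectory data as several people staying in the same room at the same time. A minimal sketch of how such events could be extracted automatically, assuming each stay is a `(person, room, start, end)` tuple with times in seconds: sweep over arrivals and departures per room and report periods where enough people are present for long enough. The thresholds `min_people` and `min_duration` and the sample stays are assumptions for illustration, not values from the presentation.

```python
from collections import defaultdict


def find_meetings(stays, min_people=3, min_duration=600):
    """Return (room, start, end, attendees) for each detected meeting.

    stays: iterable of (person, room, start, end), times in seconds.
    A meeting starts when min_people are present in a room and ends when
    the head count drops below that; shorter gatherings are discarded.
    """
    events_by_room = defaultdict(list)
    for person, room, start, end in stays:
        events_by_room[room].append((start, 1, person))
        events_by_room[room].append((end, -1, person))

    meetings = []
    for room, events in events_by_room.items():
        # At equal times, process departures (-1) before arrivals (+1).
        events.sort(key=lambda e: (e[0], e[1]))
        present = set()
        began = None
        attendees = set()
        for t, delta, person in events:
            if delta == 1:
                present.add(person)
                if began is None and len(present) >= min_people:
                    began = t
                    attendees = set(present)
                elif began is not None:
                    attendees.add(person)  # late arrival joins the meeting
            else:
                present.discard(person)
                if began is not None and len(present) < min_people:
                    if t - began >= min_duration:
                        meetings.append((room, began, t, frozenset(attendees)))
                    began = None
    return meetings


# Demo with made-up people; the room numbers are the ones named on the slide.
stays = [
    ("a", "2700", 0, 3600),
    ("b", "2700", 100, 3500),
    ("c", "2700", 200, 3400),
    ("d", "2365", 0, 50),
]
meetings = find_meetings(stays)
```

In the demo, three people overlap in room 2700 from second 200 to second 3400, which yields one meeting, while the single short stay in room 2365 yields none.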

SLIDE 25

Case 3: Lost card

[Figure labels: Lost card used, New card]

SLIDE 26

Case 4: Server room downtime

  • Cooling setpoint goes up
  • Air temperature goes up
  • Equipment power goes down

SLIDE 27

Conclusion

  • STAD-HD: twin systems for specifying and visualizing patterns and anomalies in heterogeneous datasets
  • Labelling System for insight + Analysis System for answers
  • Drawbacks
    • Labelling System cannot automatically generate insights
    • Transformation from insight to design requirements and finally to implementation is not automatic
    • Analysis System reports many false positives

SLIDE 28

Acknowledgement

  • Funding
    • NSFC No. 61170204
    • NSFC Key Project No. 61232012
    • National Program on Key Basic Research Project (973 Program) No. 2015CB352500
  • Reviewers
    • Anonymous Reviewers

SLIDE 29

Thank You