Interactive Historical Data Analysis




SLIDE 1

Interactive Historical Data Analysis

Javi Carretero, Technical Architect
TrendMiner
04/06/19

SLIDE 2

Plan

❖ Introduce TrendMiner
❖ Discuss context & user needs
❖ TrendMiner 1.0
❖ TrendMiner 2.0 (with Apache Ignite)
❖ Challenges
❖ Future work

SLIDE 3

TrendMiner

Empower process and asset experts with advanced analytics to Analyse, Monitor and Predict the operational performance of batch, grade and continuous manufacturing processes. We democratise analytics by giving insights to the people who need answers: the engineers and operators in the plant.

Diagram labels: Data Science, Domain Expertise

SLIDE 4

TrendMiner

SLIDE 5

TrendMiner

❖ Time Series
❖ Descriptive Analytics
❖ Predictive Analytics
❖ Modelling

SLIDE 6

TrendMiner

Context & Scale
❖ > 300M points per time series
❖ 10-40K active time series
❖ Source of data is generally very slow!

User Expectations
❖ Time to first result < 1s
❖ Higher resolution
❖ More active time series
❖ More advanced analytics

Diagram labels: Responsiveness, Overall performance

SLIDE 7

TrendMiner 1.0

Focus on responsiveness (making TrendMiner more interactive)

SLIDE 8

TrendMiner 1.0

Diagram labels: TrendMiner, Source, File-based, Algorithms

❖ Streaming back to UI
❖ Fast for small queries
❖ Slow for big queries
❖ Not scalable (label appears twice in the diagram)

SLIDE 9

TrendMiner 2.0

Focus on performance (making TrendMiner more efficient)

SLIDE 10

TrendMiner 2.0 - Caching

Diagram: a Time Series running from t0 to tN along the time axis t is cut into Time Slices at t0, t1, t2, t3, t4, t5, t6, tN.

IgniteCache<Key, Data>
❖ Key: timeSeriesId, startDate (ti), endDate (tj)
❖ Data: [ts0, value0], [ts1, value1], [ts2, value2]
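
The key and data layout on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: the class names `SliceKey`, `DataPoint` and `SliceCacheSketch` are hypothetical, slices are assumed to be fixed-width and half-open, and a `HashMap` stands in for `IgniteCache<Key, Data>` so the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical cache key: one time slice [startMillis, endMillis) of one series. */
final class SliceKey {
    final String timeSeriesId;
    final long startMillis; // startDate (ti), inclusive
    final long endMillis;   // endDate (tj), exclusive (assumption)

    SliceKey(String timeSeriesId, long startMillis, long endMillis) {
        this.timeSeriesId = timeSeriesId;
        this.startMillis = startMillis;
        this.endMillis = endMillis;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof SliceKey)) return false;
        SliceKey k = (SliceKey) o;
        return timeSeriesId.equals(k.timeSeriesId)
                && startMillis == k.startMillis
                && endMillis == k.endMillis;
    }

    @Override
    public int hashCode() {
        return Objects.hash(timeSeriesId, startMillis, endMillis);
    }
}

/** One [ts, value] pair as shown on the slide. */
final class DataPoint {
    final long timestampMillis;
    final double value;

    DataPoint(long timestampMillis, double value) {
        this.timestampMillis = timestampMillis;
        this.value = value;
    }
}

final class SliceCacheSketch {
    // Plain map standing in for IgniteCache<Key, Data>.
    private final Map<SliceKey, List<DataPoint>> cache = new HashMap<>();
    private final long sliceMillis;

    SliceCacheSketch(long sliceMillis) {
        this.sliceMillis = sliceMillis;
    }

    /** Appends a point to the slice that covers its timestamp. */
    void put(String timeSeriesId, DataPoint point) {
        long start = Math.floorDiv(point.timestampMillis, sliceMillis) * sliceMillis;
        SliceKey key = new SliceKey(timeSeriesId, start, start + sliceMillis);
        cache.computeIfAbsent(key, k -> new ArrayList<>()).add(point);
    }

    /** Returns the cached points of one slice, or an empty list. */
    List<DataPoint> slice(String timeSeriesId, long sliceStart) {
        SliceKey key = new SliceKey(timeSeriesId, sliceStart, sliceStart + sliceMillis);
        return cache.getOrDefault(key, Collections.<DataPoint>emptyList());
    }

    /** Number of distinct slices currently cached. */
    int sliceCount() {
        return cache.size();
    }
}
```

With a one-hour slice width, two points in the first hour share one entry and a point in the second hour creates a new one, so a query only touches the slices its search window overlaps.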

SLIDE 11

TrendMiner 2.0 - Caching

Time Slices: S1 S2 S3 S4 S5 S6 S7

Example timestamp 2019-06-04 16:20:15.165, shown once per granularity:
❖ Point timestamp
❖ Slice by hour
❖ Slice by day
❖ Slice by month
❖ Slice by year

Diagram labels: Scalability, Performance
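
The slicing granularities listed here can be illustrated by truncating the example timestamp to the start of its slice. This is a minimal sketch; the enum `SliceGranularity` and its method `sliceStart` are hypothetical names, and the deck does not say how TrendMiner computes slice boundaries.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

/** Hypothetical slice granularities matching the labels on the slide. */
enum SliceGranularity {
    HOUR, DAY, MONTH, YEAR;

    /** Returns the start of the slice that contains the given point timestamp. */
    LocalDateTime sliceStart(LocalDateTime ts) {
        switch (this) {
            case HOUR:
                return ts.truncatedTo(ChronoUnit.HOURS);
            case DAY:
                return ts.truncatedTo(ChronoUnit.DAYS);
            case MONTH:
                // truncatedTo supports units up to DAYS, so step back to day 1 explicitly.
                return ts.truncatedTo(ChronoUnit.DAYS).withDayOfMonth(1);
            default: // YEAR
                return ts.truncatedTo(ChronoUnit.DAYS).withDayOfYear(1);
        }
    }
}

final class SliceGranularityDemo {
    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2019, 6, 4, 16, 20, 15, 165_000_000);
        for (SliceGranularity g : SliceGranularity.values()) {
            System.out.println(g + " -> " + g.sliceStart(ts));
        }
    }
}
```

A coarser granularity means fewer, larger cache entries per search window, which is presumably the trade-off the Scalability and Performance labels refer to.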

SLIDE 12

TrendMiner 2.0 - Affinity

Time Series X: S1 S2 S3 S4 S5 S6 S7
Time Series Y: S1 S2 S3 S4 S5 S6 S7

Slice Si is mapped to Nodej (all Time Series), e.g. 2019-06-04
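
The idea that a given slice lands on the same node for every time series can be sketched without Ignite. In Ignite itself, this kind of collocation is usually expressed through the cache key's affinity key (for example a key field annotated with `@AffinityKeyMapped`). The class `SliceAffinitySketch` and its modulo mapping are hypothetical stand-ins, not how Ignite's affinity function actually distributes partitions.

```java
import java.time.LocalDate;

/** Hypothetical stand-in for an affinity function keyed on the slice, not the series. */
final class SliceAffinitySketch {
    private final int nodeCount;

    SliceAffinitySketch(int nodeCount) {
        this.nodeCount = nodeCount;
    }

    /**
     * Returns the node index for a slice. The series id is deliberately ignored,
     * so the same day of every time series maps to the same node.
     */
    int nodeFor(String timeSeriesId, LocalDate sliceDay) {
        return (int) Math.floorMod(sliceDay.toEpochDay(), (long) nodeCount);
    }
}

final class SliceAffinityDemo {
    public static void main(String[] args) {
        SliceAffinitySketch affinity = new SliceAffinitySketch(4);
        LocalDate day = LocalDate.of(2019, 6, 4);
        System.out.println("X -> node " + affinity.nodeFor("X", day));
        System.out.println("Y -> node " + affinity.nodeFor("Y", day));
    }
}
```

Because both series resolve to the same node for 2019-06-04, a job that needs that day from several series can read all of them locally.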

SLIDE 13

TrendMiner 2.0 - Compute Grid

Existing algorithms: Chronological swipe

Scalable algorithms = IgniteCompute

  • A. Split search (affects N slices)
  • B. Scatter "jobs" = affinityCall
  • C. Chronological swipe (single slice)
  • D. Post-process partial results (e.g. merging)

Diagram: slices S1 S2 S3 S4 S5 S6 S7 (chronological) and S1 S3 S2 S6 S4 S7 S5 (scattered)

Legend: Data Point; Data Point (meeting criteria)
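
Steps A to D can be sketched with standard Java concurrency in place of the compute grid. This is only an illustration: an `ExecutorService` stands in for `IgniteCompute` and `affinityCall`, the class name `ScatterGatherSketch` is hypothetical, and the search criterion (value above a threshold) is an assumption, since the slides do not name one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ScatterGatherSketch {

    /** C. Chronological swipe over a single slice: keep the points meeting the criterion. */
    static List<Double> scanSlice(double[] slice, double threshold) {
        List<Double> hits = new ArrayList<>();
        for (double value : slice) {
            if (value > threshold) {
                hits.add(value);
            }
        }
        return hits;
    }

    /** A, B and D: one job per slice, run in parallel, merged back in slice order. */
    static List<Double> search(List<double[]> slices, double threshold) {
        // Thread pool standing in for the compute nodes.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // A + B: the search is split per slice and each job is scattered to the pool.
            List<Future<List<Double>>> futures = new ArrayList<>();
            for (double[] slice : slices) {
                Callable<List<Double>> job = () -> scanSlice(slice, threshold);
                futures.add(pool.submit(job));
            }
            // D: jobs may finish in any order; reading the futures in submission
            // order restores the chronological order of the partial results.
            List<Double> merged = new ArrayList<>();
            for (Future<List<Double>> future : futures) {
                merged.addAll(future.get());
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("slice job failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The existing single-slice algorithm is reused unchanged inside each job; only the splitting and merging around it are new, which matches the structure the slide describes.
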

SLIDE 14

TrendMiner 2.0 - Result

Diagram labels: TrendMiner, Source, Algorithms

❖ Streaming back to UI
❖ Fast for small queries
❖ Fast for big queries
❖ Scalable (label appears twice in the diagram)

SLIDE 15

Challenge - Multi-level Prioritisation

Search dimensions
❖ Time Series (single vs multiple series)
❖ Search window (single vs multiple time slices)
❖ Algorithm (visualisation vs descriptive vs predictive analytics)

Diagram labels: CPU usage, Ignite jobs, Urgency

SLIDE 16

Challenge - Multi-level Prioritisation

Ignite Capabilities
❖ PriorityQueueCollisionSpi (grid.task.priority) = 1 dimension!

MultiLevelPriorityQueueCollisionSpi (custom implementation)
❖ Still use grid.task.priority
❖ Priority degression factor = #(Time Series) × #(Time Slices)
❖ Urgency via "Service Levels" (0...N) = 2nd dimension
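
The two-dimensional ordering described here can be sketched with a plain `PriorityQueue`. This is not the custom collision SPI from the talk: `QueuedJob` and `MultiLevelQueueSketch` are hypothetical names, and dividing the base priority by the degression factor is an assumption, because the slides give the factor but not how it is applied.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical queued job carrying both prioritisation dimensions. */
final class QueuedJob {
    final String name;
    final int serviceLevel;    // urgency: 0 is the most urgent level
    final double basePriority; // plays the role of grid.task.priority
    final int seriesCount;     // #(Time Series) touched by the search
    final int sliceCount;      // #(Time Slices) touched by the search

    QueuedJob(String name, int serviceLevel, double basePriority,
              int seriesCount, int sliceCount) {
        this.name = name;
        this.serviceLevel = serviceLevel;
        this.basePriority = basePriority;
        this.seriesCount = seriesCount;
        this.sliceCount = sliceCount;
    }

    /** Assumed formula: the base priority is divided by the degression factor. */
    double effectivePriority() {
        double degressionFactor = (double) seriesCount * sliceCount;
        return basePriority / degressionFactor;
    }
}

final class MultiLevelQueueSketch {
    /** Lower service level first; within a level, higher effective priority first. */
    static final Comparator<QueuedJob> ORDER = (a, b) -> {
        int byLevel = Integer.compare(a.serviceLevel, b.serviceLevel);
        if (byLevel != 0) {
            return byLevel;
        }
        return Double.compare(b.effectivePriority(), a.effectivePriority());
    };

    static PriorityQueue<QueuedJob> newQueue() {
        return new PriorityQueue<>(ORDER);
    }
}
```

Under these assumptions a Service Level 0 job always runs before any Service Level 1 job, and among historical searches the one spanning fewer series and slices is picked first.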

SLIDE 17

Multi-level Prioritisation (Example)

Diagram labels: Job queue, Compute node

SLIDE 18

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

❖ Historical search = Service Level 1
❖ Queued tasks have a priority (degression factor)

SLIDE 19

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

❖ New computation (max urgency) = Service Level 0

SLIDE 20

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

❖ New computation (max urgency) = Service Level 0

SLIDE 21

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

SLIDE 22

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

SLIDE 23

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

❖ Historical search = Service Level 1

SLIDE 24

Multi-level Prioritisation (Example)

Diagram labels: Compute node, Compute thread, Job queue

❖ Historical search = Service Level 1
❖ First task > priority (no degression factor applied)

SLIDE 25

Future Work

❖ Improve scheduling efficiency (predictable job runtime)
❖ Prevent job starvation (e.g. job-stealing SPIs)
❖ Make all algorithms scalable
❖ Pave way for Ignite Native Persistence

SLIDE 26

Thank you!

javi.carretero@trendminer.com