Interactive Historical Data Analysis
Javi Carretero, Technical Architect, TrendMiner
04/06/19

Plan
❖ Introduce TrendMiner
❖ Discuss context & user needs
❖ TrendMiner 1.0
❖ TrendMiner 2.0 (with Apache Ignite)
❖ Challenges
❖ Future work

TrendMiner
Empower process and asset experts with advanced analytics to Analyse, Monitor and Predict the operational performance of batch, grade and continuous manufacturing processes.
We democratise analytics by giving insights to the people who need answers: the engineers and operators in the plant.
Diagram labels: Data Science, Domain Expertise

TrendMiner

TrendMiner
Diagram labels: Descriptive Analytics, Time Series, Predictive Analytics, Modelling

TrendMiner
Context & Scale
❖ > 300M points per time series
❖ 10-40K active time series
❖ Source of data is generally very slow!
User Expectations
Responsiveness
❖ Time to first result < 1s
Overall performance
❖ Higher resolution
❖ More active time series
❖ More advanced analytics

TrendMiner 1.0 Focus on responsiveness (making TrendMiner more interactive)

TrendMiner 1.0
Diagram labels: TrendMiner, Source, File-based
Streaming back to UI
❖ Fast for small queries
❖ Not scalable
Algorithms
❖ Slow for big queries
❖ Not scalable

TrendMiner 2.0 Focus on performance (making TrendMiner more efficient)

TrendMiner 2.0 - Caching
Time Series: t0 ... tN
Time Slices: t0, t1, t2, t3, t4, t5, t6, ... tN
IgniteCache<Key, Data>
Key: startDate (t_i), endDate (t_j), timeSeriesId
Data: [ts0, value0], [ts1, value1], [ts2, value2]
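
The cache layout on this slide can be sketched in plain Python. This is a minimal illustration, not TrendMiner's implementation: `SliceKey`, `group_into_slices`, the fixed one-day slice length and the half-open [start, end) convention are all assumptions, and an ordinary dict stands in for `IgniteCache<Key, Data>`.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SliceKey:
    """Hypothetical cache key: one time slice of one time series."""
    time_series_id: str
    start_date: datetime  # slice start (t_i), assumed inclusive
    end_date: datetime    # slice end (t_j), assumed exclusive


def group_into_slices(time_series_id, points, slice_length=timedelta(days=1)):
    """Group (timestamp, value) points into fixed-length slices keyed by SliceKey."""
    epoch = datetime(1970, 1, 1)
    cache = defaultdict(list)  # stand-in for IgniteCache<Key, Data>
    for ts, value in sorted(points):
        # Floor the timestamp to the start of the slice that contains it.
        index = (ts - epoch) // slice_length
        start = epoch + index * slice_length
        key = SliceKey(time_series_id, start, start + slice_length)
        cache[key].append((ts, value))
    return dict(cache)
```

Each cache entry then holds the chronologically ordered points of exactly one slice, so a query over a time window only needs the entries whose keys overlap that window.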

TrendMiner 2.0 - Caching
Time Slices: S1, S2, S3, S4, S5, S6, S7
Point timestamp: 2019-06-04 16:20:15.165
❖ Slice by hour
❖ Slice by day
❖ Slice by month
❖ Slice by year
Trade-off labels: Scalability, Performance
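
As an illustration of the granularity choice, the sketch below maps the slide's example timestamp to the start of its slice for each granularity. The function name `slice_start` and the string granularity names are hypothetical. The slide does not say which granularity TrendMiner uses.

```python
from datetime import datetime


def slice_start(ts, granularity):
    """Return the start of the slice containing ts for the given granularity."""
    if granularity == "hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if granularity == "month":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if granularity == "year":
        return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unknown granularity: {granularity}")
```

For the point 2019-06-04 16:20:15.165, slicing by hour gives a slice starting at 2019-06-04 16:00, and slicing by year gives one starting at 2019-01-01. Coarser slices mean fewer, larger cache entries, and finer slices mean more, smaller ones.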

TrendMiner 2.0 - Affinity
Time Series X: S1, S2, S3, S4, S5, S6, S7
Time Series Y: S1, S2, S3, S4, S5, S6, S7
Node j: S_i (all Time Series), e.g. 2019-06-04
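
The collocation idea here (the same slice of every time series lives on the same node) can be sketched as follows. This is an assumption-laden stand-in: `node_for` is a hypothetical helper, and the CRC32-modulo mapping is a deliberate simplification, not Ignite's actual affinity function. In Ignite, collocation is normally expressed through an affinity key. The slides do not say which mechanism TrendMiner uses.

```python
import zlib
from datetime import date


def node_for(time_series_id, slice_day, node_count):
    """Pick a node index from the slice alone.

    time_series_id is deliberately ignored, so slice S_i of every
    time series maps to the same node.
    """
    affinity_key = slice_day.isoformat().encode("utf-8")
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(affinity_key) % node_count
```

With this mapping, `node_for("X", date(2019, 6, 4), 4)` and `node_for("Y", date(2019, 6, 4), 4)` return the same node index, so a computation that touches several series over one slice can run where all of its data is stored.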

TrendMiner 2.0 - Compute Grid
Legend: Data Point; Data Point (meeting criteria)
Existing algorithms: Chronological swipe over S1, S2, S3, S4, S5, S6, S7
Scalable algorithms = IgniteCompute
Slice groups in the diagram: (S1, S4), (S3, S7), (S2, S5), (S6)
A. Split search (affects N slices)
B. Scatter "jobs" = affinityCall
C. Chronological swipe (single slice)
D. Post-process partial results (e.g. merging)
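
Steps A to D can be sketched in plain Python. This is only an illustration under stated assumptions: a local thread pool stands in for `IgniteCompute` and `affinityCall`, the names `sweep_slice` and `search` are hypothetical, and the "criteria" is reduced to a predicate on the point value.

```python
from concurrent.futures import ThreadPoolExecutor


def sweep_slice(points, predicate):
    """C. Chronological pass over a single slice: keep points meeting the criteria."""
    return [(ts, value) for ts, value in sorted(points) if predicate(value)]


def search(slices, predicate, max_workers=4):
    """Run a search over slices, a dict of slice id -> list of (timestamp, value)."""
    # A. Split the search: one job per slice it touches.
    slice_ids = sorted(slices)
    # B. Scatter the jobs. The thread pool is a local stand-in for sending
    #    each job to the node that owns the slice.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(
            pool.map(lambda sid: sweep_slice(slices[sid], predicate), slice_ids)
        )
    # D. Post-process: merge the partial results into chronological order.
    merged = [point for partial in partials for point in partial]
    merged.sort()
    return merged
```

Because each job only reads one slice, the jobs are independent and the merge step is the only place where results from different slices meet.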

TrendMiner 2.0 - Result
Diagram labels: TrendMiner, Source
Streaming back to UI
❖ Fast for small queries
❖ Scalable
Algorithms
❖ Fast for big queries
❖ Scalable

Challenge - Multi-level Prioritisation
Search dimensions
❖ Time Series (single vs multiple series)
❖ Search window (single vs multiple time slices)
❖ Algorithm (visualisation vs descriptive vs predictive analytics)
Side labels: CPU usage, Ignite jobs, Urgency

Challenge - Multi-level Prioritisation
Ignite Capabilities
❖ PriorityQueueCollisionSpi (grid.task.priority) = 1 dimension!
MultiLevelPriorityQueueCollisionSpi (custom implementation)
❖ Still use grid.task.priority
❖ Priority degression factor = #(Time Series) × #(Time Slices)
❖ Urgency via "Service Levels" (0...N) = 2nd dimension
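
A two-dimensional ordering of this kind can be sketched with a heap. This is not the custom SPI from the talk. It rests on assumptions: `MultiLevelQueue` and its methods are hypothetical, the degression factor is assumed to divide the base priority (the slides do not say how it is applied), and lower service levels are assumed to run first, as the example slides suggest for Service Level 0.

```python
import heapq
import itertools


class MultiLevelQueue:
    """Order jobs by service level first, then by degressed priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: FIFO among equal jobs

    def submit(self, name, service_level, base_priority, series_count, slice_count):
        # Degression factor = #(Time Series) x #(Time Slices), as on the slide.
        degression = series_count * slice_count
        # Assumption: the factor divides the priority, so wide searches sink.
        effective = base_priority / degression
        # heapq is a min-heap: low service level first, then high priority.
        entry = (service_level, -effective, next(self._counter), name)
        heapq.heappush(self._heap, entry)

    def next_job(self):
        """Pop the name of the job that should run next."""
        return heapq.heappop(self._heap)[3]
```

For example, after submitting a wide historical search (level 1, 4 series × 7 slices), a narrow historical search (level 1, 1 × 1) and a new computation (level 0), this queue returns the new computation first, then the narrow search, then the wide one.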

Multi-level Prioritisation (Example)
Diagram elements: Compute node, Compute thread, Job queue
❖ Historical search = Service Level 1; queued tasks have a priority (degression factor)
❖ New computation (max urgency) = Service Level 0
❖ Historical search = Service Level 1; first task > priority (no degression factor applied)

Future Work
❖ Improve scheduling efficiency (predictable job runtime)
❖ Prevent job starvation (e.g. job-stealing SPIs)
❖ Make all algorithms scalable
❖ Pave way for Ignite Native Persistence

Thank you! javi.carretero@trendminer.com
