Interactive Historical Data Analysis

Presentation by Javi Carretero, Technical Architect, TrendMiner (04/06/19). Topics: introducing TrendMiner, context & user needs, TrendMiner 1.0, TrendMiner 2.0 (with Apache Ignite), challenges.


  1. Interactive Historical Data Analysis Javi Carretero, Technical Architect 
 TrendMiner 
 04/06/19

  2. Plan
 ❖ Introduce TrendMiner
 ❖ Discuss context & user needs
 ❖ TrendMiner 1.0
 ❖ TrendMiner 2.0 (with Apache Ignite)
 ❖ Challenges
 ❖ Future work

  3. TrendMiner
 Empower process and asset experts with advanced analytics to Analyse, Monitor and Predict the operational performance of batch, grade and continuous manufacturing processes.
 We democratise analytics by giving insights to the people who need answers: the engineers and operators in the plant.
 Diagram labels: Data Science, Domain Expertise

  4. TrendMiner

  5. TrendMiner
 Diagram labels: Descriptive Analytics, Time Series, Predictive Analytics, Modelling

  6. TrendMiner
 Context & Scale
 ❖ > 300M points per time series
 ❖ 10-40K active time series
 ❖ Source of data is generally very slow!
 User Expectations
 Responsiveness
 ❖ Time to first result < 1s
 Overall performance
 ❖ Higher resolution
 ❖ More active time series
 ❖ More advanced analytics

  7. TrendMiner 1.0 Focus on responsiveness (making TrendMiner more interactive)

  8. TrendMiner 1.0
 Diagram labels: Source, TrendMiner, File-based, Streaming back to UI, Algorithms
 Diagram annotations: Not scalable (twice), Fast for small queries, Slow for big queries

  9. TrendMiner 2.0 Focus on performance (making TrendMiner more efficient)

  10. TrendMiner 2.0 - Caching
 Diagram: a time series running from t0 to tN is cut into time slices (t0, t1, ..., tN).
 IgniteCache<Key, Data>
 Key: startDate (t_i), endDate (t_j), timeSeriesId
 Data: [ts0, value0], [ts1, value1], [ts2, value2], ...

  11. TrendMiner 2.0 - Caching
 Time Slices: S1 ... S7
 Point timestamp: 2019-06-04 16:20:15.165
 Slice options for this timestamp: by hour, by day, by month, by year
 Diagram axis labels: Scalability, Performance
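
 The slicing idea on the two caching slides can be sketched in plain Java. This is only an illustration under assumptions: the class `TimeSlices`, the `Granularity` enum and the `SliceKey` type are hypothetical names, and the key fields simply mirror the ones listed on slide 10 (startDate, endDate, timeSeriesId). It shows how a point timestamp might be mapped to the slice that contains it for each granularity on slide 11.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper: maps a point timestamp to the time slice containing it. */
final class TimeSlices {
    enum Granularity { HOUR, DAY, MONTH, YEAR }

    /** Hypothetical cache key mirroring the fields named on slide 10. */
    static final class SliceKey {
        final String timeSeriesId;
        final LocalDateTime startDate; // inclusive
        final LocalDateTime endDate;   // exclusive (an assumption of this sketch)

        SliceKey(String timeSeriesId, LocalDateTime startDate, LocalDateTime endDate) {
            this.timeSeriesId = timeSeriesId;
            this.startDate = startDate;
            this.endDate = endDate;
        }
    }

    /** Start of the slice that contains the given timestamp. */
    static LocalDateTime sliceStart(LocalDateTime ts, Granularity g) {
        switch (g) {
            case HOUR:  return ts.truncatedTo(ChronoUnit.HOURS);
            case DAY:   return ts.truncatedTo(ChronoUnit.DAYS);
            case MONTH: return ts.truncatedTo(ChronoUnit.DAYS).withDayOfMonth(1);
            default:    return ts.truncatedTo(ChronoUnit.DAYS).withDayOfYear(1); // YEAR
        }
    }

    /** End of the slice that starts at the given instant. */
    static LocalDateTime sliceEnd(LocalDateTime start, Granularity g) {
        switch (g) {
            case HOUR:  return start.plusHours(1);
            case DAY:   return start.plusDays(1);
            case MONTH: return start.plusMonths(1);
            default:    return start.plusYears(1); // YEAR
        }
    }

    /** Builds the key of the slice holding the point at ts for one time series. */
    static SliceKey keyFor(String timeSeriesId, LocalDateTime ts, Granularity g) {
        LocalDateTime start = sliceStart(ts, g);
        return new SliceKey(timeSeriesId, start, sliceEnd(start, g));
    }
}
```

 Smaller slices give more keys to spread across nodes, while larger slices mean fewer cache lookups per query, which is presumably the trade-off the Scalability and Performance labels on slide 11 refer to.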

  12. TrendMiner 2.0 - Affinity
 Diagram: Time Series X and Time Series Y are each cut into the same slices S1 ... S7.
 Node j holds slice S_i of all time series (e.g. 2019-06-04).
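
 The affinity rule on slide 12 (one slice of all time series lives on the same node) can be illustrated with a toy placement function. This is a sketch, not Ignite's actual mapping: Ignite maps an affinity key to a partition and the partition to a node, and a key field can be designated as the affinity key with the `@AffinityKeyMapped` annotation. The class `SliceAffinity` and the modulo placement below are assumptions made only to show that the node choice depends on the slice and not on the time series.

```java
import java.time.LocalDate;

/** Hypothetical stand-in for slice-based affinity (not Ignite's real affinity function). */
final class SliceAffinity {
    /**
     * Picks a node index for a (time series, slice) pair.
     * The timeSeriesId is deliberately ignored, so the same slice of every
     * time series lands on the same node.
     */
    static int nodeFor(String timeSeriesId, LocalDate sliceDate, int nodeCount) {
        return (int) Math.floorMod(sliceDate.toEpochDay(), (long) nodeCount);
    }
}
```

 With this placement, a computation that needs slice 2019-06-04 of both Time Series X and Time Series Y can run on one node without moving data between nodes.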

  13. TrendMiner 2.0 - Compute Grid
 Existing algorithms: chronological swipe over the data points, collecting the data points that meet the criteria.
 Scalable algorithms = IgniteCompute
 A. Split search (affects N slices)
 B. Scatter "jobs" = affinityCall
 C. Chronological swipe (single slice)
 D. Post-process partial results (e.g. merging)
 Diagram: slices S1 ... S7 distributed over the compute nodes
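
 Steps A to D on slide 13 follow a scatter and gather pattern, which can be sketched with a local thread pool. This is a stand-in under assumptions: a plain `ExecutorService` replaces `IgniteCompute` and `affinityCall`, the class `SlicedSearch` is a hypothetical name, and each slice is modelled as a `TreeMap` from timestamp to value. The slices are assumed to be passed in chronological order so that the merged result stays chronological.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.DoublePredicate;

/** Hypothetical local model of the split, scatter, swipe and merge steps. */
final class SlicedSearch {
    /** C. Chronological swipe over a single slice: timestamps of points meeting the criterion. */
    static List<Long> swipe(TreeMap<Long, Double> slice, DoublePredicate criterion) {
        List<Long> hits = new ArrayList<>();
        for (Map.Entry<Long, Double> point : slice.entrySet()) { // TreeMap iterates in key order
            if (criterion.test(point.getValue())) {
                hits.add(point.getKey());
            }
        }
        return hits;
    }

    /** A, B and D: one job per slice, run in parallel, partial results merged in slice order. */
    static List<Long> search(List<TreeMap<Long, Double>> slices, DoublePredicate criterion) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // stand-in for the compute grid
        try {
            // A + B: the search is split per slice and each part is submitted as a job.
            List<Future<List<Long>>> partials = new ArrayList<>();
            for (TreeMap<Long, Double> slice : slices) {
                Callable<List<Long>> job = () -> swipe(slice, criterion);
                partials.add(pool.submit(job));
            }
            // D: post-process by concatenating the partial results in slice order.
            List<Long> merged = new ArrayList<>();
            for (Future<List<Long>> partial : partials) {
                merged.addAll(partial.get());
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("slice job failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

 In the real system each job would presumably run on the node that owns the slice, so the swipe reads local cache data only. Here the thread pool just makes the control flow visible.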

  14. TrendMiner 2.0 - Result
 Diagram labels: Source, TrendMiner, Streaming back to UI, Algorithms
 Diagram annotations: Scalable (twice), Fast for small queries, Fast for big queries

  15. Challenge - Multi-level Prioritisation
 Search dimensions
 ❖ Time Series (single vs multiple series)
 ❖ Search window (single vs multiple time slices)
 ❖ Algorithm (visualisation vs descriptive vs predictive analytics)
 Diagram annotations: CPU usage, Ignite jobs, Urgency

  16. Challenge - Multi-level Prioritisation
 Ignite Capabilities
 ❖ PriorityQueueCollisionSpi (grid.task.priority) = 1 dimension!
 MultiLevelPriorityQueueCollisionSpi (custom implementation)
 ❖ Still use grid.task.priority
 ❖ Priority degression factor = #(Time Series) x #(Time Slices)
 ❖ Urgency via "Service Levels" (0...N) = 2nd dimension
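
 The two-dimensional ordering described on slide 16 can be modelled with a plain Java priority queue. This is one possible reading of the slide, not the actual MultiLevelPriorityQueueCollisionSpi: the class `MultiLevelPriority`, the `Job` type and the formula (base priority divided by the degression factor) are assumptions. Service level 0 is treated as most urgent, and the `firstTask` flag models the note on slide 24 that the first task of a search gets no degression factor.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical model of a job queue ordered by service level, then by degressed priority. */
final class MultiLevelPriority {
    static final class Job {
        final String name;
        final int serviceLevel;          // 0 = most urgent (assumption)
        final double effectivePriority;  // higher runs first within a service level

        Job(String name, int serviceLevel, int basePriority,
            int seriesCount, int sliceCount, boolean firstTask) {
            this.name = name;
            this.serviceLevel = serviceLevel;
            // Degression factor = #(Time Series) x #(Time Slices); skipped for a first task.
            int degression = firstTask ? 1 : seriesCount * sliceCount;
            this.effectivePriority = (double) basePriority / degression;
        }
    }

    /** Lower service level first, then higher effective priority first. */
    static final Comparator<Job> ORDER =
        Comparator.comparingInt((Job j) -> j.serviceLevel)
                  .thenComparing(Comparator.comparingDouble((Job j) -> j.effectivePriority).reversed());

    static PriorityQueue<Job> newQueue() {
        return new PriorityQueue<>(ORDER);
    }
}
```

 Under these assumptions a broad historical search (many series and slices) sinks behind narrow searches of the same service level, while any service level 0 computation jumps ahead of all queued service level 1 jobs, which matches the sequence shown in the example slides.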

  17. Multi-level Prioritisation (Example)
 Diagram: Compute node, Job queue

  18. Multi-level Prioritisation (Example)
 Historical search = Service Level 1
 Queued tasks have a priority (degression factor)
 Diagram: Compute node, Compute thread, Job queue

  19. Multi-level Prioritisation (Example)
 New computation (max urgency) = Service Level 0
 Diagram: Compute node, Compute thread, Job queue

  20. Multi-level Prioritisation (Example)
 New computation (max urgency) = Service Level 0
 Diagram: Compute node, Compute thread, Job queue

  21. Multi-level Prioritisation (Example)
 Diagram: Compute node, Compute thread, Job queue

  22. Multi-level Prioritisation (Example)
 Diagram: Compute node, Compute thread, Job queue

  23. Multi-level Prioritisation (Example)
 Historical search = Service Level 1
 Diagram: Compute node, Compute thread, Job queue

  24. Multi-level Prioritisation (Example)
 Historical search = Service Level 1
 First task > priority (no degression factor applied)
 Diagram: Compute node, Compute thread, Job queue

  25. Future Work
 ❖ Improve scheduling efficiency (predictable job runtime)
 ❖ Prevent job starvation (e.g. job-stealing SPIs)
 ❖ Make all algorithms scalable
 ❖ Pave way for Ignite Native Persistence

  26. Thank you! javi.carretero@trendminer.com
