SLIDE 1

Quantifying and Detecting Incidents in IoT Big Data Analytics

Hong-Linh Truong

Faculty of Informatics, TU Wien, Austria
hong-linh.truong@tuwien.ac.at
http://rdsea.github.io

Acknowledgment: based on many discussions with Manfred Halper and our industrial partners.
Note: ongoing work.

Dagstuhl Seminar 17441, Big Stream Processing Systems, 31 Oct, 2017. (c) Hong-Linh Truong 1

SLIDE 2

Case Study: Base Transceiver Station (BTS)

  • Large-scale systems (1K+ BTS)
  • Flexible back-end clouds
  • Generic enough for other applications (e.g., in smart agriculture)
  • Poor infrastructure for IoT and connectivity

[Figure: BTS case-study architecture. Labels shown: Sensor, Actuator, IoT Gateway, MQTT Broker, Optimizer, Analytics, BigQuery, InfluxDB, Hadoop FS, G. Storage, public cloud infrastructures, private cloud infrastructures.]

SLIDE 3

Challenges

The ultimate goal of the (domain) data scientist is to meet Quality of Analytics (QoA).

QoA: cost, performance (response time), quality of data (up-to-dateness, accuracy)

(Remember Christoph Quix’s talk about quality)

But there are many interactions that might cause incidents


Hong-Linh Truong, Aitor Murguzur, Erica Yang: Challenges in Enabling Quality of Analytics in the Cloud. ACM JDIQ, Challenge paper, 2017.
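
The QoA dimensions named on this slide (cost, performance, up-to-dateness, accuracy) can be turned into a simple target check. The following is only a minimal sketch: the class names, field names, units and thresholds are assumptions made for illustration and do not come from the talk or the cited paper.

```python
from dataclasses import dataclass


@dataclass
class QoAMeasurement:
    """One observed analytics run (all fields are hypothetical)."""
    cost: float             # monetary cost of the run
    response_time_s: float  # performance: end-to-end response time
    data_age_s: float       # quality of data: up-to-dateness
    accuracy: float         # quality of data: accuracy in [0, 1]


@dataclass
class QoATarget:
    """Targets the data scientist wants to meet (hypothetical)."""
    max_cost: float
    max_response_time_s: float
    max_data_age_s: float
    min_accuracy: float


def qoa_violations(m: QoAMeasurement, t: QoATarget) -> list[str]:
    """Return the QoA dimensions on which the run misses its target."""
    violations = []
    if m.cost > t.max_cost:
        violations.append("cost")
    if m.response_time_s > t.max_response_time_s:
        violations.append("performance")
    if m.data_age_s > t.max_data_age_s:
        violations.append("up-to-dateness")
    if m.accuracy < t.min_accuracy:
        violations.append("accuracy")
    return violations


target = QoATarget(max_cost=5.0, max_response_time_s=2.0,
                   max_data_age_s=60.0, min_accuracy=0.95)
run = QoAMeasurement(cost=3.2, response_time_s=2.7,
                     data_age_s=45.0, accuracy=0.91)
print(qoa_violations(run, target))  # ['performance', 'accuracy']
```

A non-empty result could be treated as a candidate incident that is then traced back to the interactions in the pipeline.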

SLIDE 4

[Figure: architecture of the BTS monitoring analytics system. Labels shown: BTS Monitoring (SFTP), BTS Monitoring (MQTT), Apache Nifi, RabbitMQ, Ingestion Service, Big data storage (Hadoop FS/Google Storage), BigQuery, Apache Spark, Streaming Data Processing, BatchAnalytics Manager, Planner, Analytics Service, Analytics Web Service, Enrichment Service, ElasticSearch, Kibana Visualization, Web services, Client. Edge labels: data, notification, analytics, analytics results.]

Problem 1: the complexity of software stacks and subsystems

Source: simplified version of the design from I & A Computing Lab, VN (www.inacomputing.com)

SLIDE 5

Problem 2: the complexity of the underlying virtual computing and network infrastructures

  • Heavily based on virtual resources
  • IoT, Network functions and Clouds
  • (Remember Manfred Hauswirth’s talk yesterday about fog/edge computing and NFV/5G networks)


IoT Big Data Analytics

The SINC Concept: http://sincconcept.github.io

SLIDE 6

Problem 3: Elasticity Management


Tien-Dung Nguyen, Hong Linh Truong, Georgiana Copil, Duc-Hung Le, Daniel Moldovan, Schahram Dustdar: On Developing and Operating of Data Elasticity Management Process. ICSOC 2015: 105-119

SLIDE 7

Our ideas for incident monitoring and analytics

  • Classification of incidents: to quantify incidents and identify possible data sources, monitoring techniques and analytics.
  • Measurement/Instrumentation: to provide mechanisms for measurement and data collection for incidents.
  • Incident analytics: to find out the root cause and dependencies of incidents.


Hong Linh Truong, Manfred Halper: Classifying Incidents in Cloud-based IoT Big Data Analytics, Working paper, 2017.
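
One way to picture the incident analytics step is a dependency graph over pipeline components: a component with an incident is a root-cause candidate if none of the components it depends on, directly or transitively, also has an incident. The sketch below is an assumption made for illustration, not the method of the working paper; the component names and the `depends_on` structure are hypothetical, and the graph is assumed to be acyclic.

```python
def upstream(component: str, depends_on: dict[str, set[str]]) -> set[str]:
    """All components that `component` depends on, directly or transitively."""
    seen: set[str] = set()
    stack = list(depends_on.get(component, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(depends_on.get(node, ()))
    return seen


def root_cause_candidates(depends_on: dict[str, set[str]],
                          failing: set[str]) -> set[str]:
    """Failing components with no failing upstream dependency."""
    return {c for c in failing if not (upstream(c, depends_on) & failing)}


# Hypothetical pipeline: each component maps to the components it depends on.
depends_on = {
    "analytics": {"storage", "broker"},
    "storage": {"broker"},
    "broker": {"gateway"},
    "gateway": {"sensor"},
}
failing = {"analytics", "storage", "broker"}
print(root_cause_candidates(depends_on, failing))  # {'broker'}
```

Here the incidents at the analytics and storage components are explained by the broker incident, so only the broker remains as a candidate. In a real deployment the dependency graph itself is hard to obtain, which is part of why dependency analysis is not trivial.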

SLIDE 8

[Figure: simplified IoT big data analytics pipeline. Labels shown: IoT Sensor, IoT Gateway, Message Broker/Data Logistics Service, Analysis/Transformation Task (twice), Data Storage, Resulting analytics, ….]

  • Large number of data sources (e.g., IoT devices)
  • Large-scale brokers & data transfer/logistics services
  • Complex big data processing frameworks
  • Other systems in the pipeline

W3H: what, when, where and how for incidents

Too complex with many types of software. Can we have a simplified taxonomy for mapping incidents?


Hong Linh Truong, Manfred Halper: Classifying Incidents in Cloud-based IoT Big Data Analytics, Working paper, 2017.
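
The W3H idea can be sketched as a small incident record whose `where` field is mapped onto the four pipeline categories listed on this slide. Everything concrete here is hypothetical: the `Incident` fields beyond the four W3H names, the `CATEGORY_OF` mapping and the sample incidents are invented for illustration and are not the taxonomy of the working paper.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical mapping from concrete software to the slide's four categories.
CATEGORY_OF = {
    "iot-gateway": "data sources",
    "mqtt-broker": "brokers & data logistics",
    "rabbitmq": "brokers & data logistics",
    "spark": "processing frameworks",
}
DEFAULT_CATEGORY = "other systems in the pipeline"


@dataclass(frozen=True)
class Incident:
    what: str       # what happened
    when: datetime  # when it was observed
    where: str      # which component it was observed at
    how: str        # how it was detected or how it manifested


def incidents_per_category(incidents: list[Incident]) -> Counter:
    """Count incidents per pipeline category, using `where` as the key."""
    return Counter(CATEGORY_OF.get(i.where, DEFAULT_CATEGORY) for i in incidents)


now = datetime(2017, 10, 31, 9, 0, tzinfo=timezone.utc)
incidents = [
    Incident("message loss", now, "mqtt-broker", "count mismatch"),
    Incident("queue overflow", now, "rabbitmq", "queue length alarm"),
    Incident("late data", now, "iot-gateway", "timestamp check"),
    Incident("index unavailable", now, "elasticsearch", "health check"),
]
counts = incidents_per_category(incidents)
print(counts["brokers & data logistics"])  # 2
```

Mapping many software types onto a handful of categories is what keeps the per-category view small, at the cost of hiding product-specific detail.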

SLIDE 9

[Figure: points of data collection for incident detection. Component labels shown: Data Source (three times), Data Broker, Analysis Task, ML Library, Data Storage, VM/Container (Provider i), VM/Container (Provider j), VM/Container (Provider k), …. Data labels shown: original data, data in motion, in-processing data, data at rest, resulting data.]

  • Large number of data sources (e.g., IoT)
  • Large-scale brokers and storage
  • Complex big data processing frameworks and ML applications (e.g., Spark)
  • Other systems in the pipeline

Monitoring and Analytics

  • Monitoring: not just fast, distributed and cross-layer monitoring; it is hard to collect some incident-related data for quality of data.
  • Analytics: will be based on big data principles with ML, but dependency analysis is not trivial.
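
As a rough illustration of using several collection points for incident detection, the sketch below compares record counts observed at consecutive points in the same time window and reports where records go missing. The point names, the window data and the `tolerance` parameter are assumptions; the sketch also assumes every stage forwards one record per input record, which does not hold for stages that aggregate or filter.

```python
# Collection points in pipeline order (names are hypothetical).
POINTS = ["data source", "data broker", "analysis task", "data storage"]


def detect_losses(counts: dict[str, int],
                  tolerance: float = 0.0) -> list[tuple[str, str, int]]:
    """Report (upstream, downstream, missing) for each hop that loses
    more than `tolerance` (a fraction) of the upstream records."""
    losses = []
    for up, down in zip(POINTS, POINTS[1:]):
        missing = counts[up] - counts[down]
        if missing > tolerance * counts[up]:
            losses.append((up, down, missing))
    return losses


# Record counts per collection point for one time window (invented data).
window = {
    "data source": 1000,
    "data broker": 1000,
    "analysis task": 940,
    "data storage": 940,
}
print(detect_losses(window, tolerance=0.01))
# [('data broker', 'analysis task', 60)]
```

Counting is the easy case. Quality-of-data measures such as accuracy usually need domain knowledge at each point, which is one reason such incident data is hard to collect.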

SLIDE 10

Thanks for your attention!

Hong-Linh Truong
Faculty of Informatics, TU Wien
rdsea.github.io
