SLIDE 1

Investigating the Use of Bayesian Networks in the Hora Approach for Component-based Online Failure Prediction

Teerat Pitakrat, André van Hoorn

University of Stuttgart Institute of Software Technology (ISTE) Reliable Software Systems (RSS) Group Stuttgart, Germany

Nov 27, 2014 @ SOSP 2014, Stuttgart

  • T. Pitakrat, A. van Hoorn

Investigating the Use of Bayesian Networks in Hora for C-B OFP

  • Nov. 27, 2014 @ SOSP 2014

1 / 29

SLIDE 5

Service Failure

Motivation: Failure Management

“A service failure, often abbreviated here to failure, is an event that occurs when the delivered service deviates from correct service.”

— Avizienis et al. [2004]

SLIDE 7

Reactive vs. Proactive Failure Mgmt.

Motivation: Failure Management

Reactive

[Figure: QoS (0% to 100%) over time. Marked events: Failure, Failure detected, Start recovery, System recovered.]

Proactive

[Figure: QoS (0% to 100%) over time. Marked events: Failure, Failure predicted, Prepare recovery, System recovered.]

SLIDE 8

Agenda

  1. Motivation: Failure Management
  2. [Recap] Hora: Online Failure Prediction for CB Systems
  3. Hora: Framework and Implementation
  4. Evaluation
  5. Conclusion

SLIDE 9

Related Approaches vs. Hora Approach

[Recap] Hora: Online Failure Prediction for CB Systems

Amin et al. [2012], Liang et al. [2007]

SLIDE 10

Related Approaches vs. Hora Approach

[Recap] Hora: Online Failure Prediction for CB Systems

Bielefeld [2012]

SLIDE 12

Related Approaches vs. Hora Approach

[Recap] Hora: Online Failure Prediction for CB Systems

[Figure: Component Dependency + Component-level Prediction Models, combined in the System-level Prediction Model]

Pitakrat [2013], Pitakrat et al. [2014b]

SLIDE 14

Hora: Component-level Prediction Models

[Recap] Hora: Online Failure Prediction for CB Systems

SLIDE 16

Hora: Framework Architecture

[Recap] Hora: Online Failure Prediction for CB Systems

[Figure: Hora framework architecture]

  • Hora: System-level Predictor, Monitoring Reader, CDT
  • Component-level Predictors: PAD, HDD Failure Predictor, Event Log Analyzer
  • Tools: Kieker, Weka, R, ESPER, ...
  • Architectural models: PCM, SLAstic, ...

Questions:

  1. What is a suitable model for the System-level Prediction Model (SPM)?
  2. How to transform architectural models to the CDT and to the SPM?
  3. How does Hora improve online failure prediction?

Becker et al. [2009], Bielefeld [2012], Pitakrat et al. [2013; 2014a], van Hoorn [2014]

SLIDE 20

System-level Prediction Model

Hora: Framework and Implementation

Bayesian network: probabilistic graphical model

Example nodes: Rain, Sprinkler, Grass wet

Rain:

  T     F
  0.2   0.8

Sprinkler, given Rain:

  Rain  |  T      F
  F     |  0.4    0.6
  T     |  0.01   0.99

Grass wet, given Sprinkler and Rain:

  Sprinkler   Rain  |  T      F
  F           F     |  0.0    1.0
  F           T     |  0.8    0.2
  T           F     |  0.9    0.1
  T           T     |  0.99   0.01

Bayesian network library used in Hora: https://github.com/kutschkem/Jayes
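
To make the rain/sprinkler/grass example concrete, the following is a minimal sketch of exact inference by enumeration in plain Python. It does not use the Jayes API; the names `P_RAIN`, `P_SPRINKLER`, `P_GRASS_WET`, `joint`, and `posterior_rain_given_wet` are hypothetical. The assignment of the sprinkler rows to Rain = F and Rain = T is an assumption based on the common form of this textbook example, since the extracted slide text does not preserve the row labels reliably.

```python
from itertools import product

# CPTs from the slide's example (True = T, False = F).
# Assumption: the row-to-label mapping follows the usual textbook example.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {  # P(Sprinkler | Rain), outer key is the Rain value
    False: {True: 0.4, False: 0.6},
    True: {True: 0.01, False: 0.99},
}
P_GRASS_WET = {  # P(Grass wet = T | Sprinkler, Rain)
    (False, False): 0.0,
    (False, True): 0.8,
    (True, False): 0.9,
    (True, True): 0.99,
}


def joint(rain, sprinkler, wet):
    """Joint probability as the product of the three local CPT entries."""
    p_wet = P_GRASS_WET[(sprinkler, rain)]
    return (P_RAIN[rain]
            * P_SPRINKLER[rain][sprinkler]
            * (p_wet if wet else 1.0 - p_wet))


def posterior_rain_given_wet():
    """P(Rain = T | Grass wet = T), summing out Sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True)
                   for r, s in product((True, False), repeat=2))
    return numerator / evidence


print(round(posterior_rain_given_wet(), 4))  # 0.3577
```

Enumeration is exponential in the number of nodes, so it only suits toy networks like this one; a library such as Jayes would be the practical choice for the larger networks shown later in the talk.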

SLIDE 24

Component Dependency Table

Component Dependencies

Hora: Framework and Implementation

[Table: dependency matrix over C1, C2, C3, H1, H2; a bullet (•) marks each dependency of the row component on the column component]

SLIDE 25

Component Dependency Table

Dependency Calling Probability

Hora: Framework and Implementation

(row: dependent component; column: dependency; value: calling probability)

        C1    C2    C3    H1    H2
  C1          0.5   0.5   1.0
  C2                0.5   1.0
  C3                            1.0
  H1
  H2
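
As a rough illustration of how a Component Dependency Table could be turned into the structure of a Bayesian network, the sketch below stores the table as a nested dictionary and reads the parents of each node from its row. This is a sketch under assumptions, not Hora's implementation: the placement of the values in columns is reconstructed from the flattened slide text, and `CDT`, `parents_from_cdt`, and `topological_order` are hypothetical names.

```python
# CDT[x][y] = calling probability of the dependency of x on y.
# Assumption: column placement reconstructed from the flattened slide table.
CDT = {
    "C1": {"C2": 0.5, "C3": 0.5, "H1": 1.0},
    "C2": {"C3": 0.5, "H1": 1.0},
    "C3": {"H2": 1.0},
    "H1": {},
    "H2": {},
}


def parents_from_cdt(cdt):
    """A dependency of x on y becomes a network edge y -> x (y is a parent of x)."""
    return {node: sorted(deps) for node, deps in cdt.items()}


def topological_order(parents):
    """Order nodes so that every parent precedes its children (assumes no cycles)."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent in parents[node]:
            visit(parent)
        order.append(node)

    for node in sorted(parents):
        visit(node)
    return order


PARENTS = parents_from_cdt(CDT)
ORDER = topological_order(PARENTS)
```

With this table, `PARENTS["C2"]` is `["C3", "H1"]` and `PARENTS["C3"]` is `["H2"]`, which matches the parent sets of the probability tables for C2 and C3 shown on the System-level Prediction Model slides.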

SLIDE 28

System-level Prediction Model

Hora: Framework and Implementation

Component Dependency Table (C1, C2, C3, H1, H2) with the probability table of node H2:

  H2 ✓   H2 ✗
  0.8    0.2

SLIDE 29

System-level Prediction Model

Hora: Framework and Implementation

Component Dependency Table (C1, C2, C3, H1, H2) with the conditional probability table of node C3:

  H2  |  C3 ✓   C3 ✗
  ✓   |  0.9    0.1
  ✗   |  0.0    1.0

SLIDE 30

System-level Prediction Model

Hora: Framework and Implementation

Component Dependency Table (C1, C2, C3, H1, H2) with the conditional probability table of node C2:

  H1   C3  |  C2 ✓   C2 ✗
  ✓    ✓   |  0.9    0.1
  ✓    ✗   |  0.5    0.5
  ✗    ✓   |  0.0    1.0
  ✗    ✗   |  0.0    1.0
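
The probability tables for H2, C3, and C2 are enough to trace how a failure probability propagates from a host to the components that depend on it. The sketch below is a hand-rolled calculation, not Hora's code. It assumes that ✓ means operational and ✗ means failed, and it treats H1 as observed evidence because the extracted slides give no prior for H1. All names are hypothetical.

```python
# True = operational (✓), False = failed (✗); an assumption about the symbols.
P_H2_OK = 0.8
P_C3_OK_GIVEN_H2 = {True: 0.9, False: 0.0}
P_C2_OK_GIVEN_H1_C3 = {  # keyed by (H1 ok, C3 ok)
    (True, True): 0.9,
    (True, False): 0.5,
    (False, True): 0.0,
    (False, False): 0.0,
}


def p_c3_ok(p_h2_ok=P_H2_OK):
    """Marginal P(C3 ok), summing over the two states of H2."""
    return sum(
        (p_h2_ok if h2_ok else 1.0 - p_h2_ok) * P_C3_OK_GIVEN_H2[h2_ok]
        for h2_ok in (True, False)
    )


def p_c2_ok(h1_ok, p_h2_ok=P_H2_OK):
    """P(C2 ok | H1 observed), summing over the two states of C3."""
    c3_ok = p_c3_ok(p_h2_ok)
    return (c3_ok * P_C2_OK_GIVEN_H1_C3[(h1_ok, True)]
            + (1.0 - c3_ok) * P_C2_OK_GIVEN_H1_C3[(h1_ok, False)])


# Failure probabilities with the slide's prior for H2.
print(round(1.0 - p_c3_ok(), 3))      # 0.28
print(round(1.0 - p_c2_ok(True), 3))  # 0.212
```

If a component-level predictor raised the failure probability of H2, for example by calling `p_c2_ok(True, p_h2_ok=0.1)`, the failure probability of C2 would rise to 0.464, which illustrates the propagation effect the system-level model is meant to capture.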

SLIDE 35

Experiment Setup

Evaluation

  • System under analysis
  • Fault injection
  • Predicting response time violation using time series forecasting
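
The slide names time series forecasting but the extracted text does not say which forecasting model was used. As a minimal sketch under that caveat, the stand-in below uses a simple drift forecast (last value plus average step) with Gaussian forecast error to estimate the probability that the response time exceeds a threshold. The function `violation_probability` and its parameters `history`, `threshold`, and `lead_steps` are hypothetical, and the technique is a placeholder, not the method from the talk.

```python
from statistics import NormalDist, mean, stdev


def violation_probability(history, threshold, lead_steps=1):
    """Estimate P(response time > threshold) lead_steps observations ahead.

    Stand-in model: drift forecast, with the error modeled as Gaussian noise
    whose variance grows linearly with the lead time.
    """
    if len(history) < 3:
        raise ValueError("need at least three observations")
    steps = [b - a for a, b in zip(history, history[1:])]
    forecast = history[-1] + lead_steps * mean(steps)
    sigma = stdev(steps) * lead_steps ** 0.5
    if sigma == 0:
        # No variation in the steps: the forecast is treated as certain.
        return 1.0 if forecast > threshold else 0.0
    return 1.0 - NormalDist(forecast, sigma).cdf(threshold)


# Hypothetical response times in nanoseconds with an upward trend.
rising = [1.0e6, 1.2e6, 1.5e6, 1.7e6, 2.0e6]
p_near = violation_probability(rising, threshold=2.1e6, lead_steps=2)
p_far = violation_probability(rising, threshold=4.0e6, lead_steps=2)
```

Here `p_near` is close to 1 because the forecast of 2.5e6 ns lies well above the 2.1e6 ns threshold, while `p_far` is close to 0. The output of such a component-level predictor is the kind of failure probability that a system-level model could take as input.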

SLIDE 36

Operation Dependency Graph

Evaluation

[Figure: operation dependency graph of JPetStore; nodes are operations grouped by deployment component, edges are annotated with call counts]

Execution container: jpetstore.horajpetstore.rssperf.emulab.net

Deployment components:

  • @1: org.apache.ibatis.executor.CachingExecutor
  • @2: org.mybatis.jpetstore.persistence.AccountMapper
  • @3: org.mybatis.jpetstore.service.AccountService
  • @4: org.mybatis.jpetstore.persistence.ProductMapper
  • @5: org.mybatis.jpetstore.service.CatalogService
  • @6: org.mybatis.jpetstore.web.actions.AccountActionBean
  • @7: org.mybatis.jpetstore.persistence.CategoryMapper
  • @8: org.mybatis.jpetstore.web.actions.CatalogActionBean
  • @9: org.mybatis.jpetstore.persistence.ItemMapper
  • @10: org.mybatis.jpetstore.web.actions.CartActionBean
  • @11: org.mybatis.jpetstore.web.actions.OrderActionBean
  • @12: org.mybatis.jpetstore.persistence.SequenceMapper
  • @13: org.mybatis.jpetstore.persistence.OrderMapper
  • @14: org.mybatis.jpetstore.persistence.LineItemMapper
  • @15: org.mybatis.jpetstore.service.OrderService

SLIDE 38

SPM for JPetStore

Evaluation

[Figure: Bayesian network (SPM) generated for JPetStore; one node per operation, named Component-id-operation]

Nodes by component:

  • OrderActionBean-0: newOrder
  • CachingExecutor-1: commit, query, update
  • SequenceMapper-2: getSequence, updateSequence
  • ItemMapper-3: getItemListByProduct, updateInventoryQuantity, getInventoryQuantity, getItem
  • CatalogService-4: getItemListByProduct, getProductListByCategory, getCategory, isItemInStock, getItem, getProduct, searchProductList
  • ProductMapper-5: getProductListByCategory, searchProductList, getProduct
  • CatalogActionBean-6: viewProduct, viewCategory
  • OrderMapper-7: insertOrder, insertOrderStatus
  • LineItemMapper-8: insertLineItem
  • OrderService-9: insertOrder
  • AccountMapper-10: getAccountByUsernameAndPassword
  • AccountService-11: getAccount
  • AccountActionBean-12: signon
  • CartActionBean-13: addItemToCart
  • CategoryMapper-14: getCategory
  • Other nodes: net, net-cpu0, net-cpu1, net-memSwap

Excerpt: OrderActionBean-0-newOrder, OrderService-9-insertOrder, CachingExecutor-1-update, LineItemMapper-8-insertLineItem, OrderMapper-7-insertOrder, OrderMapper-7-insertOrderStatus, SequenceMapper-2-updateSe

SLIDE 41

Component Failure Prediction

CachingExecutor-1-update

Evaluation

[Figure: two panels, CPM and SPM, for CachingExecutor-1-update. Each plots response time (ns, 0 to 5e+06) over time (12:43 to 16:27) with the monitoring data and the failure threshold, overlaid with the predicted failure probability (0.0 to 1.0).]

SLIDE 42

Component Failure Prediction

CachingExecutor-1-update

Evaluation

[Figure: ROC curves for CachingExecutor-1-update, one panel each for CPM and SPM. x-axis: false positive rate; y-axis: true positive rate; both from 0.0 to 1.0.]
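
For readers unfamiliar with ROC curves, the sketch below shows how such a curve can be computed from predicted failure probabilities and observed outcomes by sweeping the decision threshold. The data are hypothetical and not taken from the experiment; `roc_points` and `auc` are illustrative names.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    scores: predicted failure probabilities; labels: 1 = failure, 0 = no failure.
    Both classes must be present in labels.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predictions and outcomes.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
area = auc(curve)  # 8/9 for this data
```

An area of 0.5 corresponds to random guessing and 1.0 to a perfect predictor, so comparing the areas of the CPM and SPM curves is one way to summarize plots like the ones on this slide.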

SLIDE 46

Component vs. System Failure Prediction

OrderMapper-7-insertOrderStatus

Evaluation

[Figure: two panels, CPM and SPM, for OrderMapper-7-insertOrderStatus. Each plots response time (ns, 0 to 6e+06) over time (12:43 to 16:27) with the monitoring data and the failure threshold, overlaid with the predicted failure probability (0.0 to 1.0).]

SLIDE 47

Component vs. System Failure Prediction

OrderMapper-7-insertOrderStatus

Evaluation

[Figure: ROC curves for OrderMapper-7-insertOrderStatus, one panel each for CPM and SPM. x-axis: false positive rate; y-axis: true positive rate; both from 0.0 to 1.0.]

SLIDE 51

Component vs. System Failure Prediction

OrderService-9-insertOrder

Evaluation

[Figure: two panels, CPM and SPM, for OrderService-9-insertOrder. Each plots response time (ns, 0 to 4e+07) over time (12:43 to 16:27) with the monitoring data and the failure threshold, overlaid with the predicted failure probability (0.0 to 1.0).]

SLIDE 52

Discussion of Preliminary Results

Evaluation

  • CPMs perform quite well but can still be improved
  • SPM shows potential to predict failure propagation using component dependencies
  • Although the failure propagates to other components, the predictions are not very good because the failure thresholds are not set properly

SLIDE 57

Summary

Conclusion

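[Figure: Component Dependency + Component-level Prediction Models, combined in the System-level Prediction Model]

Component Dependency Table (row: dependent component; column: dependency; value: calling probability):

        C1    C2    C3    H1    H2
  C1          0.5   0.5   1.0
  C2                0.5   1.0
  C3                            1.0
  H1
  H2

Conditional probability table of node C2:

  H1   C3  |  C2 ✓   C2 ✗
  ✓    ✓   |  0.9    0.1
  ✓    ✗   |  0.5    0.5
  ✗    ✓   |  0.0    1.0
  ✗    ✗   |  0.0    1.0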

SLIDE 58

Next Steps

Conclusion

  • Improve prediction of CPMs
      • Apply different prediction techniques
  • Calibrate the configuration parameters of Hora
      • Failure threshold
      • Lead time
      • ...
  • Extend evaluation settings
      • Evaluate with distributed lab study
      • Evaluate with large-scale production systems
  • Develop a failure injection framework/benchmark
SLIDE 59

Literature

  • A. Amin, A. Colman, and L. Grunske. An approach to forecasting QoS attributes of web services based on ARIMA and GARCH models. In Proceedings of the 19th IEEE International Conference on Web Services, pages 74–81, June 2012.
  • A. Avizienis, J.-C. Laprie, B. Randell, and C. Landwehr. Basic concepts and taxonomy of dependable and secure computing. IEEE Transactions on Dependable and Secure Computing, 1(1):11–33, 2004. ISSN 1545-5971. doi: 10.1109/TDSC.2004.2.
  • S. Becker, H. Koziolek, and R. Reussner. The Palladio component model for model-driven performance prediction. Journal of Systems and Software, 82(1):3–22, 2009.
  • T. C. Bielefeld. Online performance anomaly detection for large-scale software systems. Diploma thesis, Kiel University, Mar. 2012.
  • Y. Liang, Y. Zhang, H. Xiong, and R. K. Sahoo. Failure prediction in IBM BlueGene/L event logs. In Proceedings of the 7th IEEE International Conference on Data Mining, pages 583–588, 2007.
  • T. Pitakrat. Hora: Online failure prediction framework for component-based software systems based on Kieker and Palladio. In Proceedings of the Symposium on Software Performance: Joint Kieker/Palladio Days 2013, CEUR Workshop Proceedings, 2013.
  • T. Pitakrat, A. van Hoorn, and L. Grunske. A comparison of machine learning algorithms for proactive hard disk drive failure detection. In Proceedings of the 4th International ACM SIGSOFT Symposium on Architecting Critical Systems, pages 1–10. ACM, 2013.
  • T. Pitakrat, J. Grunert, O. Kabierschke, F. Keller, and A. van Hoorn. A framework for system event classification and prediction by means of machine learning. In Proceedings of the 8th International Conference on Performance Evaluation Methodologies and Tools (ValueTools 2014), 2014a.
  • T. Pitakrat, A. van Hoorn, and L. Grunske. Increasing dependability of component-based software systems by online failure prediction. In Proceedings of the 10th European Dependable Computing Conference (EDCC), pages 66–69, May 2014b.
  • A. van Hoorn. Model-Driven Online Capacity Management for Component-Based Software Systems. PhD thesis, Faculty of Engineering, Kiel University, Kiel, Germany, 2014.