Berkeley/Stanford Recovery-oriented Computing Course Lecture, October 25, 2001



2001-10-ROC-Lecture Hewlett-Packard Laboratories

When bad things happen to good systems: detecting and diagnosing problems

Kimberly Keeton
HPL Storage and Content Distribution
Berkeley/Stanford Recovery-oriented Computing Course Lecture
October 18, 2001



Problem definition

• Detection: determining that a problem has occurred (or will occur)
• Diagnosis: determining the root cause of the problem
• "Problem" can be broadly defined
  – Performance-related, availability-related, security-related
• Fields to draw from:
  – System administration, operating systems, network management, intrusion detection
• Techniques borrowed from:
  – Statistics, database data mining, AI machine learning


Outline

• Problem definition
• Detection techniques
  – Challenges
  – Change point detection
  – Time series analysis
  – Predictive detection
  – Data mining/machine learning algorithms
• Diagnosis techniques
• Additional related work
• Summary


Challenges in detecting problems

• Many types of faults
  – Persistent increase, gradual change, abrupt change, single spike
• Time-varying observed system behavior
  – Trends and seasonality (i.e., cyclic behavior)
• Distinguishing between the "good," the "bad" and the "ugly"
• Detecting problems fast enough to minimize service disruption
• Limiting false positives without neglecting true positives


Change point detection algorithms [Hellerstein98]

• Basic idea:
  – Determine when process parameters have changed
  – Declare a change point if, e.g., I/O response time is "more likely" to have come from a distribution with a different mean
• Ex: maximum likelihood ratio detection rules, such as cumulative sum (CUMSUM)



Maximum likelihood ratio

• Let Y1, Y2, …, YT be i.i.d. random variables
• Let f(Yi, θ) be the probability density function (pdf) of the random variables, where θ is the only parameter in the pdf
• Let f(·, θ0) and f(·, θ1) be different distributions
• Likelihood ratio:

    ∏(i=1..T) f(Yi, θ1) / ∏(i=1..T) f(Yi, θ0)

• Large ratio => more likely Y1, Y2, …, YT came from f(·, θ1)
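As a concrete numerical illustration (a sketch, not from the lecture), the ratio can be computed assuming Gaussian pdfs with a known common standard deviation, where θ is the mean:

```python
import math

def normal_pdf(y, mean, std):
    """Gaussian density f(y; mean, std)."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def likelihood_ratio(ys, theta0, theta1, std=1.0):
    """prod_i f(Y_i, theta1) / prod_i f(Y_i, theta0).
    A large ratio means the data are more likely under theta1."""
    ratio = 1.0
    for y in ys:
        ratio *= normal_pdf(y, theta1, std) / normal_pdf(y, theta0, std)
    return ratio
```

In practice the log of the ratio is summed instead of multiplying densities directly, to avoid floating-point underflow over long observation windows.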


Maximum likelihood ratio detection rule

• Declare that a change has occurred at time N if the likelihood ratio after the change exceeds a pre-determined threshold level c:

    N = inf{ n ≥ 1 : sup(1≤k≤n) ∏(i=k..n) f(Yi, θ1) / ∏(i=k..n) f(Yi, θ0) ≥ c }

• Ex: CUMSUM rule for normal random variables:

    N = inf{ n ≥ 1 : max(1≤k≤n) Σ(i=k..n) (Yi − Ȳ) ≥ c }

  where Ȳ is a reference value (e.g., the pre-change mean)
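The detection rule above has a convenient recursive form: the running statistic S_n = max(S_{n-1}, 0) + (Y_n − Ȳ) equals the maximum over k of the partial sums, so detection costs O(1) per observation. A minimal sketch, assuming the reference mean and threshold c are known:

```python
def cusum_detect(ys, ref_mean, c):
    """Return the first index n at which
    max over 1<=k<=n of sum_{i=k..n} (y_i - ref_mean) reaches c,
    or None if no change point is declared."""
    s = 0.0
    for n, y in enumerate(ys):
        s = max(s, 0.0) + (y - ref_mean)  # running CUMSUM statistic
        if s >= c:
            return n
    return None
```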


CUMSUM example

• Confidence level compared with bootstrapping (random permutation of data)
  – Bootstrap: flat cumulative residuals
  – CUMSUM: an angle forms at the change point

[Figure panels: raw data (change difficult to detect); CUMSUM (change easier to detect); CUMSUM confidence level]
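One common way to attach a bootstrap confidence level, sketched here under assumptions that may differ in detail from the procedure shown in the lecture: compare the range of the observed cumulative residuals against the ranges obtained from many random permutations of the data.

```python
import random

def cusum_range(ys):
    """Range (max - min) of cumulative residuals S_i = sum_{j<=i} (y_j - mean)."""
    mean = sum(ys) / len(ys)
    s = lo = hi = 0.0
    for y in ys:
        s += y - mean
        lo = min(lo, s)
        hi = max(hi, s)
    return hi - lo

def bootstrap_confidence(ys, n_boot=1000, seed=0):
    """Fraction of random permutations whose CUMSUM range falls below the
    observed one; values near 1.0 suggest a genuine change point."""
    rng = random.Random(seed)
    observed = cusum_range(ys)
    below = sum(cusum_range(rng.sample(ys, len(ys))) < observed
                for _ in range(n_boot))
    return below / n_boot
```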


Change point pros/cons

• Advantages:
  – Well-established statistical technique
  – Several variants of on-line and off-line algorithms
• Disadvantages:
  – Focuses on a single type of fault: abrupt changes
  – Mostly limited to stationary (non-time-varying) processes
      • Long-term trends and seasonality must be handled separately
  – Some dependence on knowledge of, and assumptions about, the data distributions


Outline

• Problem definition
• Detection techniques
  – Challenges
  – Change point detection
  – Time series analysis
  – Predictive detection
  – Data mining/machine learning algorithms
• Diagnosis techniques
• Additional related work
• Summary


Time series forecasting algorithms

• Basic idea:
  – Build a model of what you expect the next observation to be, and raise an alarm if observed and predicted values differ too much
• Ex: Holt-Winters forecasting [Hoogenboom93, Brutlag00]
  – 3-part model built on exponential smoothing:
    prediction = baseline + linear trend + seasonal effect
      • y'_{t+1} = a_t + b_t + c_{t+1−m}
      • baseline: a_t = α(y_t − c_{t−m}) + (1 − α)(a_{t−1} + b_{t−1})
      • linear trend: b_t = β(a_t − a_{t−1}) + (1 − β)b_{t−1}
      • seasonal trend: c_t = γ(y_t − a_t) + (1 − γ)c_{t−m}
      • where m is the period of the seasonal cycle
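The three smoothing updates can be sketched directly from the equations above. The function and its state layout are illustrative, not from the lecture: c_season is a length-m buffer in which slot t mod m holds c_{t−m} until it is overwritten with c_t.

```python
def holt_winters_step(y_t, a_prev, b_prev, c_season, t, m, alpha, beta, gamma):
    """One additive Holt-Winters update; returns (a_t, b_t, forecast y'_{t+1})."""
    c_tm = c_season[t % m]                                       # c_{t-m}
    a_t = alpha * (y_t - c_tm) + (1 - alpha) * (a_prev + b_prev)
    b_t = beta * (a_t - a_prev) + (1 - beta) * b_prev
    c_season[t % m] = gamma * (y_t - a_t) + (1 - gamma) * c_tm   # c_t
    forecast = a_t + b_t + c_season[(t + 1) % m]                 # uses c_{t+1-m}
    return a_t, b_t, forecast
```

On a series that exactly follows the model (constant level plus trend plus an alternating seasonal term), the one-step forecasts are exact once the state is initialized consistently.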


Holt-Winters measure of deviation

• Confidence bands to measure deviation in the seasonal cycle:
  – predicted deviation: d_t = γ|y_t − y'_t| + (1 − γ)d_{t−m}
  – confidence band: (y'_t − δ·d_{t−m}, y'_t + δ·d_{t−m})
• Trigger an alarm when the number of violations exceeds a threshold
  – To reduce the false alarm rate, measure across a moving, fixed-size window
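The deviation update and band test take only a few lines; the windowed alarm rule below is a sketch of the "count violations in a moving, fixed-size window" idea, with the window size and threshold left as parameters:

```python
def update_band(y_t, y_pred, d_tm, gamma, delta):
    """Holt-Winters deviation update; d_tm plays the role of d_{t-m}.
    Returns (d_t, violated) where violated means y_t left the band."""
    d_t = gamma * abs(y_t - y_pred) + (1 - gamma) * d_tm
    in_band = (y_pred - delta * d_tm) <= y_t <= (y_pred + delta * d_tm)
    return d_t, not in_band

def alarm(violations, window, threshold):
    """Alarm when violations in the last `window` points exceed `threshold`."""
    return sum(violations[-window:]) > threshold
```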


Holt-Winters example

[Figure: LU read experiment (faultlu only): response time (seconds) vs. time (minutes), with observations plotted against lowerBound and upperBound confidence bands]

• Simplified Holt-Winters: exponential smoothing
• Generally detects 10-minute changes
  – Violations occur when an observation falls outside the lower and upper bounds


Time series forecasting pros/cons

• Advantages:
  – Well-established statistical technique
  – Considers time-varying properties of data
      • Trends and seasonality (at many levels)
• Disadvantages:
  – Large number of parameters to tune for the algorithm to work correctly
  – Detecting a problem only after it occurs may imply service disruption


Outline

• Problem definition
• Detection techniques
  – Challenges
  – Change point detection
  – Time series analysis
  – Predictive detection
  – Data mining/machine learning algorithms
• Diagnosis techniques
• Additional related work
• Summary


Predictive detection [Hellerstein00]

• Basic idea:
  – Predict the probability of violations of threshold tests in advance, including how long until violation
  – Allows pre-emptive corrective action in advance of service disruption
  – Also allows service providers to give customers advance notice of potential service degradations


Predictive detection highlights

• Model both stationary and non-stationary effects
  – Stationary: multi-part model using ANOVA techniques
  – Non-stationary: use auto-correlation and auto-regression to capture short-range dependencies
• Use observed data and models to predict future transformed values for a prediction horizon
• Calculate the probability that the threshold is violated at each point in the prediction horizon
• May consider both upper and lower thresholds
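A toy version of the prediction step, under simplifying assumptions that are not from the paper: the transformed series is zero-mean and follows an AR(1) model with Gaussian noise, so the violation probability at each horizon point is a normal tail probability.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ar1_fit(xs):
    """Least-squares AR(1) coefficient and residual std for zero-mean data."""
    phi = sum(xs[t] * xs[t - 1] for t in range(1, len(xs))) / \
          sum(x * x for x in xs[:-1])
    resid = [xs[t] - phi * xs[t - 1] for t in range(1, len(xs))]
    return phi, math.sqrt(sum(r * r for r in resid) / len(resid))

def violation_probs(xs, threshold, horizon):
    """P(x_{t+h} > threshold) for h = 1..horizon under the fitted AR(1) model."""
    phi, sigma = ar1_fit(xs)
    probs, mean, var = [], xs[-1], 0.0
    for _ in range(horizon):
        mean *= phi                            # h-step conditional mean
        var = var * phi * phi + sigma * sigma  # h-step conditional variance
        probs.append(1.0 - norm_cdf((threshold - mean) / math.sqrt(var)))
    return probs
```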


Predictive detection example

• Transform data and thresholds
  – Measured (time-varying) values are transformed into (stationary) values
  – The constant raw threshold is likewise transformed into (time-varying) thresholds
• Predict future values and the probability of threshold violation

[Figure: two panels over time t−2 … t+3: the raw metric of interest with its constant threshold, and the transformed metric of interest]


Outline

• Problem definition
• Detection techniques
  – Challenges
  – Change point detection
  – Time series analysis
  – Predictive detection
  – Data mining/machine learning algorithms
• Diagnosis techniques
• Additional related work
• Summary


Data mining/machine learning algorithms [Lee98]

• Basic idea:
  – Context: intrusion detection
  – Use data mining techniques to discover patterns describing program and user behavior
  – Compute classifiers (rule sets) that can recognize anomalies
• Types of algorithms:
  – Classification: map a data item into one of several pre-defined categories
  – Link analysis: determine relations between fields in a database
      • Ex: association rules algorithm
  – Sequence analysis: model sequential (time-based) patterns
      • Ex: frequent episodes algorithm


Classification algorithms

• Goal: use machine learning to classify "normal" and "abnormal" behavior, and to detect anomalies
  – Ex: system call sequences for sendmail
• Learning:
  – Training input: pre-labeled "normal" and "abnormal" data
  – Compute the "identity" of a program by developing rules for normal behavior
  – Output: a set of if-then rules for the "normal" classes, and a default "true" rule for the remaining classes
• Detection:
  – Raise an alarm if the percentage of abnormal regions is above a threshold
  – May also combine classifiers to do meta-detection
• Question: how to devise rule sets?
  – Idea: association rules and frequent sets


Association rules algorithm [Srikant95]

• Used to derive multi-feature (attribute) correlations from a database table (or audit trail)
  – Ex: determining which items are often purchased together by customers
• Motivation:
  – Evidence that program executions and user activities exhibit frequent correlations among system features
  – Consistent behaviors can be captured in association rules
  – New rules can be continuously merged in
• Format: X -> Y, confidence, support
  – X and Y are subsets of the items in a record
  – Support: percentage of records that contain X + Y
  – Confidence: support(X + Y)/support(X)
• Command history ex: trn -> rec.humor; [0.3, 0.1]
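Support and confidence follow directly from the definitions. A sketch over an invented command-history dataset (each record is a set of items such as commands and newsgroups):

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(1 for r in records if items <= set(r)) / len(records)

def confidence(records, x, y):
    """Confidence of the rule X -> Y: support(X + Y) / support(X)."""
    return support(records, set(x) | set(y)) / support(records, x)
```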


Frequent episodes algorithm [Srikant96]

• Used to identify a set of events that occur (together) frequently within a time window
  – Serial episode: events must occur in partial order in time
  – Parallel episode: no such ordering constraint
• Motivation:
  – Evidence that sequence information in program executions and user commands can be used to build profiles for anomaly detection
• Format: X -> Y, confidence, support, time window
  – X and Y are frequent episodes
  – Support: frequency(X + Y)
  – Confidence: frequency(X + Y)/frequency(X)
• Web site ex: home, research -> theory; [0.2, 0.05], [30s]
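A sketch of the parallel-episode case (no ordering constraint), counting unit-step sliding windows that contain every event type in the episode; the event data and window size are invented for illustration:

```python
def episode_frequency(events, episode, window):
    """Fraction of unit-step windows [t, t+window) over the event span that
    contain every event type in `episode` (parallel episode: order ignored).
    `events` is a list of (integer timestamp, event_type) pairs sorted by time."""
    episode = set(episode)
    start, end = events[0][0], events[-1][0]
    hits = total = 0
    for t in range(start, end + 1):
        types = {e for ts, e in events if t <= ts < t + window}
        hits += episode <= types
        total += 1
    return hits / total

def episode_confidence(events, x, y, window):
    """Confidence of the episode rule X -> Y: frequency(X + Y)/frequency(X)."""
    return (episode_frequency(events, set(x) | set(y), window) /
            episode_frequency(events, x, window))
```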


Overall system design

• Learning agents build and maintain the (evolving) rule set:
  – For each new run:
      • Compare the rule set from the new run against the aggregate set
      • If a match is found, increment the match count of the matched rule
      • Otherwise, add the rule and set its match count to 1
  – When the rule set stabilizes, prune it by eliminating rules with low match counts
• Detection agents
  – Discovered patterns from audit data can be used for anomaly detection
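The merge-and-prune loop for the learning agents is straightforward to sketch; rules are represented here simply as strings, and the stabilization test is left to the caller:

```python
def merge_run(aggregate, new_rules):
    """Merge one run's rule set into the aggregate, tracking match counts:
    a matched rule's count is incremented; a new rule starts at 1."""
    for rule in new_rules:
        aggregate[rule] = aggregate.get(rule, 0) + 1
    return aggregate

def prune(aggregate, min_count):
    """Once the rule set stabilizes, drop rules with low match counts."""
    return {r: c for r, c in aggregate.items() if c >= min_count}
```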


Data mining pros/cons

• Advantages:
  – Training data can be accumulated over time
  – Evidence that normal behavior exhibits correlations
• Disadvantages:
  – Need a large amount of training data to compute rule sets
      • Assumes training data is nearly "complete" with respect to all possible normal behavior, since the algorithms only detect patterns present in the training data
  – Hard to determine the right set of features to include in the audit trail
      • May require lots of data pre-processing if the feature set is incomplete


Outline

• Problem definition
• Detection techniques
• Diagnosis techniques
  – Challenges
  – Dependency models
  – Active dependency discovery
• Additional related work
• Summary


Challenges for problem diagnosis

• Mapping end-user or SLA-level symptoms to deeper root causes
• Dealing with system complexity to pinpoint the problem source
• Capturing different types of dependencies and their strengths
  – Static
  – Runtime
  – Distributed
• Capturing dependencies at a detailed level of system resources
• Capturing dependencies that are relevant for a particular workload


Dependency models in a nutshell

• Use a graph (DAG) structure to capture dependencies between system components
  – if failure of A affects B, then B depends on A
  – edge weights represent dependency strengths

[Figure: dependency graph for a customer e-commerce application: a Web Application Service depending on Web Service, Name Service, IP Service, DB Service, and OS, with edge weights w1–w8. Slide from [Brown01]]


Dependency modeling uses

• Event correlation systems [Yemini96, Choi99, Gruschke98]
  – Incoming events and alarms are mapped onto the nodes of the dependency graph corresponding to the events' origins
  – The graph is processed to identify the nodes on which most alarm/event nodes depend
      • Likely root causes of the observed alarms/events
  – Repeat until a likely single root cause is selected
• Model graph as a map for systematic examination [Katker97]
  – An incoming problem is mapped onto a node of the dependency graph
  – A "horizontal" search tests each component in the path to identify any that are faulty
  – For each faulty node, a "vertical" search recursively examines the nodes on which the faulty node depends
  – Repeat until the root cause node (one not dependent on other faulty nodes) is identified
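The event-correlation idea can be sketched on a toy dependency graph: score each node by how many alarmed nodes transitively depend on it, and take the highest-scoring node as the likely root cause. The graph encoding and service names below are illustrative, not from the cited systems.

```python
def dependents(graph, node):
    """All nodes that transitively depend on `node`.
    graph[a] is the set of nodes directly affected by a's failure."""
    seen, stack = set(), [node]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def likely_root_cause(graph, alarms):
    """Node on which the most alarmed nodes depend (counting itself)."""
    nodes = set(graph).union(*graph.values())
    alarms = set(alarms)
    return max(nodes, key=lambda n: len((dependents(graph, n) | {n}) & alarms))
```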


Dependency model pros/cons

• Advantages
  – Do not require a priori existence of detailed knowledge bases
      • An advantage over rule-, code- and case-based root cause analysis
• Disadvantages
  – Most systems don't discuss details of how the required dependency models are obtained
  – Static dependency models are inadequate
      • Don't capture dynamic dependencies
      • Capture all potential dependencies, resulting in an overwhelming state space to search


Outline

• Problem definition
• Detection techniques
• Diagnosis techniques
  – Challenges
  – Dependency models
  – Active dependency discovery
• Additional related work
• Summary


Active dependency discovery (ADD) [Brown01]

• Basic idea:
  – Instrument the system and apply a workload
  – Systematically perturb components
  – Measure system change in response
  – Construct a dependency model from the observed data
• Dependency model
  – Table with rows defining resources, columns describing requests
      • The value in a cell corresponds to dependency strength
  – Strengths computed as the slope of a linear regression on the mean of the log of the observed data
      • A statistically positive slope gives the dependency strength


ADD diagnosis

• When a problem occurs, diagnose using dependencies:
  – Identify the faulty request
      • from a problem report, SLA violation, test requests, ...
  – Select the appropriate column in the dependency table
  – Select the rows representing dependencies (potential root causes)
  – Investigate potential root causes, starting with those of highest weight
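The table lookup itself is simple. A sketch with an invented dependency table (resources as rows, request types as columns, strengths in cells):

```python
def diagnose(dep_table, faulty_request):
    """Rank potential root causes for one request type: take that column of
    the dependency table and sort resources by descending strength."""
    causes = [(res, row[faulty_request])
              for res, row in dep_table.items()
              if row.get(faulty_request, 0.0) > 0.0]
    return sorted(causes, key=lambda rc: rc[1], reverse=True)
```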


ADD pros/cons

• Advantages:
  – Coverage can be guaranteed by explicitly controlling the perturbation
  – Causality is easy to establish: the perturbation is the cause
  – Simplicity
      • No application modeling or modification necessary
      • Existing endpoint instrumentation may be sufficient
      • No complex data mining required
• Disadvantages:
  – Invasive nature may make perturbation on production systems difficult
      • Leverage redundancy if available (e.g., a clustered system)
      • Run perturbation during non-production periods (initial system setup or scheduled downtime)
      • Develop low-grade perturbation techniques
  – Extracted models are only valid for the applied workload


Outline

• Problem definition
• Detection techniques
• Diagnosis techniques
• Additional related work
  – Presentation of distributed information
  – Introspective systems
  – Self-managing systems
  – Measurement studies of system availability
  – Convergent systems/computer immunology
• Summary


Presentation of distributed information

• Sometimes the solution to diagnosis is to consult a human expert
• Body of work on how to present information to the human expert
• Ex: CARD system for monitoring large clusters [Anderson97]
  – Hierarchy of databases for collecting monitoring data, using a hybrid push-pull model for communicating the data
  – Aggregation of information
      • Combine the same statistic across different nodes (e.g., avg, stdev of CPU utilization across machines)
      • Aggregate across statistics (e.g., combine CPU, disk and network utilization to get overall machine utilization)
  – Visualization techniques
      • Averages visualized as bar height
      • Standard deviations as bar shade
      • Different colors for different characteristics, and to indicate up/down

Introspective systems

• ISTORE [Brown99]
  – Internal monitoring using sensors
  – Evaluation of software triggers (predicates over system state) signals potential problems
  – Adaptation code is invoked to deal with anomalies
• OceanStore [Kubiatowicz00]
  – Internal event monitoring and analysis of usage patterns, network activity, and resource availability
  – Information used to:
      • Adapt to regional outages and denial-of-service attacks
      • Pro-actively migrate data toward areas of use
      • Maintain sufficiently high levels of data redundancy


Self-managing systems

• IBM’s eLiza autonomous system
  – Goal: “systems that can configure, optimize, heal and protect themselves, while the user focuses on the more significant things”
  – Includes Oceano
• HPL’s self-managing storage system [Anderson02]
  – Automation of initial storage system configuration and workload evolution management


Measurement studies of system availability

• [Long95]
  – Fault-tolerant tool to directly measure time-to-failure (TTF), time-to-repair (TTR) and availability by polling many sites frequently from several locations
• [Iyer99]
  – Availability analysis of a LAN of Unix-based workstations, a LAN of Windows NT-based machines, and the Internet


Convergent systems/computer immunology

• Biological view of autonomous systems [Burgess98]
• Specify the “healthy” state of the system (e.g., pseudo-invariants)
• When problems arise, don’t necessarily try to distinguish between their symptoms and their cause
  – Shimon Peres analogy: some “illnesses” can’t be cured, only their symptoms treated
• Use rules to move the system closer (converging) to healthy
  – Treat the symptoms
• Ex:
  – Invariant: database transactions shouldn’t experience deadlock
  – If deadlock is detected, shoot down the player(s) and restart
• Related to design for restartability


Summary

• Detection techniques
  – Change point detection
  – Time series analysis
  – Predictive detection
  – Data mining/machine learning algorithms
• Diagnosis techniques
  – Dependency models
  – Active dependency discovery
• Additional related work
  – Presentation of distributed information
  – Introspective and self-managing systems
  – Measurement-based studies of system availability
  – Convergent systems/computer immunology


Discussion issues

• How to monitor?
  – Internal vs. external to the system
  – Choosing which and how many metrics to monitor
• Distinguishing good and bad behavior
• Active vs. passive techniques
  – What (performance, availability) fault load to use?
• Automating root-cause analysis using dependency models
• Methods for tagging requests as they travel throughout the system
• Design for restartability in the context of convergent systems
• Methods for automatically detecting human response to failure


Acknowledgments

• Aaron Brown, Berkeley
• Jerry Shan, HPL Decision Technologies Department
• Angela Hung, Stanford