STK 4290 PARAMETRIC LIFETIME MODELING Slides 1 Bo Lindqvist - - PowerPoint PPT Presentation



SLIDE 1

STK 4290 PARAMETRIC LIFETIME MODELING

Slides 1

Bo Lindqvist
Department of Mathematical Sciences
Norwegian University of Science and Technology, Trondheim
http://www.math.ntnu.no/~bo/
bo@math.ntnu.no

University of Oslo, Spring 2014

Bo Lindqvist Slides 1 () STK 4290 PARAMETRIC LIFETIME MODELING 1 / 64

SLIDE 2

LIFETIMES (WIDELY DEFINED)

Reliability engineering:
  • Time to failure of a component or a system
  • Number of cycles to failure (fatigue testing)
  • Times between successive failures of a machine

Medical research:
  • Time to death of a patient after start of certain treatment
  • Time from entrance to discharge from a hospital
  • Times between successive epileptic seizures for a patient

SLIDE 3

RELIABILITY AND SURVIVAL

Common technical definition of reliability:
  • The probability that a system or a component will perform its intended task, under given operational conditions, for a specified time period.

Lifetime (survival time) in medical research:
  • Time to occurrence of some event of interest for individuals in some population. The event may or may not be “death”, and is often referred to as “failure”.

SLIDE 4

WHY COLLECT AND ANALYZE LIFETIME/SURVIVAL/RELIABILITY DATA?

Reliability engineering:
  • Assess reliability of a system/component/product
  • Compare two or more products with respect to reliability
  • Predict product reliability in the design phase
  • Predict warranty claims for a product in the market

Medical research:
  • Compare different treatments with respect to survival or recurrence
  • Predict the outcome of an intervention or the life expectancy after the intervention
  • Identify risk factors for diseases and assess their magnitude

SLIDE 5

SPECIAL ASPECTS OF LIFETIME ANALYSIS IN STATISTICS

  • Definitions of starting time and failure time can be difficult
  • Definition of time scale (operation time, calendar time, number of cycles)
  • Censored data (how can we use data from individuals or units for which the event of interest has not occurred within the observation period?)
  • Effect of covariates (demographic, medical, environmental)
  • What if an individual or unit dies or fails from a cause other than the one we would like to study? (“competing risks”)
  • Recurrent events – what if the system can fail several times; how to analyze recurring stages of a disease?

SLIDE 6

BALL BEARING FAILURE DATA

Data: Millions of revolutions to fatigue failure for 23 units.

   17.88   28.92   33.00   41.52   42.12   45.60   48.40   51.84
   51.96   54.12   55.56   67.80   68.64   68.64   68.88   84.12
   93.12   98.64  105.12  105.84  127.92  128.04  173.40

Question: How can we fit a parametric lifetime distribution to these data?

SLIDE 7

BALL BEARING FAILURE DATA (EVENT PLOT)

SLIDE 8

IC DATA (MEEKER, 1987)

Questions of interest:
  • How to estimate the distribution of the failure time when there are censored observations?
  • Probability of failure before 100 hours?
  • Failure rate by 100 hours?
  • Proportion failed after 105 hours?

SLIDE 9

IC DATA (EVENT PLOT)

SLIDE 10

SURVIVAL OF MULTIPLE MYELOMA PATIENTS

Multiple myeloma is a malignant disease characterised by the accumulation of abnormal plasma cells, a type of white blood cell, in the bone marrow.

Data (next slide) from Medical Center of the University of West Virginia, USA.

Aim: To examine the association between certain explanatory variables or covariates and the survival time of patients (in months from diagnosis until death from multiple myeloma).

SLIDE 11

MULTIPLE MYELOMA DATA

SLIDE 12

TYPICAL EXAM EXERCISE CASE

SLIDE 13

RECURRENT EVENTS/REPAIRABLE SYSTEMS

SLIDE 14

VALVE SEAT REPLACEMENT DATA

Data on previous slide are collected from valve seats from a fleet of 41 diesel engines. Each engine has 16 valves. (Time unit is days of operation.)

Questions of interest:
  • Does the replacement rate increase with age?
  • How many replacement valves will be needed in the future?
  • Can valve life in these systems be modeled as a renewal process?

SLIDE 15

ESTIMATED NUMBER OF VALVE SEAT REPLACEMENTS

Middle curve is cumulative estimated number of replacements for one engine, as a function of age. Lower and upper curves are 95% confidence limits.

SLIDE 16

LIFETIME

The lifetime $T$ of an individual or unit is a positive and continuously distributed random variable.

The probability density function (pdf) is usually called $f(t)$. The cumulative distribution function (cdf) $F(t)$ is then given by
$$F(t) = P(T \le t) = \int_0^t f(u)\,du,$$
and the reliability (or survival) function is defined as
$$R(t) = P(T > t) = 1 - F(t) = \int_t^\infty f(u)\,du.$$

SLIDE 17

EXAMPLE: EXPONENTIAL DISTRIBUTION

$$f(t) = \lambda e^{-\lambda t}, \qquad F(t) = 1 - e^{-\lambda t}, \qquad R(t) = e^{-\lambda t}$$

SLIDE 18

INTERPRETATION OF DENSITY FUNCTION

$$f(t) = F'(t)$$
$$P(a < T \le b) = \int_a^b f(u)\,du = F(b) - F(a)$$
$$P(t < T \le t + h) = \int_t^{t+h} f(u)\,du \approx f(t) \cdot h$$
Hence,
$$f(t) \approx \frac{P(t < T \le t + h)}{h}$$

SLIDE 19

HAZARD FUNCTION OF T

Suppose we know that a unit is alive (functioning) at time $t$, i.e. $T > t$. Then it is of interest to consider
$$P(t < T \le t + h \mid T > t) = \frac{P(t < T \le t + h)}{P(T > t)} \approx \frac{f(t)\,h}{R(t)}$$
(Recall conditional probability: $P(A \mid B) = P(A \cap B)/P(B)$.)

From this we define the hazard function (also called hazard rate or failure rate) of $T$ at time $t$ by:
$$z(t) = \lim_{h \to 0} \frac{P(t < T \le t + h \mid T > t)}{h} = \frac{f(t)}{R(t)}$$
Example: For the exponential distribution we have $f(t) = \lambda e^{-\lambda t}$ and $R(t) = e^{-\lambda t}$, so
$$z(t) = \frac{f(t)}{R(t)} = \lambda$$
(not depending on time!).

SLIDE 20

USEFUL RELATIONS BETWEEN FUNCTIONS DESCRIBING T

Since $F(t) = 1 - R(t)$ we get $f(t) = F'(t) = -R'(t)$, and hence
$$z(t) = \frac{f(t)}{R(t)} = \frac{-R'(t)}{R(t)}$$
Thus we can write
$$\frac{d}{dt}\bigl(\ln R(t)\bigr) = -z(t) \;\Rightarrow\; \ln R(t) = -\int_0^t z(u)\,du + c \;\Rightarrow\; R(t) = e^{-\int_0^t z(u)\,du + c}$$
Since $R(0) = 1$, we have $c = 0$, so
$$R(t) = e^{-\int_0^t z(u)\,du} \equiv e^{-Z(t)}$$
where $Z(t) = \int_0^t z(u)\,du$ is called the cumulative hazard function.

SLIDE 21

USEFUL RELATIONS (CONT.)

Recall from last slide:
$$Z(t) = \int_0^t z(u)\,du, \qquad z(t) = Z'(t), \qquad R(t) = e^{-Z(t)}$$
Since $f(t) = F'(t) = -R'(t)$, it follows that
$$f(t) = z(t)\,e^{-\int_0^t z(u)\,du} = z(t)\,e^{-Z(t)} \qquad (1)$$
For the exponential distribution, $Z(t) = \int_0^t \lambda\,du = \lambda t$, so (1) gives (the well known formula)
$$f(t) = \lambda e^{-\lambda t}$$

SLIDE 22

OVERVIEW OF FUNCTIONS DESCRIBING DISTRIBUTION OF LIFETIME T

Function               Formula                         Exponential distr.
Density (pdf)          $f(t)$                          $\lambda e^{-\lambda t}$
Cum. distr. (cdf)      $F(t)$                          $1 - e^{-\lambda t}$
Rel/surv function      $R(t) = 1 - F(t)$               $e^{-\lambda t}$
Hazard function        $z(t) = f(t)/R(t)$              $\lambda$
Cum. hazard function   $Z(t) = \int_0^t z(u)\,du$      $\lambda t$

$$R(t) = e^{-Z(t)} = e^{-\lambda t}, \qquad f(t) = z(t)\,e^{-Z(t)} = \lambda e^{-\lambda t}$$
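The relations in the overview can be checked numerically. The following is a minimal sketch, not part of the original slides: it starts from a hazard function, integrates it with the trapezoidal rule to get $Z(t)$, and then forms $R(t) = e^{-Z(t)}$ and $f(t) = z(t)e^{-Z(t)}$. The function names, the rate `lam = 0.5` and the linear hazard $z(u) = 2u$ are illustrative assumptions.

```python
import math

def cumulative_hazard(z, t, steps=10_000):
    """Approximate Z(t), the integral of z(u) over [0, t], by the trapezoidal rule."""
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (z(0.0) + z(t))
    for i in range(1, steps):
        total += z(i * h)
    return total * h

def reliability(z, t):
    """R(t) = exp(-Z(t))."""
    return math.exp(-cumulative_hazard(z, t))

def density(z, t):
    """f(t) = z(t) * exp(-Z(t))."""
    return z(t) * reliability(z, t)

lam = 0.5                      # assumed rate, for illustration only
z_exp = lambda u: lam          # constant hazard of the exponential distribution
z_lin = lambda u: 2.0 * u      # assumed increasing hazard, so that Z(t) = t^2

t = 3.0
R_num = reliability(z_exp, t)  # compare with exp(-lam * t)
f_num = density(z_exp, t)      # compare with lam * exp(-lam * t)
F_num = 1.0 - R_num

print(R_num, math.exp(-lam * t))
print(f_num, lam * math.exp(-lam * t))
print(reliability(z_lin, 1.5), math.exp(-1.5 ** 2))
```

Starting from the hazard is only one possible design; the same pattern works from any one of the five functions, which is the content of the exercise on the next slide.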

SLIDE 23

EXERCISES

1. Suppose the reliability function of $T$ is $R(t) = e^{-t^{1.7}}$. Find the functions $F(t)$, $f(t)$, $z(t)$, $Z(t)$.
2. Show that if you get to know only one of the functions $R(t)$, $F(t)$, $f(t)$, $z(t)$, $Z(t)$, then you can still compute all the others!

SLIDE 24

BATHTUB CURVE

SLIDE 25

MORE ON THE HAZARD FUNCTION

Recall that
$$z(t) = \lim_{h \to 0} \frac{P(t < T \le t + h \mid T > t)}{h}.$$
Thus
$$z(t)\,h \approx P(t < T \le t + h \mid T > t) = P(\text{fail in } (t, t + h) \mid \text{alive at } t)$$
Suppose a typical $T$ is large compared to the time unit. Then for $h = 1$:
$$z(t) \approx P(t < T \le t + 1 \mid T > t) = P(\text{fail in next time unit} \mid \text{alive at } t)$$
Thus: Suppose we have $n$ units of age $t$. How many can we expect to fail in the next time unit?
$$e = n \cdot z(t)$$
In practice: Ask an expert: “If you have 100 components (of a specific type) of age 1000 hours, how many do you expect to fail in the next hour?” The answer is, say, “2”. Looking at $e = n \cdot z(t)$ we estimate
$$\hat{z}(1000) = \frac{2}{100} = 0.02$$
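How good is the approximation $z(t) \approx P(t < T \le t + 1 \mid T > t)$? The sketch below compares the two under the assumption (made here only for illustration, not stated on the slide) that the lifetime is exponential with constant hazard 0.02 per hour, matching the expert's answer.

```python
import math

lam = 0.02  # assumed constant hazard per hour (illustrative)

def hazard(t):
    """Hazard z(t) of the assumed exponential lifetime."""
    return lam

def prob_fail_next(t, h=1.0):
    """Exact P(t < T <= t + h | T > t) = 1 - R(t + h) / R(t)."""
    R = lambda s: math.exp(-lam * s)
    return 1.0 - R(t + h) / R(t)

n = 100
t = 1000.0
expected_approx = n * hazard(t)         # e = n * z(t)
expected_exact = n * prob_fail_next(t)  # n times the exact conditional probability

print(expected_approx, expected_exact)
```

The two numbers agree closely because the hazard is small relative to the time unit, which is the condition "a typical $T$ is large compared to the time unit".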

SLIDE 26

MORTALITY TABLE - DEATH HAZARD BY AGE

SLIDE 27

MORTALITY TABLE - DEATH HAZARD BY AGE

SLIDE 28

EXERCISES

Let $T$ be the lifetime of a Norwegian measured in years. Let $z_M(t)$ be the hazard function for a male person as a function of the age $t$, while $z_F(t)$ is the corresponding function for a female. Look at the mortality tables in the previous two slides and estimate $z_M(21)$ and $z_F(21)$. Compare them and comment. Do the same at age 72 years.

SLIDE 29

MEAN TIME TO FAILURE (MTTF); EXPECTED LIFETIME

For a lifetime $T$ we define
$$\mathrm{MTTF} = E(T) = \int_0^\infty t f(t)\,dt = \int_0^\infty R(t)\,dt$$
(The last equality is proven by partial integration, noting that $R'(t) = -f(t)$. Do it!)
$$\mathrm{Var}(T) = \int_0^\infty (t - E(T))^2 f(t)\,dt = \int_0^\infty t^2 f(t)\,dt - (E(T))^2 = E(T^2) - (E(T))^2$$
$$\mathrm{SD}(T) = (\mathrm{Var}(T))^{1/2}$$

SLIDE 30

EXAMPLE: EXPONENTIAL DISTRIBUTION

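Let $T$ be exponentially distributed with density $f(t) = \lambda e^{-\lambda t}$. Then you may check the following computations:
$$E(T) = \int_0^\infty t \lambda e^{-\lambda t}\,dt = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}$$
$$\mathrm{Var}(T) = E(T^2) - (E(T))^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$
$$\mathrm{SD}(T) = \frac{1}{\lambda}$$
Thus: For a component with exponentially distributed lifetime, MTTF = 1/(failure rate).

NOTE: We will mainly use the parameterization $f(t) = \frac{1}{\theta} e^{-t/\theta}$, so that
$$R(t) = e^{-t/\theta}, \qquad E(T) = \theta, \qquad \mathrm{SD}(T) = \theta$$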
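The identities $E(T) = \int_0^\infty t f(t)\,dt = \int_0^\infty R(t)\,dt$ and $\mathrm{SD}(T) = \theta$ can be checked by numerical integration. This is a rough sketch under stated assumptions: the value `theta = 72.0`, the truncation of the integral at $50\theta$, and the helper `integrate` are all choices made here for illustration.

```python
import math

def integrate(g, a, b, steps=200_000):
    """Trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * h)
    return total * h

theta = 72.0  # assumed mean lifetime (illustrative)
R = lambda t: math.exp(-t / theta)          # reliability function
f = lambda t: math.exp(-t / theta) / theta  # density

upper = 50 * theta  # truncation point; R(upper) = exp(-50) is negligible
mttf_from_R = integrate(R, 0.0, upper)
mttf_from_f = integrate(lambda t: t * f(t), 0.0, upper)
second_moment = integrate(lambda t: t * t * f(t), 0.0, upper)
sd = math.sqrt(second_moment - mttf_from_f ** 2)

print(mttf_from_R, mttf_from_f, sd)
```

All three printed values should be close to `theta`, in line with $E(T) = \theta$ and $\mathrm{SD}(T) = \theta$.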

SLIDE 31

WEIBULL DISTRIBUTION

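The lifetime $T$ is Weibull-distributed with shape parameter $\alpha > 0$ and scale parameter $\theta > 0$, written $T \sim \text{Weib}(\alpha, \theta)$, if
$$R(t) = e^{-(t/\theta)^{\alpha}}$$
From this we can derive:
$$Z(t) = \left(\frac{t}{\theta}\right)^{\alpha}$$
$$z(t) = \frac{\alpha}{\theta}\left(\frac{t}{\theta}\right)^{\alpha-1}$$
$$f(t) = z(t)\,e^{-Z(t)} = \frac{\alpha}{\theta}\left(\frac{t}{\theta}\right)^{\alpha-1} e^{-(t/\theta)^{\alpha}}$$
  • $\alpha = 1$ corresponds to the exponential distribution;
  • $\alpha < 1$ gives a decreasing failure rate (DFR);
  • $\alpha > 1$ gives an increasing failure rate (IFR).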

SLIDE 32

WEIBULL DISTRIBUTION (CONT.)

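$$E(T) = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-(t/\theta)^{\alpha}}\,dt = \cdots = \theta \cdot \Gamma\!\left(\frac{1}{\alpha} + 1\right)$$
where $\Gamma(\cdot)$ is the gamma function defined by $\Gamma(a) = \int_0^\infty u^{a-1} e^{-u}\,du$.
$$\mathrm{Var}(T) = \theta^2 \left[\Gamma\!\left(\frac{2}{\alpha} + 1\right) - \Gamma^2\!\left(\frac{1}{\alpha} + 1\right)\right]$$
$$\mathrm{SD}(T) = \theta \left[\Gamma\!\left(\frac{2}{\alpha} + 1\right) - \Gamma^2\!\left(\frac{1}{\alpha} + 1\right)\right]^{1/2}$$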
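The Weibull formulas translate directly into code. The sketch below uses only `math.gamma` from the Python standard library; the function names and the parameter values in the usage lines are illustrative assumptions, not taken from the slides.

```python
import math

def weibull_R(t, alpha, theta):
    """Reliability function R(t) = exp(-(t/theta)^alpha)."""
    return math.exp(-((t / theta) ** alpha))

def weibull_hazard(t, alpha, theta):
    """Hazard z(t) = (alpha/theta) * (t/theta)^(alpha - 1)."""
    return (alpha / theta) * (t / theta) ** (alpha - 1)

def weibull_pdf(t, alpha, theta):
    """Density f(t) = z(t) * exp(-Z(t))."""
    return weibull_hazard(t, alpha, theta) * weibull_R(t, alpha, theta)

def weibull_mean(alpha, theta):
    """E(T) = theta * Gamma(1/alpha + 1)."""
    return theta * math.gamma(1.0 / alpha + 1.0)

def weibull_sd(alpha, theta):
    """SD(T) = theta * sqrt(Gamma(2/alpha + 1) - Gamma(1/alpha + 1)^2)."""
    g1 = math.gamma(1.0 / alpha + 1.0)
    g2 = math.gamma(2.0 / alpha + 1.0)
    return theta * math.sqrt(g2 - g1 ** 2)

# alpha = 1 reduces to the exponential distribution: mean and SD both equal theta.
print(weibull_mean(1.0, 5.0), weibull_sd(1.0, 5.0))

# alpha > 1 gives an increasing hazard, alpha < 1 a decreasing one.
print(weibull_hazard(1.0, 2.0, 10.0), weibull_hazard(2.0, 2.0, 10.0))
print(weibull_hazard(1.0, 0.5, 10.0), weibull_hazard(2.0, 0.5, 10.0))
```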

SLIDE 33

WEIBULL DISTRIBUTION (CONT.)

SLIDE 34

NORMAL DISTRIBUTION

Standard normal distribution, $Z \sim N(0, 1)$:
$$f_Z(z) = \phi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$$
$$F_Z(z) = \Phi(z) = \int_{-\infty}^{z} \phi(w)\,dw$$
Now let $Y \sim N(\mu, \sigma)$. Then it is well known that
$$F_Y(y) = P(Y \le y) = \Phi\!\left(\frac{y - \mu}{\sigma}\right)$$
$$M_Y(t) = E(e^{tY}) = e^{\mu t + \frac{1}{2}\sigma^2 t^2} \quad \text{(moment generating function)}$$
Further, if we let $Z = \frac{Y - \mu}{\sigma} \sim N(0, 1)$, then $Y = \mu + \sigma Z$.

Thus: The model $Y \sim N(\mu, \sigma)$ is a location–scale family, defined by the standardized random variable $Z$, with location parameter $\mu$ and scale parameter $\sigma$.

SLIDE 35

EXERCISE

1. Consider $Y = \mu + \sigma Z$ where $Z \sim N(0, 1)$. What is the distribution of $Y$? Why are the names location parameter and scale parameter appropriate for, respectively, $\mu$ and $\sigma$?
2. The normal distribution is sometimes used as a lifetime distribution (in fact it is a possible choice in MINITAB). What is a possible problem with this distribution?

SLIDE 36

LOGNORMAL DISTRIBUTION

The lifetime $T$ has a lognormal distribution with parameters $\mu$ and $\sigma$ if $Y \equiv \ln T$ is normally distributed, $Y \sim N(\mu, \sigma)$. We can hence write
$$Y = \ln T = \mu + \sigma Z \qquad (*)$$
where $Z \sim N(0, 1)$. Here $\mu$ is called the location parameter and $\sigma$ is called the scale parameter of the lognormal distribution.

Because of (*) we say that the lognormal distribution is a log-location-scale family of distributions, meaning that the log of $T$ defines a location-scale family.

SLIDE 37

LOGNORMAL DISTRIBUTION (CONT.)

SLIDE 38

EXERCISE

$m = \text{median}(T)$ is defined by $F(m) = R(m) = 1/2$. Compute the median $m$ when $T$ is

1. Exponentially distributed with parameter $\theta$, i.e. $T \sim \text{Expon}(\theta)$
2. $T \sim \text{Weib}(\alpha, \theta)$
3. $T \sim \text{lognormal}(\mu, \sigma)$

SLIDE 39

FUNCTIONS FOR THE LOGNORMAL DISTRIBUTION

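Recall: $T \sim \text{lognormal}(\mu, \sigma) \iff \ln T \sim N(\mu, \sigma)$. Thus
$$R(t) = P(T > t) = P(\ln T > \ln t) = 1 - \Phi\!\left(\frac{\ln t - \mu}{\sigma}\right)$$
and
$$f(t) = -R'(t) = \phi\!\left(\frac{\ln t - \mu}{\sigma}\right) \cdot \frac{1}{t\sigma} = \frac{1}{\sqrt{2\pi}} \cdot \frac{1}{\sigma} \cdot \frac{1}{t}\,e^{-\frac{(\ln t - \mu)^2}{2\sigma^2}} \quad \text{for } t > 0$$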

SLIDE 40

HAZARD FUNCTION OF THE LOGNORMAL DISTRIBUTION

$$z(t) = \frac{f(t)}{R(t)} = \frac{\phi\!\left(\frac{\ln t - \mu}{\sigma}\right) / (t\sigma)}{1 - \Phi\!\left(\frac{\ln t - \mu}{\sigma}\right)}$$

SLIDE 41

MORE RESULTS FOR THE LOGNORMAL DISTRIBUTION

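Let $Y = \ln T$. Then $Y \sim N(\mu, \sigma)$. Recall:
$$M_Y(t) = E(e^{tY}) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}$$
Thus:
$$E(T) = E(e^{Y}) = M_Y(1) = e^{\mu + \frac{1}{2}\sigma^2}$$
$$E(T^2) = E(e^{2Y}) = M_Y(2) = e^{2\mu + 2\sigma^2}$$
$$\mathrm{Var}(T) = e^{2\mu + 2\sigma^2} - e^{2\mu + \sigma^2} = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$$
On the other hand, $\text{median}(T) = e^{\mu}$ since
$$P(T \le e^{\mu}) = P(\ln T \le \mu) = P(Y \le \mu) = 1/2.$$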
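The lognormal functions and moments can be coded with the standard library alone, using `math.erf` to express $\Phi$. This is a sketch for checking the formulas; the function names and the values `mu = 4.0`, `sigma = 0.5` are assumptions chosen for illustration.

```python
import math

def Phi(x):
    """Standard normal cdf, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lognormal_R(t, mu, sigma):
    """R(t) = 1 - Phi((ln t - mu) / sigma)."""
    return 1.0 - Phi((math.log(t) - mu) / sigma)

def lognormal_pdf(t, mu, sigma):
    """f(t) = phi((ln t - mu) / sigma) / (t * sigma)."""
    return phi((math.log(t) - mu) / sigma) / (t * sigma)

def lognormal_hazard(t, mu, sigma):
    """z(t) = f(t) / R(t)."""
    return lognormal_pdf(t, mu, sigma) / lognormal_R(t, mu, sigma)

def lognormal_mean(mu, sigma):
    return math.exp(mu + 0.5 * sigma ** 2)

def lognormal_var(mu, sigma):
    return math.exp(2.0 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1.0)

def lognormal_median(mu, sigma):
    return math.exp(mu)

mu, sigma = 4.0, 0.5  # assumed parameters (illustrative)
m = lognormal_median(mu, sigma)
print(lognormal_R(m, mu, sigma))  # half of the probability mass lies above the median
print(lognormal_mean(mu, sigma), math.sqrt(lognormal_var(mu, sigma)))
```

Note that the mean exceeds the median whenever $\sigma > 0$, reflecting the right skew of the lognormal distribution.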

SLIDE 42

RECALL BALL BEARING FAILURE DATA

   17.88   28.92   33.00   41.52   42.12   45.60   48.40   51.84
   51.96   54.12   55.56   67.80   68.64   68.64   68.88   84.12
   93.12   98.64  105.12  105.84  127.92  128.04  173.40

Question: How can we fit a parametric lifetime model to these data?

SLIDE 43

BB-DATA: EXPONENTIAL DISTRIBUTION (MINITAB)

SLIDE 44

BB-DATA: WEIBULL DISTRIBUTION (MINITAB)

SLIDE 45

BB-DATA: LOGNORMAL DISTRIBUTION (MINITAB)

SLIDE 46

BB-DATA: HISTOGRAM OF LOG-LIFETIMES

SLIDE 47

BB-DATA: EMPIRICAL DISTRIBUTION COMPARED TO PARAMETRIC FITS

SLIDE 48

ESTIMATION RESULTS FOR BALL BEARING DATA

Model   MTTF     STD(T)   med(T)   $\hat\alpha$   $\hat\theta$   $\hat\mu$   $\hat\sigma$
Exp     72.221   72.221   50.060                  72.221
Weib    72.515   36.250   68.773   2.102          81.875
Logn    72.710   40.664   63.458                                 4.150       0.522
Norm    72.221   36.667   72.221                                 72.221      36.667

Method: Maximum likelihood.
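The slides obtain these estimates with MINITAB. As a rough cross-check, the sketch below computes maximum likelihood estimates for the complete (uncensored) ball bearing sample in plain Python. It is not the MINITAB procedure: the closed-form estimates for the exponential, normal and lognormal models are standard, while the Weibull shape is found here by bisection on the usual likelihood equation, and the bracketing interval `[0.1, 20]` is an assumption.

```python
import math

# Ball bearing data from the slides (millions of revolutions to failure).
data = [17.88, 28.92, 33.00, 41.52, 42.12, 45.60, 48.40, 51.84,
        51.96, 54.12, 55.56, 67.80, 68.64, 68.64, 68.88, 84.12,
        93.12, 98.64, 105.12, 105.84, 127.92, 128.04, 173.40]
n = len(data)
logs = [math.log(t) for t in data]

# Exponential: theta_hat is the sample mean; the median is theta * ln 2.
theta_exp = sum(data) / n
median_exp = theta_exp * math.log(2.0)

# Normal: the ML estimate of sigma uses divisor n (not n - 1).
mu_norm = theta_exp
sigma_norm = math.sqrt(sum((t - mu_norm) ** 2 for t in data) / n)

# Lognormal: the same formulas applied to the log-lifetimes.
mu_logn = sum(logs) / n
sigma_logn = math.sqrt(sum((y - mu_logn) ** 2 for y in logs) / n)

def weibull_score(alpha):
    """Likelihood equation for the Weibull shape; its root is alpha_hat."""
    s0 = sum(t ** alpha for t in data)
    s1 = sum(t ** alpha * math.log(t) for t in data)
    return s1 / s0 - 1.0 / alpha - mu_logn

# The score is increasing in alpha, so bisection finds the unique root.
lo, hi = 0.1, 20.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if weibull_score(mid) > 0:
        hi = mid
    else:
        lo = mid
alpha_weib = 0.5 * (lo + hi)
theta_weib = (sum(t ** alpha_weib for t in data) / n) ** (1.0 / alpha_weib)

print("Exp :", theta_exp, median_exp)
print("Norm:", mu_norm, sigma_norm)
print("Logn:", mu_logn, sigma_logn)
print("Weib:", alpha_weib, theta_weib)
```

The printed estimates can be compared with the parameter columns of the table. MTTF, STD(T) and med(T) then follow by plugging the estimates into the formulas for each distribution.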

SLIDE 49

MOTIVATION FOR EXPONENTIAL DISTRIBUTION

  • Simplest distribution used in the analysis of reliability data.
  • Has the important characteristic that its hazard function is constant (does not depend on time $t$).
  • Popular distribution for some kinds of electronic components (e.g., capacitors or robust, high-quality integrated circuits).
  • Might be useful to describe failure times for components that exhibit physical wearout only after expected technological life of the system, in which the component would be replaced.

SLIDE 50

MOTIVATION FOR WEIBULL DISTRIBUTION

  • The theory of extreme values shows that the Weibull distribution can be used to model the minimum of a large number of independent positive random variables from a certain class of distributions.
      – Failure of the weakest link in a chain with many links with failure mechanisms (e.g. fatigue) in each link acting approximately independently.
      – Failure of a system with a large number of components in series and with approximately independent failure mechanisms in each component.
  • The more common justification for its use is empirical: the Weibull distribution can be used to model failure-time data with a decreasing or an increasing hazard function.

SLIDE 51

MOTIVATION FOR LOGNORMAL DISTRIBUTION

  • The lognormal distribution is a common model for failure times.
  • It can be justified for a random variable that arises from the product of a number of identically distributed independent positive random quantities (recall the central limit theorem, applied to the sum of their logarithms).
  • It has been suggested as an appropriate model for failure time caused by a degradation process with combinations of random rates that combine multiplicatively.
  • Widely used to describe time to fracture from fatigue crack growth in metals.

SLIDE 52

EXTREME VALUE DISTRIBUTIONS

Let $T_1, T_2, \ldots, T_n$ be lifetimes of $n$ components, with ordered values denoted by $T_{(1)} < T_{(2)} < \cdots < T_{(n)}$. Thus $T_{(1)}$ is the minimum and corresponds to the lifetime of a series system.

For large $n$, $T_{(1)}$ is approximately Weibull-distributed. This motivates the widespread use of the Weibull distribution!

If the $T_i$ are no longer lifetimes, but have support in $(-\infty, \infty)$, then the limiting distribution of a properly normalized version of $T_{(1)}$ equals the distribution of a random variable $Y$ with cdf
$$F_Y(y) = 1 - e^{-e^{(y-\mu)/\sigma}}, \qquad -\infty < y < \infty$$
This is the so-called “distribution of smallest extreme”, or “extreme value distribution of type I”, or (as we will call it) the Gumbel distribution. We write $Y \sim \text{Gumbel}(\mu, \sigma)$.

SLIDE 53

EXERCISE

Show that
$$F_Y(y) = 1 - e^{-e^{(y-\mu)/\sigma}}, \qquad -\infty < y < \infty$$
satisfies the requirements for a cdf, i.e.
  • Increasing in $y$
  • $\lim_{y \to -\infty} F_Y(y) = 0$
  • $\lim_{y \to \infty} F_Y(y) = 1$

SLIDE 54

WHY ARE WE INTERESTED IN THE GUMBEL DISTRIBUTION?

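If $T$ is Weibull-distributed, $T \sim \text{Weib}(\alpha, \theta)$, then $Y = \ln T$ is Gumbel-distributed, $Y \sim \text{Gumbel}(\mu, \sigma)$, with $\mu = \ln\theta$, $\sigma = 1/\alpha$.

Proof: Note first that $T = e^{Y}$ and $R(t) = P(T > t) = e^{-(t/\theta)^{\alpha}}$. Then:
$$P(Y > y) = P(e^{Y} > e^{y}) = P(T > e^{y}) = R(e^{y}) = e^{-\left(\frac{e^{y}}{\theta}\right)^{\alpha}} = e^{-\left(\frac{e^{y}}{e^{\ln\theta}}\right)^{\alpha}} = e^{-\left(e^{y - \ln\theta}\right)^{\alpha}} = e^{-e^{\frac{y - \ln\theta}{1/\alpha}}}$$
Thus,
$$F_Y(y) = 1 - P(Y > y) = 1 - e^{-e^{\frac{y - \ln\theta}{1/\alpha}}},$$
which shows that $Y \sim \text{Gumbel}(\ln\theta, 1/\alpha)$.

We shall see later why this is a useful and interesting result (and not just a curiosity...)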
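The result can also be checked numerically: for any $y$, $P(\ln T \le y) = F_T(e^{y})$ should equal the Gumbel cdf with $\mu = \ln\theta$ and $\sigma = 1/\alpha$. The sketch below does this at a few points; the parameter values and the grid `ys` are arbitrary choices made for illustration.

```python
import math

def weibull_cdf(t, alpha, theta):
    """F(t) = 1 - exp(-(t/theta)^alpha)."""
    return 1.0 - math.exp(-((t / theta) ** alpha))

def gumbel_cdf(y, mu, sigma):
    """Smallest-extreme-value (Gumbel) cdf: 1 - exp(-exp((y - mu) / sigma))."""
    return 1.0 - math.exp(-math.exp((y - mu) / sigma))

alpha, theta = 2.0, 80.0                  # assumed Weibull parameters (illustrative)
mu, sigma = math.log(theta), 1.0 / alpha  # Gumbel parameters implied by the result

ys = [2.0, 3.0, 4.0, 4.5, 5.0]
max_diff = max(
    abs(weibull_cdf(math.exp(y), alpha, theta) - gumbel_cdf(y, mu, sigma))
    for y in ys
)
print(max_diff)  # differences are at the level of floating-point rounding
```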

SLIDE 55

THE GUMBEL DISTRIBUTION

Let $Y \sim \text{Gumbel}(\mu, \sigma)$ and recall the cdf
$$F_Y(y) = P(Y \le y) = 1 - e^{-e^{(y-\mu)/\sigma}} \quad \text{for } -\infty < y < \infty$$
The cdf of Gumbel(0,1), called the standard Gumbel distribution, is
$$G(w) = 1 - e^{-e^{w}} \quad \text{for } -\infty < w < \infty$$
It is seen that
$$F_Y(y) = P(Y \le y) = G\!\left(\frac{y - \mu}{\sigma}\right)$$
This defines the cdf of the Gumbel($\mu$, $\sigma$) in terms of the cdf of the standard Gumbel, in the same way as the cdf of $Y \sim N(\mu, \sigma)$ can be expressed by the cdf, $\Phi(\cdot)$, of the standard normal. How?

SLIDE 56

THE GUMBEL DISTRIBUTION (CONT.)

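Let again $Y \sim \text{Gumbel}(\mu, \sigma)$, and define
$$W = \frac{Y - \mu}{\sigma} \qquad (*)$$
Then $W$ has the standard Gumbel distribution. This is seen as follows:
$$P(W \le w) = P\!\left(\frac{Y - \mu}{\sigma} \le w\right) = P(Y \le \mu + \sigma w) = F_Y(\mu + \sigma w) = 1 - e^{-e^{w}} \equiv G(w)$$
By solving (*) for $Y$ it follows that we have the representation for $Y \sim \text{Gumbel}(\mu, \sigma)$:
$$Y = \mu + \sigma W \quad \text{where } W \sim \text{Gumbel}(0, 1).$$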

SLIDE 57

MORE ON THE STANDARD GUMBEL DISTRIBUTION

Recall once more that if $W \sim \text{Gumbel}(0, 1)$, then $W$ has the cdf
$$G(w) = 1 - e^{-e^{w}}$$
The pdf of $W$ is hence
$$g(w) = G'(w) = -e^{-e^{w}}(-e^{w}) = e^{w} e^{-e^{w}}$$
We also have
$$E(W) = \int_{-\infty}^{\infty} w\,e^{w} e^{-e^{w}}\,dw = -\gamma,$$
where $\gamma \approx 0.5772$ is Euler’s constant.
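The value of $E(W)$ can be confirmed by numerical integration. The sketch below integrates the standard Gumbel pdf over $[-40, 6]$, outside which the integrand is negligible; the integration limits, the step count and the helper `integrate` are assumptions made here.

```python
import math

def g(w):
    """Standard Gumbel (smallest extreme value) pdf, e^w * exp(-e^w)."""
    return math.exp(w - math.exp(w))

def integrate(func, a, b, steps=100_000):
    """Trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

total_mass = integrate(g, -40.0, 6.0)              # should be close to 1
mean_W = integrate(lambda w: w * g(w), -40.0, 6.0)  # should be close to -gamma

print(total_mass, mean_W)
```

The mean is negative because the smallest-extreme-value density has its long tail to the left.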

SLIDE 58

STANDARD GUMBEL AND NORMAL DISTRIBUTIONS

SLIDE 59

GENERAL LOG-LOCATION-SCALE FAMILIES

We have seen:
  • $T \sim \text{lognorm}(\mu, \sigma) \iff Y = \ln T \sim N(\mu, \sigma)$
  • $T \sim \text{Weib}(\alpha, \theta) \iff Y = \ln T \sim \text{Gumbel}(\mu, \sigma)$, with $\mu = \ln\theta$, $\sigma = 1/\alpha$.

Both distributions thus define log-location-scale families, which are characterized by the fact that $Y = \ln T$ has a cdf which can be expressed as
$$F_Y(y) = P(Y \le y) = \Psi\!\left(\frac{y - \mu}{\sigma}\right)$$
where $\Psi(\cdot)$ is the cdf of some “standardized distribution” on $(-\infty, \infty)$.

Equivalently, log-location-scale families are characterized by representations
$$\ln T = \mu + \sigma U$$
where $U$ has cdf $\Psi(\cdot)$ as described above.

SLIDE 60

GENERAL LOG-LOCATION-SCALE FAMILIES (CONT.)

Generally, µ ∈ (−∞, +∞) is called the location parameter, and σ > 0 is called the scale parameter.

SLIDE 61

THE LOGISTIC AND LOG-LOGISTIC DISTRIBUTIONS

A random variable $Y$ has the logistic distribution with location parameter $\mu$ and scale parameter $\sigma$, written $Y \sim \text{logistic}(\mu, \sigma)$, if
$$F_Y(y) = P(Y \le y) = H\!\left(\frac{y - \mu}{\sigma}\right) \quad \text{for } -\infty < y < \infty$$
where
$$H(v) = P(V \le v) = \frac{e^{v}}{1 + e^{v}} \quad \text{for } -\infty < v < \infty$$
is the cdf of the standard logistic distribution, logistic(0,1).

A lifetime $T$ has the log-logistic distribution with location parameter $\mu$ and scale parameter $\sigma$ if $Y = \ln T \sim \text{logistic}(\mu, \sigma)$. In this case we have the representation
$$\ln T = \mu + \sigma V \quad \text{where } V \sim \text{logistic}(0, 1).$$

SLIDE 62

THE STANDARD LOGISTIC DISTRIBUTION

Recall that if $V \sim \text{logistic}(0, 1)$, then the cdf of $V$ is
$$H(v) = P(V \le v) = \frac{e^{v}}{1 + e^{v}} \quad \text{for } -\infty < v < \infty.$$
Hence the pdf of $V$ is
$$h(v) = H'(v) = \frac{e^{v}}{(1 + e^{v})^2}$$
(do the differentiation!)

Like the standard normal, this density is symmetric around the y-axis (which is not the case for the standard Gumbel). Check this by showing that $h(-v) = h(v)$ for all $v$.
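A numerical sanity check does not replace the two derivations asked for above, but it can catch algebra slips. The sketch below compares a central finite difference of $H$ with $h$, and evaluates $h(v) - h(-v)$, at a few points; the step size `eps` and the grid `vs` are arbitrary assumptions.

```python
import math

def H(v):
    """Standard logistic cdf, e^v / (1 + e^v)."""
    ev = math.exp(v)
    return ev / (1.0 + ev)

def h(v):
    """Standard logistic pdf, e^v / (1 + e^v)^2."""
    ev = math.exp(v)
    return ev / (1.0 + ev) ** 2

vs = [-3.0, -1.0, 0.0, 0.5, 2.0]
eps = 1e-6  # step for the central finite difference

# Largest gap between the numerical derivative of H and the claimed pdf h.
max_deriv_err = max(abs((H(v + eps) - H(v - eps)) / (2.0 * eps) - h(v)) for v in vs)

# Largest violation of the symmetry h(-v) = h(v).
max_sym_err = max(abs(h(v) - h(-v)) for v in vs)

print(max_deriv_err, max_sym_err)
```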

SLIDE 63

STANDARD LOGISTIC AND STANDARD NORMAL

SLIDE 64

FUNCTIONS FOR LOG-LOCATION-SCALE FAMILIES

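By assumption, $Y = \ln T$ has a cdf which can be expressed as
$$F_Y(y) = P(Y \le y) = \Psi\!\left(\frac{y - \mu}{\sigma}\right) \quad \text{for } -\infty < y < \infty$$
where $\Psi(\cdot)$ is the cdf of a standard distribution. Let further $\psi(u) = \Psi'(u)$. Then
$$R(t) = P(T > t) = P(\ln T > \ln t) = 1 - \Psi\!\left(\frac{\ln t - \mu}{\sigma}\right)$$
$$f(t) = -R'(t) = \psi\!\left(\frac{\ln t - \mu}{\sigma}\right) \cdot \frac{1}{t\sigma}$$
$$z(t) = \frac{f(t)}{R(t)} = \frac{\psi\!\left(\frac{\ln t - \mu}{\sigma}\right) / (t\sigma)}{1 - \Psi\!\left(\frac{\ln t - \mu}{\sigma}\right)}$$
(as already obtained for the lognormal distribution).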
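Because the three formulas depend on the model only through $\Psi$ and $\psi$, one generic routine covers the lognormal, Weibull and log-logistic cases. The following is a minimal sketch; the function `make_lifetime_functions`, the helper names and the parameter values are hypothetical and chosen here for illustration.

```python
import math

# Standardized cdf/pdf pairs (Psi, psi) for the three families in these slides.
def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def norm_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def gumbel_cdf(u):
    return 1.0 - math.exp(-math.exp(u))

def gumbel_pdf(u):
    return math.exp(u - math.exp(u))

def logistic_cdf(u):
    return math.exp(u) / (1.0 + math.exp(u))

def logistic_pdf(u):
    return math.exp(u) / (1.0 + math.exp(u)) ** 2

def make_lifetime_functions(Psi, psi, mu, sigma):
    """Return (R, f, z) for T with ln T = mu + sigma * U, U having cdf Psi and pdf psi."""
    def R(t):
        return 1.0 - Psi((math.log(t) - mu) / sigma)
    def f(t):
        return psi((math.log(t) - mu) / sigma) / (t * sigma)
    def z(t):
        return f(t) / R(t)
    return R, f, z

# Weibull via the Gumbel representation, with mu = ln(theta) and sigma = 1/alpha.
alpha, theta = 2.0, 80.0  # assumed parameters (illustrative)
R_w, f_w, z_w = make_lifetime_functions(gumbel_cdf, gumbel_pdf, math.log(theta), 1.0 / alpha)
t = 50.0
weibull_hazard = (alpha / theta) * (t / theta) ** (alpha - 1)
print(z_w(t), weibull_hazard)  # the two hazards should agree

# Lognormal and log-logistic: the median is exp(mu) in both cases, since Psi(0) = 1/2.
R_ln, f_ln, z_ln = make_lifetime_functions(norm_cdf, norm_pdf, 4.0, 0.5)
R_ll, f_ll, z_ll = make_lifetime_functions(logistic_cdf, logistic_pdf, 4.0, 0.5)
print(R_ln(math.exp(4.0)), R_ll(math.exp(4.0)))
```

Passing $\Psi$ and $\psi$ as arguments keeps the lifetime formulas in one place, so adding another log-location-scale family only requires supplying its standardized cdf and pdf.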
