Parametric bootstrap, August 30, 2017



SLIDE 1

Resampling from the data or from distribution Simple Example Spline Example

Parametric bootstrap

August 30, 2017

SLIDE 2

Bootstrap + Monte Carlo = Parametric Bootstrap

SLIDE 3

There is no more in the data than the data – one view

  • The bootstrap is a general tool for assessing statistical accuracy by ‘creating’ data from the data.
  • It is based on sampling randomly (with replacement) from the data to study how a quantity of interest behaves under this resampling.
  • It is used to assess the variability of a given characteristic, such as an estimate.

SLIDE 4

There is a model behind the data – another view

  • Study a mathematical model theoretically.
  • Fit the model statistically using the data.
  • Use the theory to assess variability or other properties.

What if the model is difficult to study?

SLIDE 5

Combine model with sampling

  • Fit the theoretical model statistically using the data.
  • Take Monte Carlo samples from the fitted model to investigate variability or other properties.
  • We use the model to get new samples, as opposed to the non-parametric bootstrap, where the samples are drawn from the data directly.
  • Since the model is fitted from the data, the data are still used, but indirectly.
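The recipe above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the exponential model, the simulated data `x`, and the names `rate_hat`, `rate_star` and `se_rate` are hypothetical choices and do not come from the slides.

```r
set.seed(1)

# Hypothetical data (not from the slides): 40 exponential waiting times
x <- rexp(40, rate = 2)
n <- length(x)

# Step 1: fit the model; the MLE of the exponential rate is 1 / mean(x)
rate_hat <- 1 / mean(x)

# Step 2: Monte Carlo samples from the fitted model, re-estimating each time
B <- 2000
rate_star <- numeric(B)
for (b in 1:B) {
  x_star <- rexp(n, rate = rate_hat)
  rate_star[b] <- 1 / mean(x_star)
}

# Step 3: variability of the estimator under the fitted model
se_rate <- sd(rate_star)
se_rate
```

The data enter only through `rate_hat`; every new sample comes from the fitted distribution rather than from `x` itself.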

SLIDE 6

Simple example – bootstrap and Monte Carlo

Bootstrap

    # Data
    x = scan("Table2_1.txt")
    n = length(x)
    mean(x)
    sd(x)

    # Bootstrapping variances
    B = 1000
    Bvar = vector('numeric', B)
    for (i in 1:B) {
      Bvar[i] = var(sample(x, n, replace = TRUE))  # resample with replacement
    }
    sd(Bvar)
    hist(Bvar, nclass = 10)

Monte Carlo

    # Monte Carlo study of variances
    N = 15000
    MCvar = vector('numeric', N)
    for (i in 1:N) {
      MCx = rnorm(n, 50, 0.1)  # n draws from a normal with mean 50, sd 0.1
      MCvar[i] = var(MCx)
    }
    mean(MCvar)
    sd(MCvar)
    X11()  # graphical window in Unix; use windows() in Windows, quartz() on Mac
    hist(MCvar, nclass = 10)
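For the normal model used in the Monte Carlo study, the simulated spread of the sample variance can be compared with a known result: since (n − 1)S²/σ² follows a chi-squared distribution with n − 1 degrees of freedom, Var(S²) = 2σ⁴/(n − 1). The sketch below is a rough check under one assumption: the sample size `n = 20` is hypothetical, because the slides take `n` from `Table2_1.txt`, which is not reproduced here.

```r
set.seed(1)

n <- 20        # hypothetical sample size (the slides use length(x))
sigma <- 0.1   # same standard deviation as in the Monte Carlo study
N <- 15000

MCvar <- numeric(N)
for (i in 1:N) {
  MCvar[i] <- var(rnorm(n, 50, sigma))
}

# Theory for normal data: Var(S^2) = 2 * sigma^4 / (n - 1)
sd_theory <- sigma^2 * sqrt(2 / (n - 1))

c(simulated = sd(MCvar), theory = sd_theory)
```

When the model is this simple, theory alone answers the question; simulation becomes the practical route once the model is harder to study analytically.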

SLIDE 7

Simple example – parametric bootstrap

Fitting the data by a normal model

    # Fit the model -- normal distribution
    # Data
    x = scan("Table2_1.txt")
    n = length(x)
    mu = mean(x)
    sigma = sd(x)

Parametric bootstrap

    # Simulate from the fit
    PB = 1000
    PBvar = vector('numeric', PB)
    PBsd = PBvar  # allocated in the slides but not used below
    for (i in 1:PB) {
      PBx = rnorm(n, mu, sigma)  # new sample from the fitted normal model
      PBvar[i] = var(PBx)
    }
    mean(PBvar)
    sd(PBvar)
    X11()
    hist(PBvar, nclass = 10)
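Because `Table2_1.txt` is not included here, the following self-contained variant may be easier to run. It puts the non-parametric and the parametric bootstrap of the variance side by side. The simulated `x` is a hypothetical stand-in for the real table, so the numbers it produces say nothing about the original data.

```r
set.seed(1)

# Hypothetical stand-in for the data in Table2_1.txt
x <- rnorm(25, mean = 50, sd = 0.1)
n <- length(x)
B <- 1000

# Non-parametric bootstrap: resample the observations themselves
Bvar <- replicate(B, var(sample(x, n, replace = TRUE)))

# Parametric bootstrap: simulate from the fitted normal model
mu <- mean(x)
sigma <- sd(x)
PBvar <- replicate(B, var(rnorm(n, mu, sigma)))

# Bootstrap standard errors of the sample variance under each scheme
c(nonparametric = sd(Bvar), parametric = sd(PBvar))
```

If the normal model is adequate for the data, the two standard errors should be of similar size; a large gap would be a hint that the model and the data disagree.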

SLIDE 8

Fitting cubic splines - B-spline basis

We fit the data (right panel of the slide's figure) with the cubic B-splines $h_j(x)$, $j = 1, \dots, 7$ (left panel). Review question: Why are there seven B-splines?
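One way to explore the review question is to build such a basis with `bs()` from R's `splines` package. The sketch assumes three interior knots, which is one configuration that gives seven cubic B-splines (order 4 plus 3 interior knots); the slides do not state the knots, and the grid `x` is hypothetical.

```r
library(splines)

x <- seq(0, 3, length.out = 50)   # hypothetical predictor values

# With degree 3 and intercept = TRUE, df = 7 leaves 7 - 4 = 3 interior knots,
# which bs() places at the quartiles of x
H <- bs(x, df = 7, degree = 3, intercept = TRUE)

dim(H)              # one row per x value, one column per basis function
attr(H, "knots")    # the interior knots chosen by bs()
range(rowSums(H))   # the basis functions sum to one at every x
```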

SLIDE 9

Fitting through linear regression

We look for a fit of the form

$$\mu(x) = \sum_{j=1}^{7} \beta_j h_j(x).$$

From the standard regression solution we get

$$\hat\beta = (H^T H)^{-1} H^T y,$$

so that the fit is

$$\hat\mu(x) = \sum_{j=1}^{7} \hat\beta_j h_j(x).$$
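A minimal sketch of this least-squares fit in R, taking H to be the matrix whose (i, j) entry is the j-th basis function evaluated at the i-th x value. The data are simulated and the knot placement is left to `bs()`; both are assumptions made for illustration, not details from the slides.

```r
library(splines)
set.seed(1)

# Hypothetical data (not from the slides)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)

# N x 7 basis matrix with entries h_j(x_i)
H <- bs(x, df = 7, intercept = TRUE)

# Normal equations: beta_hat = (H^T H)^{-1} H^T y
beta_hat <- solve(crossprod(H), crossprod(H, y))
mu_hat <- drop(H %*% beta_hat)   # fitted curve at the observed x values

# lm() without its own intercept should reproduce the same coefficients
fit <- lm(y ~ H - 1)
max(abs(drop(beta_hat) - coef(fit)))
```

`solve(crossprod(H), ...)` avoids forming the inverse explicitly; `lm()` uses a QR decomposition, which is numerically safer for ill-conditioned bases.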

SLIDE 10

Assessing uncertainty of the fit

  • We have obtained the fit, but we want to assess its uncertainty (variability).
  • The variability of a curve is not as straightforward a concept as the variability of a point estimate; there are many ways to define it.
  • The best approach is to observe how the curve varies across different fits of the model.
  • For this we need many samples of data.
  • The bootstrap can be suitable.
  • But how should we resample from the data?
  • One could resample directly from the data (both the y’s and the x’s). However, when the variability of the x’s is not of interest, it is better to sample from the residuals.
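The first option, resampling the (x, y) pairs directly, might look like the sketch below. The data are simulated, and the knots are held fixed at the quartiles of the original x values so that every refit uses the same basis; both choices are assumptions for illustration and are not taken from the slides.

```r
library(splines)
set.seed(1)

# Hypothetical data (not from the slides)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)

# Fix the basis once so that all bootstrap fits live in the same spline space
knots <- quantile(x, c(0.25, 0.5, 0.75))
bknots <- range(x)
basis <- function(u) {
  bs(u, knots = knots, Boundary.knots = bknots, intercept = TRUE)
}

grid <- seq(bknots[1], bknots[2], length.out = 100)
Hgrid <- basis(grid)

B <- 200
curves <- matrix(NA_real_, nrow = length(grid), ncol = B)
for (b in 1:B) {
  idx <- sample(N, N, replace = TRUE)             # resample (x, y) pairs
  beta_b <- lm.fit(basis(x[idx]), y[idx])$coefficients
  curves[, b] <- drop(Hgrid %*% beta_b)           # refitted curve on the grid
}
```

Each column of `curves` is one bootstrap curve; drawing them together, for example with `matplot(grid, curves, type = "l")`, shows how much the fit moves from sample to sample.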

SLIDE 11

Resampling from the residuals

  • Residuals: $\hat\varepsilon = y - \hat\mu(x)$.
  • Compute bootstrap samples $\varepsilon^*$ from the residuals $\hat\varepsilon$.
  • For each new sample $\varepsilon^*$, evaluate the bootstrap version of the output data, $y^* = \hat\mu(x) + \varepsilon^*$.
  • Fit new cubic splines to each bootstrap sample and plot them on the graph.
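The steps above can be sketched as follows. The simulated data, the helper `fit_ls`, and the number of replicates `B = 200` are hypothetical choices; the pointwise 95% band at the end is one possible summary and is not part of the slides.

```r
library(splines)
set.seed(1)

# Hypothetical data (not from the slides)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)

H <- bs(x, df = 7, intercept = TRUE)

# Hypothetical helper: least-squares coefficients for basis H and response y
fit_ls <- function(H, y) solve(crossprod(H), crossprod(H, y))

mu_hat <- drop(H %*% fit_ls(H, y))
res <- y - mu_hat                         # residuals

B <- 200
curves <- matrix(NA_real_, nrow = N, ncol = B)
for (b in 1:B) {
  eps_star <- sample(res, N, replace = TRUE)   # resample the residuals
  y_star <- mu_hat + eps_star                  # bootstrap responses
  curves[, b] <- drop(H %*% fit_ls(H, y_star)) # refit the splines
}

# Pointwise 2.5% and 97.5% quantiles of the bootstrap curves
band <- apply(curves, 1, quantile, probs = c(0.025, 0.975))
```

The x values and the basis matrix `H` stay fixed throughout, which is what distinguishes this scheme from resampling the pairs.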

SLIDE 12

Parametric bootstrap

  • One can assume a normal model for the errors, with mean zero and variance

    $$\hat\sigma^2 = \sum_{i=1}^{N} (y_i - \hat\mu(x_i))^2 / N.$$

  • Compute parametric bootstrap samples $\varepsilon^*$ by sampling from $N(0, \hat\sigma^2)$.
  • For each new sample $\varepsilon^*$, evaluate the bootstrap version of the output data, $y^* = \hat\mu(x) + \varepsilon^*$.
  • Fit new cubic splines to each bootstrap sample and plot them on the graph.
  • The result will be similar to that seen on the previous graph (but not identical).
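A sketch of the parametric version, under the same assumptions as before: simulated data, a hypothetical helper `fit_ls`, and `B = 200` replicates, none of which come from the slides. The only change from residual resampling is that the errors are drawn from the fitted normal distribution.

```r
library(splines)
set.seed(1)

# Hypothetical data (not from the slides)
N <- 50
x <- sort(runif(N, 0, 3))
y <- sin(2 * x) + rnorm(N, sd = 0.3)

H <- bs(x, df = 7, intercept = TRUE)

# Hypothetical helper: least-squares coefficients for basis H and response y
fit_ls <- function(H, y) solve(crossprod(H), crossprod(H, y))

mu_hat <- drop(H %*% fit_ls(H, y))

# Variance estimate with divisor N, as in the formula above
sigma2_hat <- mean((y - mu_hat)^2)

B <- 200
curves <- matrix(NA_real_, nrow = N, ncol = B)
for (b in 1:B) {
  eps_star <- rnorm(N, mean = 0, sd = sqrt(sigma2_hat))  # errors from N(0, sigma2_hat)
  y_star <- mu_hat + eps_star
  curves[, b] <- drop(H %*% fit_ls(H, y_star))
}

# Pointwise standard deviation of the refitted curves
curve_sd <- apply(curves, 1, sd)
```

Note that `rnorm()` takes a standard deviation, so the variance estimate has to be passed as `sqrt(sigma2_hat)`.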