Intro to non-parametric methods



SLIDE 1

Intro to non-parametric methods

SLIDE 2

Introduction

Non-parametric estimation aims to estimate an unknown quantity while making as few assumptions as possible (about the data generating process).

We will look at the following:

◮ density estimation
◮ regression
  ◮ local constant
  ◮ local linear

Non-parametric methods are local averaging methods.

Key concern: how to define "local".

SLIDE 3

Non-parametric density estimation

Example

[Figure: four panels showing the density of expenditure per pupil; x-axis: expenditure per pupil (2000 to 10000); y-axis: Density (0.0000 to 0.0004)]

SLIDE 4

Non-parametric density estimation

Histogram

[Scanned page: "Nonparametric Density and Regression Estimation", p. 15. Figure 2: Doubts About the Doubs: Bertillon's Histogram of the Heights of 9,002 Conscripts from Doubs; x-axis: Height in Inches]

where the function I(·) is equal to one if the statement is true and zero otherwise, and $x_0$ represents the bin midpoint. Once it is written this way, the problem with the histogram becomes clearer. The shape of the histogram can potentially be influenced by where you place the bin centers. Moreover, with a histogram, choosing the width of the bins and the location of the first bin also determines the choice of bin centers. In Bertillon's case, he "chose" to estimate his probability density function at the points 58.5 inches, 59.5 inches, 60.5 inches, . . . Suppose he had estimated his probability density function at 58 inches, 59 inches, 60 inches, . . . instead? Would it matter? Figure 3 (with simulated data generated to approximate Bertillon's original data) illustrates the potential problem. Note that consistent with our viewing the histogram as an estimate of the probability density function for the center of the bins, we have dropped the usual bars and have labeled the axis suitably. As in Bertillon's case, the choice of bin centers clearly matters: with our simulated data, a possibly more precise conversion from centimeters to inches, and bin centers at the inch, the middle hump disappears.

The modern kernel density estimator differs from Bertillon's histogram estimator as described in the previous equation in two key ways. First, in a typical kernel density estimator, the bins are allowed to "overlap." This severs the link between bin size and bin centers that characterizes the histogram and is one reason to prefer the more sophisticated kernel density estimator. Second, kernel density estimators typically place diminishing weight on data points as they move farther away from our point $x_0$, while the histogram

SLIDE 5

Non-parametric density estimation

Histogram

[Scanned page: Journal of Economic Perspectives, p. 16. Figure 3: The Sensitivity of the Histogram to Alternate Bin Centers; panel: Bin Centers on the Inch; x-axis: Height of Conscripts in Inches (54 to 72)]

assigns equal weight to all points falling in the bin. On this latter point, note that the indicator function in the earlier equation simply serves to count the number of data points lying in the bin, while potentially one could assign differing weights to points falling in the bin depending on their "closeness" to $x_0$.

Before formally describing the relationship between these two estimators, we first define two terms that are key ingredients in a kernel density estimator: the bandwidth and the kernel.

What is a bandwidth? In a histogram, the bandwidth is the width of the bin divided by two. The bandwidth tells you how far to look to the left and to the right of $x_0$ when computing the probability density function at $x_0$. A bandwidth of 1/2 inch, for example, would typically say that to estimate the probability density function at $x_0$, you would consider all points that are within 1/2 inch of $x_0$. Bertillon's 1-inch bins imply a 1/2 inch bandwidth. More generally, one can think of a bandwidth as simply a parameter used to determine the size of the "neighborhood" around $x_0$: large bandwidths define large neighborhoods, and small bandwidths define small ones. In the histogram, points falling outside the neighborhood receive zero weight, while those falling in the neighborhood receive constant weight.

What is a kernel? It is merely a smoothing or weight-assigning function. Bertillon's kernel (the kernel used in any histogram) is today called a rectangular kernel, so called because it treats all points in a bin the same. In modern

SLIDE 6

Non-parametric density estimation

Histogram

Remember $f(x) = dF(x)/dx$, so

$$f(x) = \lim_{h \to 0} \frac{F(x+h) - F(x-h)}{2h} = \lim_{h \to 0} \frac{\Pr(x-h < X < x+h)}{2h}$$

the sample analog of which is

$$\hat f_{Hist}(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{1[x-h < X_i < x+h]}{2h} = \frac{1}{Nh} \sum_{i=1}^{N} \frac{1}{2} \times 1\left[\left|\frac{X_i - x}{h}\right| < 1\right]$$
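The histogram estimator can be sketched in a few lines of Python. This is only an illustration: the function name `hist_density` and the sample values are assumptions made for the example and do not come from the slides.

```python
def hist_density(x, data, h):
    """Histogram-type density estimate at x.

    Counts the observations with x - h < X_i < x + h and divides
    by N * 2h, as in the sample analog of the formula above.
    """
    n = len(data)
    count = sum(1 for xi in data if x - h < xi < x + h)
    return count / (n * 2 * h)


# Illustrative numbers (hypothetical, not the expenditure data).
data = [1.0, 1.2, 1.9, 2.4, 3.1]
estimate = hist_density(2.0, data, h=0.5)
print(estimate)
```

With these five points, two of them (1.9 and 2.4) lie within 0.5 of x = 2, so the estimate is 2 / (5 × 1.0).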
slide-7
SLIDE 7

Non-parametric density estimation

Kernel density estimator

The histogram estimator can be generalized

$$\hat f(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}\right)$$

where

◮ the weighting function $K(\cdot)$ is called a kernel function
◮ $h$ is a smoothing parameter called the bandwidth.

For $\hat f \to f$ we require that $Nh \to \infty$ and $h \to 0$.
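A minimal sketch of the kernel density estimator, assuming a Gaussian kernel by default. The names `kde`, `gaussian_kernel` and `uniform_kernel`, the data and the grid are all illustrative choices, not part of the slides.

```python
import math


def gaussian_kernel(z):
    """Standard normal density, used here as K(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def uniform_kernel(z):
    """Rectangular kernel: 1/2 on |z| < 1, zero elsewhere."""
    return 0.5 if abs(z) < 1 else 0.0


def kde(x, data, h, kernel=gaussian_kernel):
    """Kernel density estimate at x: (1 / (N h)) * sum_i K((X_i - x) / h)."""
    n = len(data)
    return sum(kernel((xi - x) / h) for xi in data) / (n * h)


# Illustrative numbers (hypothetical).
data = [1.0, 1.2, 1.9, 2.4, 3.1]
grid = [i / 10 for i in range(-50, 100)]          # -5.0, -4.9, ..., 9.9
estimates = [kde(x, data, h=0.5) for x in grid]
```

Passing `uniform_kernel` reproduces the histogram estimator at the same point and bandwidth, which is the sense in which the kernel estimator generalizes it.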

slide-8
SLIDE 8

Non-parametric density estimation

Kernel density estimator

It is usually assumed that $K(\cdot)$

◮ is symmetric: $K(z) = K(-z)$
◮ integrates to 1: $\int K(z)\,dz = 1$

has zero mean

$$\int z K(z)\,dz = 0$$

and a finite second moment

$$\int z^2 K(z)\,dz = \kappa_2 < \infty$$
slide-9
SLIDE 9

Non-parametric density estimation

Kernel density estimator – Some common kernels

Kernel          K(z)                                          δ
Uniform         $\frac{1}{2} \times 1[|z| < 1]$               1.3510
Triangular      $(1 - |z|) \times 1[|z| < 1]$                 –
Epanechnikov    $\frac{3}{4}(1 - z^2) \times 1[|z| < 1]$      1.7188
Biweight        $\frac{15}{16}(1 - z^2)^2 \times 1[|z| < 1]$  2.0362
Gaussian        $(2\pi)^{-1/2} \exp(-z^2/2)$                  0.7764
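The kernel assumptions (unit mass, zero mean, finite second moment) can be checked numerically for the kernels in the table. The sketch below uses a simple midpoint rule; the integration range and step count are arbitrary choices for the example.

```python
import math

# The five kernels from the table, written as functions of z.
KERNELS = {
    "uniform": lambda z: 0.5 if abs(z) < 1 else 0.0,
    "triangular": lambda z: 1 - abs(z) if abs(z) < 1 else 0.0,
    "epanechnikov": lambda z: 0.75 * (1 - z * z) if abs(z) < 1 else 0.0,
    "biweight": lambda z: 15 / 16 * (1 - z * z) ** 2 if abs(z) < 1 else 0.0,
    "gaussian": lambda z: math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi),
}


def integrate(func, lo=-8.0, hi=8.0, steps=16_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * width) for k in range(steps)) * width


results = {}
for name, K in KERNELS.items():
    mass = integrate(K)                            # should be close to 1
    mean = integrate(lambda z: z * K(z))           # should be close to 0
    kappa2 = integrate(lambda z: z * z * K(z))     # second moment kappa_2
    results[name] = (mass, mean, kappa2)
    print(f"{name:13s} mass={mass:.4f} mean={mean:.4f} kappa2={kappa2:.4f}")
```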

slide-10
SLIDE 10

Non-parametric density estimation

Kernel density estimator – Some common kernels

[Figure: plot of the kernels; horizontal axis x from −3 to 3, vertical axis y1 from 0.0 to 0.8]

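slide-11
SLIDE 11

Non-parametric density estimation

Kernel density estimator

We can show that

$$\int \hat f(x)\,dx = 1$$

$$\int x \hat f(x)\,dx = \frac{1}{n} \sum_{i=1}^{n} X_i$$

and that the variance of the estimated density is

$$\int x^2 \hat f(x)\,dx - \left(\int x \hat f(x)\,dx\right)^2 = \hat\sigma^2 + h^2 \kappa_2$$

where $\hat\sigma^2$ is the sample variance (with divisor $n$).

Note: these are numerical moments, not sampling moments.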
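These numerical moments can be verified by integrating an estimated density on a fine grid. The sketch assumes a Gaussian kernel (for which the kernel's second moment equals 1) and uses made-up data; the integration range is chosen wide enough to contain essentially all the mass.

```python
import math


def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def kde(x, data, h):
    """Kernel density estimate at x with a Gaussian kernel."""
    return sum(gaussian_kernel((xi - x) / h) for xi in data) / (len(data) * h)


data = [1.0, 1.2, 1.9, 2.4, 3.1]   # illustrative numbers (hypothetical)
h = 0.5
kappa2 = 1.0                        # second moment of the Gaussian kernel

# Midpoint-rule integration of the estimated density.
lo, hi, steps = -5.0, 10.0, 15_000
width = (hi - lo) / steps
grid = [lo + (k + 0.5) * width for k in range(steps)]
dens = [kde(x, data, h) for x in grid]

mass = sum(dens) * width
mean = sum(x * d for x, d in zip(grid, dens)) * width
var = sum(x * x * d for x, d in zip(grid, dens)) * width - mean ** 2

sample_mean = sum(data) / len(data)
sample_var = sum((xi - sample_mean) ** 2 for xi in data) / len(data)  # divisor n

print(mass, mean - sample_mean, var - (sample_var + h * h * kappa2))
```

The three printed numbers should be close to 1, 0 and 0 respectively, up to integration error.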

slide-12
SLIDE 12

Non-parametric density estimation

Kernel density estimator – Estimation bias

Expectations of kernel transformations

$$E\left[\frac{1}{h} K\left(\frac{X_i - x}{h}\right)\right] = \int \frac{1}{h} K\left(\frac{z - x}{h}\right) f(z)\,dz = \int K(u) f(x + hu)\,du$$

where $u = (z - x)/h$, so

$$E[\hat f(x)] = E\left[\frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}\right)\right] = \frac{1}{N} \sum_{i=1}^{N} E\left[\frac{1}{h} K\left(\frac{X_i - x}{h}\right)\right] = \int K(u) f(x + hu)\,du$$

which (typically) cannot be solved analytically.

slide-13
SLIDE 13

Non-parametric density estimation

Kernel density estimator – Estimation bias

Assume a 2nd order kernel: $\kappa_j = 0$ for $0 < j < 2$ (that is, $\kappa_1 = 0$) and $\kappa_2 \neq 0$.

Substituting a 2nd order Taylor expansion

$$f(x + hu) \approx f(x) + f'(x) h u + \frac{1}{2} f''(x) h^2 u^2$$

in $\int K(u) f(x + hu)\,du$ gives

$$E[\hat f(x)] = \int K(u) f(x + hu)\,du \approx f(x) + \frac{1}{2} f''(x) h^2 \kappa_2$$

and the bias equals

$$Bias(\hat f(x)) = E[\hat f(x)] - f(x) \approx \frac{1}{2} f''(x) h^2 \kappa_2$$

(higher order kernels give a bias of higher order in $h$, that is, a bias that shrinks faster as $h \to 0$)

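slide-14
SLIDE 14

Non-parametric density estimation

Kernel density estimator – Estimation bias

For the variance we get (using that the $X_i$ are i.i.d.)

$$Var(\hat f(x)) = Var\left(\frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}\right)\right) = \frac{1}{Nh^2} Var\left(K\left(\frac{X_i - x}{h}\right)\right)$$

$$= \frac{1}{Nh^2} E\left[K\left(\frac{X_i - x}{h}\right)^2\right] - \frac{1}{N} \left(\frac{1}{h} E\left[K\left(\frac{X_i - x}{h}\right)\right]\right)^2$$

but since

$$\frac{1}{h} E\left[K\left(\frac{X_i - x}{h}\right)\right] \approx f(x)$$

the 2nd term above is approximately $f(x)^2 / N$. It is of smaller order than the first term, which is of order $1/(Nh)$, and is dropped as $N \to \infty$.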

slide-15
SLIDE 15

Non-parametric density estimation

Kernel density estimator – Estimation bias

Now

$$\frac{1}{h} E\left[K\left(\frac{X_i - x}{h}\right)^2\right] = \frac{1}{h} \int K\left(\frac{z - x}{h}\right)^2 f(z)\,dz = \int K(u)^2 f(x + hu)\,du$$

$$\approx \int K(u)^2 \left(f(x) + f'(x) h u + \frac{1}{2} f''(x) h^2 u^2\right) du \approx f(x) R(K)$$

where $R(K) \equiv \int K(u)^2\,du$ is the "roughness" of the kernel.

So that

$$Var(\hat f(x)) \approx \frac{1}{Nh} f(x) R(K)$$

slide-16
SLIDE 16

Non-parametric density estimation

Bandwidth

To get rid of bias: the bandwidth should decrease as the sample size increases

◮ in the limit (infinite sample size) the bandwidth should be zero (we know the density at each point)

To get rid of variance: the bandwidth should decrease at a slower rate than the sample size increases

◮ the number of observations within the bandwidth increases with sample size (variance of our estimate goes to zero)

Because we reduce the bandwidth we have slower than $\sqrt{N}$ convergence.

slide-17
SLIDE 17

Non-parametric density estimation

Mean Squared Error (MSE)

At a point $x$

$$MSE(x) \equiv E\left[\left(f(x) - \hat f(x)\right)^2\right] = Bias^2 + Var(\hat f(x)) \approx \left(\frac{1}{2} f''(x) h^2 \kappa_2\right)^2 + \frac{1}{Nh} f(x) R(K) \equiv AMSE(x)$$

Smoothing involves a trade-off between bias and variance:

◮ when the data are over-smoothed, the bias is large and the variance low
◮ when the data are under-smoothed, the bias is low and the variance high

Optimal smoothing minimizes the risk (MSE).

slide-18
SLIDE 18

Non-parametric density estimation

Bandwidth

A key question is how to choose $h$. With the histograms above we saw that as $h$ increased, the density

◮ became less "jumpy"
◮ but did a poorer job fitting the data

This highlights the trade-off between variance and bias.

Methods for optimal bandwidth try to balance this using well defined criteria such as the integrated square error.

slide-19
SLIDE 19

Non-parametric density estimation

Mean Squared Error (MSE)

A global measure of fit is the asymptotic mean integrated square error:

$$AMISE = \int AMSE(x)\,dx = \int \left[\left(\frac{1}{2} f''(x) h^2 \kappa_2\right)^2 + \frac{1}{Nh} f(x) R(K)\right] dx = \frac{1}{4} R(f'') h^4 \kappa_2^2 + \frac{1}{Nh} R(K)$$

and the bandwidth that minimizes it:

$$h_0 = R(f'')^{-1/5} \left(R(K)/\kappa_2^2\right)^{1/5} N^{-1/5}$$

Using this bandwidth we get

$$AMISE_0 = \frac{5}{4} \left(\kappa_2^2 R(K)^4 R(f'')\right)^{1/5} N^{-4/5}$$

(which converges at a slower rate than the parametric rate $N^{-1}$)
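The minimizing bandwidth follows from the first-order condition of the AMISE with respect to $h$. The short derivation below is a sketch of that step, using only the symbols already defined on this slide:

```latex
\frac{\partial\, AMISE}{\partial h}
  = R(f'')\, h^{3} \kappa_2^{2} - \frac{R(K)}{N h^{2}} = 0
\quad\Longrightarrow\quad
h_0^{5} = \frac{R(K)}{N\, R(f'')\, \kappa_2^{2}}

% At h_0 the squared-bias term equals one quarter of the variance term:
\frac{1}{4} R(f'')\, h_0^{4} \kappa_2^{2}
  = \frac{1}{4} \cdot \frac{R(K)}{N h_0}
\quad\Longrightarrow\quad
AMISE_0 = \frac{5}{4} \cdot \frac{R(K)}{N h_0}
```

Since $h_0$ is proportional to $N^{-1/5}$, the factor $1/(N h_0)$ is proportional to $N^{-4/5}$, which gives the stated rate.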

slide-20
SLIDE 20

Non-parametric density estimation

Bandwidth – Silverman's optimal bandwidth

If both the data ($f$) and the kernel are normal, then Silverman (1986) suggested the following bandwidth

$$h_{opt} \approx 1.06\, \sigma N^{-1/5}$$

It can also be adjusted by a factor $\delta$ (see table above) for different kernels

$$h_{opt} \approx 1.3643\, \delta \sigma N^{-1/5}$$

or

$$h_{opt} \approx 1.3643\, \delta N^{-1/5} \min(\sigma, IQR/1.349)$$

which is more robust against outliers.

In practice this method works quite well, but

◮ there are other approaches such as cross validation, and methods that let the bandwidth vary
◮ do not forget to use your eyes: does it look reasonable?
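The robust rule of thumb can be written as a small helper. This is a sketch under stated assumptions: the function name `silverman_bandwidth` is made up for the example, $\sigma$ is estimated by the sample standard deviation, and the quartiles come from Python's `statistics.quantiles` with its default method, which may differ slightly from the quartile definition used by other packages.

```python
import random
import statistics


def silverman_bandwidth(data, delta=0.7764):
    """Rule of thumb: 1.3643 * delta * N^(-1/5) * min(sigma, IQR / 1.349).

    delta defaults to the Gaussian-kernel value from the kernel table.
    """
    n = len(data)
    sigma = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)   # quartile cut points
    spread = min(sigma, (q3 - q1) / 1.349)
    return 1.3643 * delta * spread * n ** (-1 / 5)


# Simulated standard normal draws (hypothetical data, fixed seed).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

h_gauss = silverman_bandwidth(sample)                 # Gaussian kernel
h_epan = silverman_bandwidth(sample, delta=1.7188)    # Epanechnikov kernel
print(h_gauss, h_epan)
```

Note that 1.3643 × 0.7764 is approximately 1.06, so the Gaussian default reproduces the first formula when $\sigma$ is the smaller spread measure.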

slide-21
SLIDE 21

Non-parametric density estimation

Expenditure per pupil

[Figure: four panels of the estimated density of expenditure per pupil (x-axis: expenditure per pupil, 2000 to 10000; y-axis: Density). Panel titles: Optimal bandwidth (200); Twice optimal; Half optimal; Four-times optimal]

slide-22
SLIDE 22

Non-parametric density estimation

Higher dimensions

The above method generalizes to higher dimensional cases, for example 2:

$$\hat f_{Hist}(x_1, x_2) = \frac{1}{Nh^2} \sum_{i=1}^{N} \frac{1}{4} \times 1\left[\left|\frac{X_{1i} - x_1}{h}\right| < 1\right] \times 1\left[\left|\frac{X_{2i} - x_2}{h}\right| < 1\right]$$

This generates a number of issues:

◮ same bandwidth in all dimensions?
◮ take correlation between $x_1, x_2$ into account (ellipse)?

One solution is to transform the data (equal variance and orthogonal) before the calculations, estimate the density and transform back.

slide-23
SLIDE 23

Non-parametric density estimation

Curse of dimensionality

Suppose we have $n$ uniformly distributed data points on $[-1, 1]$

◮ How many points in $[-0.1, 0.1]$?

Suppose we have $n$ uniformly distributed data points on $[-1, 1]^k$

◮ How many points in $[-0.1, 0.1]^k$?
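The answer to both questions is a one-line calculation: the small cube covers a fraction $0.1$ of the range in each dimension, so the expected share of points is $0.1^k$. The sketch below tabulates this for a few values of $k$; the function name and the sample size are illustrative.

```python
def expected_points_in_cube(n, k, half_width=0.1):
    """Expected number of n uniform points on [-1, 1]^k that fall in
    [-half_width, half_width]^k.

    Each coordinate lands in the small interval with probability
    (2 * half_width) / 2 = half_width, independently across coordinates.
    """
    return n * half_width ** k


n = 1_000_000
for k in (1, 2, 5, 10):
    print(k, expected_points_in_cube(n, k))
```

Even with a million observations, the expected number of points in the local neighborhood falls below one once $k$ exceeds 6.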

slide-24
SLIDE 24

Non-parametric regression

The standard tool to estimate the relationship between an outcome $y$ and an explanatory variable $x$ is linear regression

$$y_i = x_i \beta + \epsilon_i$$

This does not need to be linear in $x_i$:

◮ polynomials
◮ splines
◮ non-parametric regression (the Nadaraya-Watson estimator, or local constant estimator)

slide-25
SLIDE 25

Non-parametric regression

Example

[Figure: vertical axis: % satisfactory, 4th grade math (20 to 100); horizontal axis ticks from 2000 to 10000]

slide-26
SLIDE 26

Non-parametric regression

Kernel regression

The same methods for nonparametric density estimation can be used to estimate a regression function

$$E[Y | X = x] = \int y f(y | X = x)\,dy$$

since

$$\hat E[Y | X = x] = \int y \hat f_{Y|X}(y | x)\,dy = \int y \frac{\hat f_{YX}(y, x)}{\hat f_X(x)}\,dy$$

We know how to estimate

$$\hat f_X(x) = \frac{1}{Nh} \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}\right)$$

so what is left is

$$\int y \hat f_{YX}(y, x)\,dy$$

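slide-27
SLIDE 27

Non-parametric regression

Kernel regression

Take the following bivariate kernel $K(u, v) = K_1(u) K_2(v)$, then

$$\hat f_{YX}(y, x) = \frac{1}{Nh^2} \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}, \frac{Y_i - y}{h}\right) = \frac{1}{Nh^2} \sum_{i=1}^{N} K_1\left(\frac{X_i - x}{h}\right) K_2\left(\frac{Y_i - y}{h}\right)$$

so that

$$\int y \hat f_{YX}(y, x)\,dy = \frac{1}{Nh^2} \int y \sum_{i=1}^{N} K_1\left(\frac{X_i - x}{h}\right) K_2\left(\frac{Y_i - y}{h}\right) dy$$

$$= \frac{1}{Nh} \sum_{i=1}^{N} K_1\left(\frac{X_i - x}{h}\right) \int y \frac{1}{h} K_2\left(\frac{Y_i - y}{h}\right) dy$$

$$= \frac{1}{Nh} \sum_{i=1}^{N} K_1\left(\frac{X_i - x}{h}\right) Y_i$$

The last step uses that $K_2$ integrates to 1 and has zero mean.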
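slide-28
SLIDE 28

Non-parametric regression

Kernel regression

We can now write

$$\hat g(x) = \frac{\int y \hat f_{YX}(y, x)\,dy}{\hat f_X(x)} = \frac{\frac{1}{Nh} \sum_{i=1}^{N} Y_i K_1\left(\frac{X_i - x}{h}\right)}{\frac{1}{Nh} \sum_{i=1}^{N} K_1\left(\frac{X_i - x}{h}\right)} = \sum_{i=1}^{N} \omega_h(X_i, x) Y_i$$

where the weights $\omega_h(X_i, x) = K_1\left(\frac{X_i - x}{h}\right) / \sum_{j=1}^{N} K_1\left(\frac{X_j - x}{h}\right)$ sum to one.

When $K_1\left(\frac{X_i - x}{h}\right) = \frac{1}{2} \cdot 1[x - h < X_i < x + h]$, then $\hat g(x)$ is the average $Y$ for observations within a window $h$ of $x$.

This is the Nadaraya-Watson kernel regression estimator.

Bandwidth is sometimes chosen using cross validation

$$h_{opt} = \arg\min_h \sum_{i=1}^{N} \left(\hat g_{h,(-i)}(X_i) - Y_i\right)^2$$

where $\hat g_{h,(-i)}$ is the regression estimate leaving out observation $i$.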
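The estimator and the leave-one-out criterion can be sketched as follows, assuming a Gaussian kernel and a small grid of candidate bandwidths. The function names, the simulated data and the candidate grid are illustrative assumptions, not the slides' own setup.

```python
import math
import random


def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def nw_estimate(x, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x: kernel-weighted average of the Y_i.

    `skip` is the index of an observation to leave out (used for
    cross validation); None keeps all observations.
    """
    num = den = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((xi - x) / h)
        num += w * yi
        den += w
    return num / den


def loo_cv_score(xs, ys, h):
    """Sum of squared leave-one-out prediction errors for bandwidth h."""
    return sum((nw_estimate(xi, xs, ys, h, skip=i) - yi) ** 2
               for i, (xi, yi) in enumerate(zip(xs, ys)))


def choose_bandwidth(xs, ys, candidates):
    """Pick the candidate bandwidth with the smallest leave-one-out score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))


# Simulated data (hypothetical): y = sin(2x) + noise on [0, 2].
rng = random.Random(1)
xs = [i / 20 for i in range(41)]
ys = [math.sin(2.0 * x) + rng.gauss(0.0, 0.2) for x in xs]

candidates = [0.05, 0.1, 0.2, 0.4, 0.8]
h_cv = choose_bandwidth(xs, ys, candidates)
print(h_cv, nw_estimate(1.0, xs, ys, h_cv))
```

A grid search is only one way to minimize the criterion; very small bandwidths can leave the denominator numerically zero when no other observation is nearby, so the candidate grid should not go below the spacing of the data.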

slide-29
SLIDE 29

Non-parametric regression

Local constant regression

The standard Nadaraya-Watson kernel regression estimator can also be seen as fitting a constant function $g(x) = \alpha$ where

$$\hat\alpha = \arg\min_a \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}\right) (Y_i - a)^2$$

But kernel regression does not perform well at the boundaries. If the regression function is flat there is no problem, but otherwise we

◮ over (under) estimate if the regression function is convex (concave)
◮ over (under) estimate the regression function at the left (right) boundary
◮ get bias if the density has a non-zero derivative (data not equally spaced)

slide-30
SLIDE 30

Non-parametric regression

Bias of kernel regression

slide-31
SLIDE 31

Non-parametric regression

Local linear regression

Local linear regression fits $g(z) = \alpha + \beta(z - x)$ for $z$ near $x$, so

$$(\hat\alpha, \hat\beta) = \arg\min_{a, b} \sum_{i=1}^{N} K\left(\frac{X_i - x}{h}\right) \left(Y_i - a - b(X_i - x)\right)^2$$

and

$$\hat g(x) = \hat\alpha + \hat\beta(x - x) = \hat\alpha$$
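For a single regressor the weighted least squares problem has a closed form, which the sketch below uses. It also includes a local constant fit to illustrate the boundary problem on noise-free linear data. The Gaussian kernel, the function names and the data are assumptions made for the example.

```python
import math


def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def local_constant(x, xs, ys, h):
    """Nadaraya-Watson (local constant) estimate at x."""
    weights = [gaussian_kernel((xi - x) / h) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def local_linear(x, xs, ys, h):
    """Local linear fit at x; returns (alpha_hat, beta_hat).

    Solves the kernel-weighted least squares problem for a line in
    (X_i - x); alpha_hat is the estimate of g(x).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(xs, ys):
        d = xi - x
        w = gaussian_kernel(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    return (s2 * t0 - s1 * t1) / det, (s0 * t1 - s1 * t0) / det


# Noise-free linear data (hypothetical): g(x) = 1 + 2x on [0, 1].
xs = [i / 10 for i in range(11)]
ys = [1.0 + 2.0 * x for x in xs]

alpha_hat, beta_hat = local_linear(0.0, xs, ys, h=0.2)
boundary_nw = local_constant(0.0, xs, ys, h=0.2)
print(alpha_hat, beta_hat, boundary_nw)
```

At the left boundary the local linear fit recovers the true value $g(0) = 1$, while the local constant estimate averages only over points to the right and so lies above it for this increasing function.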

slide-32
SLIDE 32

Non-parametric regression

Partial linear regression (Robinson, Econometrica 1988)

We can write

$$y_i = X_i \beta + g(Z_i) + e_i$$

so that

$$E[y_i | Z_i] = E[X_i | Z_i] \beta + g(Z_i)$$

and taking the difference

$$\underbrace{y_i - E[y_i | Z_i]}_{e_{yi}} = \underbrace{(X_i - E[X_i | Z_i])}_{e_{xi}} \beta + e_i$$

We can now

◮ estimate $E[y_i | Z_i]$ and $E[X_i | Z_i]$ using kernel regression
◮ estimate $\beta$ from a regression of $\hat e_{yi}$ on $\hat e_{xi}$
◮ and then estimate $g(z)$ using a kernel regression of $(y_i - X_i \hat\beta)$ on $Z_i$
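The three steps can be sketched for a scalar regressor $X_i$ on simulated data. Everything specific here is an assumption for illustration: the Gaussian kernel, the fixed bandwidth, the data generating process (with $\beta = 2$ and $g(z) = \cos(3z)$) and the function names. With a vector $X_i$ the second step would be a multivariate OLS regression instead of the ratio used below.

```python
import math
import random


def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def nw_fit(targets, zs, h):
    """Nadaraya-Watson fitted values of `targets` on `zs`, at every Z_i."""
    fitted = []
    for z0 in zs:
        weights = [gaussian_kernel((zi - z0) / h) for zi in zs]
        fitted.append(sum(w * t for w, t in zip(weights, targets)) / sum(weights))
    return fitted


def robinson(ys, xs, zs, h):
    """Return (beta_hat, g_hat at each Z_i) for y = x*beta + g(z) + e."""
    # Step 1: residuals from kernel regressions of y and x on z.
    e_y = [y - m for y, m in zip(ys, nw_fit(ys, zs, h))]
    e_x = [x - m for x, m in zip(xs, nw_fit(xs, zs, h))]
    # Step 2: regression of e_y on e_x (no intercept, scalar x).
    beta_hat = sum(a * b for a, b in zip(e_x, e_y)) / sum(a * a for a in e_x)
    # Step 3: kernel regression of y - x*beta_hat on z.
    g_hat = nw_fit([y - x * beta_hat for y, x in zip(ys, xs)], zs, h)
    return beta_hat, g_hat


# Simulated data (hypothetical design): beta = 2, g(z) = cos(3z).
rng = random.Random(0)
n = 300
zs = [rng.random() for _ in range(n)]
xs = [math.sin(z) + rng.gauss(0.0, 1.0) for z in zs]
ys = [2.0 * x + math.cos(3.0 * z) + rng.gauss(0.0, 0.3) for x, z in zip(xs, zs)]

beta_hat, g_hat = robinson(ys, xs, zs, h=0.1)
print(beta_hat)
```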

slide-33
SLIDE 33

Non-parametric regression

Partial linear regression

Alternatively

1. Sort the data by $z$
2. First difference $y$ and $X$ and estimate $\beta$ using OLS on the first differenced data
3. Calculate $\hat e = y - X \hat\beta$
4. Estimate $g(z)$ using a LLR of $\hat e$ on $z$

See Yatchew (JEL 36(2), 1998) for more details and more efficient estimators.
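The four steps above can be sketched for a scalar regressor on simulated data. As before, the data generating process (with $\beta = 2$ and $g(z) = \cos(3z)$), the Gaussian kernel, the bandwidth and the function names are assumptions for illustration only. The OLS step is run without an intercept, since any constant is removed by differencing.

```python
import math
import random


def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def difference_beta(ys, xs, zs):
    """Steps 1-2: sort by z, first-difference y and x, then OLS (scalar x)."""
    order = sorted(range(len(zs)), key=lambda i: zs[i])
    dy = [ys[j] - ys[i] for i, j in zip(order, order[1:])]
    dx = [xs[j] - xs[i] for i, j in zip(order, order[1:])]
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


def local_linear(z0, zs, es, h):
    """Step 4: local linear estimate of g at z0 (intercept of the local line)."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for zi, ei in zip(zs, es):
        d = zi - z0
        w = gaussian_kernel(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * ei
        t1 += w * d * ei
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)


# Simulated data (hypothetical design): beta = 2, g(z) = cos(3z).
rng = random.Random(0)
n = 300
zs = [rng.random() for _ in range(n)]
xs = [math.sin(z) + rng.gauss(0.0, 1.0) for z in zs]
ys = [2.0 * x + math.cos(3.0 * z) + rng.gauss(0.0, 0.3) for x, z in zip(xs, zs)]

beta_hat = difference_beta(ys, xs, zs)
resid = [y - x * beta_hat for y, x in zip(ys, xs)]   # step 3
g_mid = local_linear(0.5, zs, resid, h=0.1)          # estimate of g(0.5)
print(beta_hat, g_mid)
```

Differencing works because neighbouring observations after sorting have nearly equal $z$, so $g(Z_i) - g(Z_{i-1})$ is close to zero and drops out of the differenced equation.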

slide-34
SLIDE 34

Non-parametric regression

Confidence intervals

Statistical packages often implement asymptotic CIs, or use the bootstrap

    * evaluation grid: 100 quantiles of expp
    pctile _x = expp, nquant(100)
    gen _w = .
    set seed 32423
    * local linear fit on the original sample
    lpoly math4 expp, degree(1) at(_x) gen(b0)
    * 199 bootstrap replications, each refitting on the same grid
    forv r = 1/199 {
        di "." _c
        if (mod(`r', 50) == 0) di " `= 50 * int(`r' / 50)'"
        bsample, weight(_w)
        lpoly math4 expp [fw=_w], degree(1) at(_x) gen(_b`r') nograph
    }
    * pointwise 5th and 95th percentiles across replications (a 90% band)
    egen ci_lower = rowpctile(_b*), p(5)
    egen ci_upper = rowpctile(_b*), p(95)
    sort _x
    twoway (rarea ci* _x if _x < .) (line b0 _x if _x < .)