Pool-based Agnostic Experiment Design in Linear Regression




SLIDE 1

ECML2008, Sep. 15-19, 2008

Pool-based Agnostic Experiment Design in Linear Regression

Masashi Sugiyama (Tokyo Tech.), Shinichi Nakajima (Nikon)

SLIDE 2

Linear Regression

Learn a real-valued function from input-output training samples.

[Figure: training samples, with axes labeled "input" and "output"]

SLIDE 3

Linear Regression (cont.)

A linear model is used for learning.

Goal: learn the parameters so that the generalization error is minimized.

Quantities defined on this slide: the parameter, the basis function, the test input density, and the expectation over noise.
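
In symbols, the setup named above can be sketched as follows. The notation ($\theta_\ell$, $\varphi_\ell$, $t$, $p_{\mathrm{te}}$, $\epsilon$, $G$, and $f$ for the target function) is assumed here for illustration and is not taken from the slide.

```latex
% Linear model: t basis functions, one parameter per basis function (assumed notation)
\hat{f}(x) = \sum_{\ell=1}^{t} \theta_\ell \, \varphi_\ell(x)

% Generalization error: squared error weighted by the test input density,
% averaged over the noise in the training outputs
G = \mathbb{E}_{\epsilon} \int \bigl( \hat{f}(x) - f(x) \bigr)^2 \, p_{\mathrm{te}}(x) \, \mathrm{d}x
```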

SLIDE 4

Experiment Design

The quality of learned functions depends on the training input locations.

Goal: optimize the training input locations.

[Figure: two panels, "Good input location" and "Poor input location", each comparing the learned function with the target function]

SLIDE 5

Challenges

The generalization error is unknown and needs to be estimated.

In experiment design, we do not have training output values yet. Thus we cannot use, e.g., cross-validation, which requires the training output values. Only training input positions can be used in generalization error estimation!

SLIDE 6

Organization

  1. Problem definition
  2. Basic strategy
  3. Proposed method
  4. Experiments

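SLIDE 7

Bias and Variance

  • The bias is not estimable without the training output values.
  • For linear learning, the variance is the noise variance multiplied by a factor determined by the learning matrix.
  • The noise variance is not estimable without the training output values.
  • The factor determined by the learning matrix is computable from the training input locations.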
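
A worked version of this decomposition, as a sketch under assumptions not stated on the slide: the training outputs $y$ carry i.i.d. zero-mean noise $\epsilon$ with variance $\sigma^2$, the learned parameter is $\hat{\theta} = L y$ with learning matrix $L$, and $U$ collects the second moments of the basis functions under the test input density. All symbols are assumed notation.

```latex
% Bias-variance decomposition of the generalization error G
G = B + V, \qquad
B = \int \bigl( \mathbb{E}_{\epsilon}[\hat{f}(x)] - f(x) \bigr)^2 \, p_{\mathrm{te}}(x) \, \mathrm{d}x, \qquad
V = \mathbb{E}_{\epsilon} \int \bigl( \hat{f}(x) - \mathbb{E}_{\epsilon}[\hat{f}(x)] \bigr)^2 \, p_{\mathrm{te}}(x) \, \mathrm{d}x

% Second-moment matrix of the basis functions under the test input density
U_{\ell \ell'} = \int \varphi_\ell(x) \, \varphi_{\ell'}(x) \, p_{\mathrm{te}}(x) \, \mathrm{d}x

% Linear learning: the noise enters the parameter estimate as L \epsilon, so
V = \mathbb{E}_{\epsilon} \bigl[ (L \epsilon)^{\top} U \, (L \epsilon) \bigr]
  = \sigma^2 \, \mathrm{tr}\bigl( U L L^{\top} \bigr)
```

The bias $B$ involves the unknown target $f$ and $\sigma^2$ is unknown, while $\mathrm{tr}(U L L^{\top})$ needs only the training inputs (through $L$) and the test input density (through $U$).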

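SLIDE 8

Key Trick in Experiment Design

Find a setup where zero bias is guaranteed. Then the generalization error reduces to the variance. Thus the training input locations can be optimized by minimizing the factor of the variance determined by the learning matrix, which is computable before observing the training output values.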
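
SLIDE 9

Traditional Method

  • Assume the model is correct, i.e., the target function is exactly representable by the linear model.
  • Use ordinary least squares (OLS) estimation.
  • Experiment design criterion: minimize the input-dependent factor of the variance of the OLS estimator (Fedorov 1972).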
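
The traditional criterion can be sketched in a few lines of NumPy. This is an illustration, not the authors' code: the basis functions (1, x, x^2), the helper names, and the approximation of the matrix U by an average over sampled test inputs are all assumptions. The OLS learning matrix and the quantity tr(U L L^T) are standard.

```python
import numpy as np

def design_matrix(x):
    """Hypothetical basis functions 1, x, x^2 (an assumption for illustration)."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x ** 2], axis=1)

def ols_learning_matrix(X):
    """L such that theta_hat = L @ y; equals (X^T X)^{-1} X^T for full-rank X."""
    return np.linalg.pinv(X)

def variance_criterion(U, L):
    """tr(U L L^T): the factor of the variance that depends only on the inputs."""
    return float(np.trace(U @ L @ L.T))

# Approximate U by averaging over inputs drawn from an assumed test density.
rng = np.random.default_rng(0)
X_te = design_matrix(rng.normal(size=2000))
U = X_te.T @ X_te / len(X_te)

# Two candidate designs with the same number of training inputs.
spread = np.linspace(-2.0, 2.0, 9)
clustered = np.linspace(-0.2, 0.2, 9)
j_spread = variance_criterion(U, ols_learning_matrix(design_matrix(spread)))
j_clustered = variance_criterion(U, ols_learning_matrix(design_matrix(clustered)))
print(j_spread, j_clustered)
```

The design with the smaller value is preferred. Both values are obtained without any training output values, which is what makes the criterion usable before measurement.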

SLIDE 10

Goal of This Work

Pros / cons of the traditional method:

  + Generalization error estimation is exact.
  + Easy to implement.
  - The correct-model assumption is not realistic.
  - Very poor performance when agnostic.
  - The test input density is often unknown.

We propose a new method that is:

  • Still easy to implement,
  • Robust against agnosticity,
  • Able to work without the test input density.

SLIDE 11

Organization

  1. Problem definition
  2. Basic strategy
  3. Proposed method
     1. Overcoming agnosticity
     2. Coping with pool-based setup
  4. Experiments

SLIDE 12

Weak Agnostic Setup

The model is not exactly correct, but is reasonably good.

Decomposition of the target function: target function = approximable part + residual part.
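
One common way to write such a decomposition is sketched below. The symbols $g$, $r$, $\delta$, and $\theta^{*}_\ell$, and the orthogonality condition that makes the split unique, are assumptions introduced for illustration.

```latex
% Target = approximable part (inside the model) + small residual part
f(x) = g(x) + \delta \, r(x), \qquad
g(x) = \sum_{\ell=1}^{t} \theta^{*}_\ell \, \varphi_\ell(x)

% Assumed: the residual is orthogonal to every basis function under the test input density
\int r(x) \, \varphi_\ell(x) \, p_{\mathrm{te}}(x) \, \mathrm{d}x = 0 \qquad (\ell = 1, \dots, t)
```

Here a small $\delta$ expresses "reasonably good", and $\delta = 0$ recovers the correct-model case.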

SLIDE 13

Further Decomposition of Bias

The bias is decomposed further into two terms. One term is constant and ignorable. But OLS cannot make the other term zero due to “covariate shift” (Shimodaira JSPI2000):

  • Training / test inputs follow different distributions.
  • “Covariate” is another name for “input”.

SLIDE 14

Importance-Weighted LS (IWLS)

Properties of IWLS are stated for two cases:

  • Even when agnostic.
  • When weak agnostic.

The solution is given in closed form, with each training sample weighted by its importance.
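
A minimal sketch of importance-weighted least squares, assuming the importance is the ratio of the test input density to the training input density, which is its usual definition under covariate shift. The densities, target function, and helper names below are hypothetical choices made only to keep the example self-contained.

```python
import numpy as np

def iwls_learning_matrix(X, w):
    """Learning matrix (X^T W X)^{-1} X^T W, where W = diag(w)."""
    XtW = X.T * w  # scales column i of X^T by w[i], i.e. X^T @ diag(w)
    return np.linalg.solve(XtW @ X, XtW)

def gauss_pdf(x, mean, std):
    """Density of a normal distribution, used to form the importance."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def features(x):
    """Hypothetical model: straight line with basis functions 1 and x."""
    return np.stack([np.ones_like(x), x], axis=1)

def target(x):
    """Hypothetical target outside the model, so the setup is agnostic."""
    return x ** 2

rng = np.random.default_rng(0)
x_tr = rng.normal(0.0, 1.0, size=500)            # training inputs ~ N(0, 1)
y_tr = target(x_tr) + 0.1 * rng.normal(size=500)
x_te = rng.normal(1.0, 0.5, size=5000)           # test inputs ~ N(1, 0.5^2)

w = gauss_pdf(x_tr, 1.0, 0.5) / gauss_pdf(x_tr, 0.0, 1.0)  # importance
X_tr, X_te = features(x_tr), features(x_te)
theta_ols = iwls_learning_matrix(X_tr, np.ones_like(x_tr)) @ y_tr  # unit weights = OLS
theta_iwls = iwls_learning_matrix(X_tr, w) @ y_tr

err_ols = np.mean((X_te @ theta_ols - target(x_te)) ** 2)
err_iwls = np.mean((X_te @ theta_iwls - target(x_te)) ** 2)
print(err_ols, err_iwls)
```

With unit weights the same function returns the OLS solution, so the two estimators differ only in the weight vector.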

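SLIDE 15

Justification

For IWLS, the bias is negligible in the weak agnostic setup. Thus the generalization error is dominated by the variance, whose input-dependent factor is computable before observing the training output values (Sugiyama JMLR2006).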

SLIDE 16

Organization

  1. Problem definition
  2. Basic strategy
  3. Proposed method
     1. Overcoming agnosticity
     2. Coping with pool-based setup
  4. Experiments

SLIDE 17

Pool-based Setup

Pool-based setup:

  • The test input density is unknown.
  • But a pool of test input samples is given.
  • Training input points are chosen from the pool.

We assume that the pool is much larger than the number of training input points to be chosen.

Importance weight is not accessible.

SLIDE 18

Computing Importance Weight

Define a resampling probability for each input point in the pool. Choose the training input points from the pool following this resampling probability. Then the importance weight can be exactly computed.
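
The reasoning can be sketched as follows: if pool points are drawn with probability proportional to a resampling function b, the training input density is proportional to the test input density times b, so the importance is proportional to 1 / b. The function name, the example b, and the choice to draw without replacement (which matches this reasoning only approximately, when the pool is much larger than the sample) are assumptions.

```python
import numpy as np

def sample_from_pool(b, n_train, rng):
    """Draw n_train distinct pool indices with probability proportional to b.

    Returns the indices and importance weights proportional to 1 / b.
    """
    prob = b / b.sum()
    idx = rng.choice(len(b), size=n_train, replace=False, p=prob)
    weights = 1.0 / b[idx]
    weights = weights / weights.mean()  # rescaling leaves the IWLS solution unchanged
    return idx, weights

rng = np.random.default_rng(0)
pool = rng.normal(size=1000)   # hypothetical pool of test input samples
b = 1.0 + pool ** 2            # hypothetical resampling function favoring outer points
idx, weights = sample_from_pool(b, n_train=50, rng=rng)
print(pool[idx][:5], weights[:5])
```

No density estimation is involved: the weights follow directly from the resampling function that was used to pick the points.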

SLIDE 19

Proposed Method

Choose the resampling function based on the generalization error estimate for IWLS, which is computable before observing the training output values.

Advantages:

  • Robust against model misspecification.
  • Easy to implement.
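
A rough sketch of such a selection loop, under several assumptions: the candidate family b_lambda(x) = (phi(x)^T U^{-1} phi(x))^lambda, the basis functions, and the averaging of tr(U L L^T) over random draws are illustrative choices, and the criterion used by the authors may be computed differently.

```python
import numpy as np

def features(x):
    """Hypothetical basis functions 1, x, x^2."""
    return np.stack([np.ones_like(x), x, x ** 2], axis=1)

def criterion(X_pool, b, n_train, rng, n_draws=20):
    """Average tr(U L L^T) over random draws of training points following b."""
    U = X_pool.T @ X_pool / len(X_pool)   # pool average stands in for the test density
    prob = b / b.sum()
    values = []
    for _ in range(n_draws):
        idx = rng.choice(len(X_pool), size=n_train, replace=False, p=prob)
        X, w = X_pool[idx], 1.0 / b[idx]  # importance proportional to 1 / b
        XtW = X.T * w
        L = np.linalg.solve(XtW @ X, XtW)  # IWLS learning matrix
        values.append(np.trace(U @ L @ L.T))
    return float(np.mean(values))

rng = np.random.default_rng(0)
X_pool = features(rng.normal(size=500))
U_inv = np.linalg.inv(X_pool.T @ X_pool / len(X_pool))
base = np.einsum("ij,jk,ik->i", X_pool, U_inv, X_pool)  # phi(x)^T U^{-1} phi(x) per pool point

candidates = [0.0, 0.25, 0.5, 0.75, 1.0]  # lambda = 0 is uniform (passive) sampling
scores = {lam: criterion(X_pool, base ** lam, n_train=30, rng=rng) for lam in candidates}
best_lambda = min(scores, key=scores.get)
print(scores, best_lambda)
```

Everything in the loop uses pool inputs only, so the resampling function can be fixed before a single output is measured.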

SLIDE 20

Organization

  1. Problem definition
  2. Basic strategy
  3. Proposed method
  4. Experiments

SLIDE 21

Wafer Alignment in Semiconductor Exposure Apparatus

  • Recent silicon wafers have a layered structure.
  • Circuit patterns are exposed multiple times.
  • Exact alignment of wafers is very important.

SLIDE 22

Markers on Wafer

Wafer alignment process:

  • Measure the locations of markers printed on the wafer.
  • Shift and rotate the wafer to minimize the gap.

To speed up the process, reducing the number of markers to measure is very important.

SLIDE 23

Non-linear Alignment Model

When the gap consists only of shift and rotation, a linear model is exact. However, non-linear factors exist, e.g.,

  • Warp
  • Biased characteristic of measurement apparatus
  • Different temperature conditions

Exactly modeling non-linear factors is very difficult in practice!
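
Why shift and rotation give an exactly linear model can be checked numerically: a marker at position p moves to R p + s, so its gap (R - I) p + s is linear in the basis 1, x, y. The sketch below uses a made-up marker layout and misalignment; only the counts (38 markers, 20 measured) echo the experiment described in this talk.

```python
import numpy as np

def first_order_features(points):
    """Basis 1, x, y for marker positions given as an (n, 2) array."""
    return np.column_stack([np.ones(len(points)), points])

def fit_gap_model(points, gaps):
    """Least-squares fit of both gap components; returns a (3, 2) coefficient matrix."""
    coef, *_ = np.linalg.lstsq(first_order_features(points), gaps, rcond=None)
    return coef

rng = np.random.default_rng(0)
markers = rng.uniform(-1.0, 1.0, size=(38, 2))   # hypothetical marker layout

angle, shift = 1e-3, np.array([0.02, -0.01])     # hypothetical misalignment
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle), np.cos(angle)]])
gaps = markers @ R.T + shift - markers           # gap of every marker

measured = rng.choice(38, size=20, replace=False)  # measure only 20 markers
coef = fit_gap_model(markers[measured], gaps[measured])
predicted = first_order_features(markers) @ coef   # predict the gaps of all 38
max_error = np.abs(predicted - gaps).max()
print(max_error)
```

With warp or other non-linear factors added to the gaps, the same fit would leave a residual, which is the agnostic situation the method is designed for.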

SLIDE 24

Experimental Results

Proposed method works the best!

  • 20 markers (out of 38) are chosen by experiment design methods.
  • Gaps of all markers are predicted.
  • Repeated for 220 different wafers.

Mean (standard deviation) of the gap prediction error:

Model     Pool / Agnostic (Proposed)   Pool / Non-agnostic (Fedorov 1972)   Passive (Random)   “Outer” heuristic
Order 1   2.27 (1.08)                  2.37 (1.15)                          2.32 (1.11)        2.36 (1.15)
Order 2   1.93 (0.89)                  1.96 (0.91)                          2.32 (1.15)        2.13 (1.08)

Red: significantly better by the Wilcoxon test at the 5% significance level.
Blue: worse than the baseline passive method.

Order 1 and Order 2 are the two alignment models compared.

SLIDE 25

Conclusions

We proposed a pool-based agnostic experiment design method for linear regression.

The proposed method is:

  • Robust against model misspecification,
  • Easy to implement.

The proposed method is promising in:

  • Extensive benchmark simulations,
  • Real-world wafer alignment task.

Come to our poster for technical details!