SLIDE 1

OED for KRR, and the MVCE

Gaussian Processes for Active Sensor Management

Alexander N. Dolia, University of Southampton

This poster is based on:

  • A.N. Dolia, C.J. Harris, J. Shawe-Taylor, D.M. Titterington, "Kernel Ellipsoidal Trimming", submitted to the Special Issue of the Journal Computational Statistics and Data Analysis on Machine Learning and Robust Data Mining, under review.
  • A.N. Dolia, T. De Bie, C.J. Harris, J. Shawe-Taylor, D.M. Titterington, "Optimal experimental design for kernel ridge regression, and the minimum volume covering ellipsoid", Workshop on Optimal Experimental Design, Southampton, 22-26 September 2006.

Joint work with:

  • Dr. Tijl De Bie, Katholieke Universiteit Leuven
  • Prof. John Shawe-Taylor, University of Southampton
  • Prof. Chris Harris, University of Southampton
  • Prof. Mike Titterington, University of Glasgow
SLIDE 2

Problem Statement

[Figure: sensor network with sensors at locations $x_1, x_2, x_3, \ldots$]

The aim is to estimate the sensor locations and the number of repetitions, given a set of possible sensor locations, the cost of measurements, and an upper bound on the number of repetitions at each sensor location, in order to obtain a good prediction $f(x)$.

  • Sensor network: $N$ sensors measure signals at positions $x_i$
  • Sensors measure the function $y_i = f(x_i) = x_i^\top w + n_i$
  • The weight vector $w$ gives information about the 'system'
  • Position the sensors optimally at $X_D$
  • Estimate $w$ based on $X_D$
SLIDE 3

Optimal experiment design?

[Figure: measurement model $y_i = x_i^\top w + n_i$]

Optimal experiment design (OED) idea:

  • Given a set of $n$ data points $X = \{x_i\}$
  • Choose a multiset $X_D = \{x_{D,i}\} \subseteq X$ with $N$ data points, $N_i$ times $x_i$
  • Measure at $x_{D,i} \rightarrow y_D = \{y_{D,i}\}$ with $y_{D,i} = x_{D,i}^\top w + n_i$
  • Estimate $w$ based on $\{X_D, y_D\} \rightarrow \hat{w}$
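The measurement model above can be simulated in a few lines. A minimal numpy sketch (all names, sizes, and noise levels are hypothetical): draw a weight vector $w$, generate noisy measurements $y_{D,i} = x_{D,i}^\top w + n_i$, and recover $w$ by least squares.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, sigma = 100, 3, 0.1
w_true = rng.standard_normal(d)                       # unknown 'system' weight vector w
XD = rng.standard_normal((n, d))                      # chosen measurement locations x_{D,i}
yD = XD @ w_true + sigma * rng.standard_normal(n)     # y_{D,i} = x_{D,i}' w + n_i

# Estimate w from {X_D, y_D} by least squares.
w_hat, *_ = np.linalg.lstsq(XD, yD, rcond=None)

assert np.linalg.norm(w_hat - w_true) < 0.1           # close to the true weights
```

OED then asks which rows of $X_D$ (and how many repetitions of each) make this estimate as accurate as possible.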

SLIDE 4

Optimal experiment design for RR

  • The result is thus a non-convex optimization problem:

$$\min_\alpha\; -\log\det\!\left(\sum_i \alpha_i x_i x_i^\top + \gamma I + \tfrac{1}{4}\gamma^2 \Big(\sum_i \alpha_i x_i x_i^\top\Big)^{-1}\right) \quad \text{s.t.}\;\; \alpha^\top e = 1,\; \alpha \ge 0$$

  • Minimize a tight upper bound:

$$\alpha^*_\gamma = \arg\min_\alpha\; -\log\det\!\left(\sum_i \alpha_i x_i x_i^\top + \gamma I\right) \quad \text{s.t.}\;\; \alpha^\top e = 1,\; \alpha \ge 0$$

  • This is a convex optimization problem again
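The convex relaxation can be attacked with a multiplicative reweighting of the kind studied by Titterington for D-optimal design. A minimal numpy sketch (not necessarily the authors' algorithm; all names and constants are hypothetical, and a general-purpose convex solver would do equally well): each candidate point's weight is rescaled by its score $x_i^\top M(\alpha)^{-1} x_i$, which keeps $\alpha$ on the simplex.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, gamma = 50, 3, 0.1
X = rng.standard_normal((n, d))        # candidate sensor locations x_i (rows)
alpha = np.full(n, 1.0 / n)            # start from the uniform design

def design_matrix(alpha):
    # M(alpha) = sum_i alpha_i x_i x_i' + gamma I
    return X.T @ (alpha[:, None] * X) + gamma * np.eye(d)

def objective(alpha):
    # -log det M(alpha), the quantity being minimized
    return -np.linalg.slogdet(design_matrix(alpha))[1]

obj0 = objective(alpha)
for _ in range(500):
    Minv = np.linalg.inv(design_matrix(alpha))
    s = np.einsum('ij,jk,ik->i', X, Minv, X)   # scores x_i' M^{-1} x_i
    alpha = alpha * s / (alpha @ s)            # renormalizes: alpha stays on the simplex

assert abs(alpha.sum() - 1.0) < 1e-8
assert objective(alpha) < obj0                 # the design improved over uniform
```

In line with the "sparse solution" point in the conclusions, most weights shrink towards zero, concentrating the measurement budget on a few informative locations.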
SLIDE 5

Regularized MVCE

  • What about the dual of the regularized D-OED?

$$\min_{M,\mu}\; \log\det(M) + \mu + \gamma\,\operatorname{trace}(M^{-1}) \quad \text{s.t.}\;\; x_i^\top M^{-1} x_i \le \mu$$

  • The optimum is given by

$$M^*_\gamma = \sum_i \alpha^*_{\gamma,i}\, x_i x_i^\top + \gamma I,$$

where $\alpha^*_\gamma$ is the solution of the regularized D-OED problem.

  • Interpretation: $\operatorname{trace}(M^{-1}) = \sum_i \tfrac{1}{\lambda_i}$ → fit an ellipsoid, but make sure none of the eigenvalues of $M^*_\gamma$ is too small...
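The eigenvalue interpretation is easy to check numerically. A small sketch (stand-in simplex weights replace the optimal $\alpha^*_\gamma$, which would require solving the D-OED problem): since $\sum_i \alpha_i x_i x_i^\top$ is positive semidefinite, adding $\gamma I$ floors every eigenvalue of $M^*_\gamma$ at $\gamma$, and $\operatorname{trace}(M^{-1})$ is exactly $\sum_i 1/\lambda_i$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, gamma = 30, 4, 0.2
X = rng.standard_normal((n, d))

# Stand-in simplex weights (on the poster these are the optimal alpha*_gamma).
alpha = rng.dirichlet(np.ones(n))

M = X.T @ (alpha[:, None] * X) + gamma * np.eye(d)   # M*_gamma
lam = np.linalg.eigvalsh(M)

# gamma floors every eigenvalue: no axis of the ellipsoid can collapse.
assert lam.min() >= gamma - 1e-8

# trace(M^{-1}) = sum_i 1/lambda_i, the quantity the regularizer penalizes.
assert np.isclose(np.trace(np.linalg.inv(M)), np.sum(1.0 / lam))
```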

SLIDE 6

Kernel ridge regression (KRR)

[Diagram: least squares → ridge regression → kernel RR]

  • Kernel ridge regression (KRR):

$$K_D = X_D X_D^\top, \qquad \beta = (K_D + \tilde{\gamma} I)^{-1} y$$

$$\hat{w}_{RR} = X_D^\top \beta = \sum_i \beta_i x_{D,i}, \qquad f(x) = x^\top \hat{w}_{RR} = \sum_i \beta_i\, x^\top x_{D,i} = \sum_i \beta_i\, k(x, x_{D,i})$$

  • Everything is expressed in terms of $K_D$ (i.e. in terms of inner products/kernels): the 'kernel trick'
  • If we want to do OED for KRR, we need to write it entirely in terms of kernel evaluations/inner products. Can we?
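The kernel-trick claim can be verified numerically for the linear kernel: the dual coefficients $\beta$ yield the same weight vector and predictions as primal ridge regression, while using only $K_D$ and kernel evaluations $k(x, x_{D,i}) = x^\top x_{D,i}$. A minimal sketch (names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
N, d, gam = 20, 5, 0.5
XD = rng.standard_normal((N, d))                  # design points x_{D,i}
y = rng.standard_normal(N)                        # measured responses

# Dual / kernel form: K_D = X_D X_D', beta = (K_D + gam I)^{-1} y
KD = XD @ XD.T
beta = np.linalg.solve(KD + gam * np.eye(N), y)
w_dual = XD.T @ beta                              # w_RR = sum_i beta_i x_{D,i}

# Primal ridge regression: w = (X_D' X_D + gam I)^{-1} X_D' y
w_primal = np.linalg.solve(XD.T @ XD + gam * np.eye(d), XD.T @ y)

assert np.allclose(w_dual, w_primal)

# Prediction uses only kernel evaluations k(x, x_{D,i}) = x' x_{D,i}
x = rng.standard_normal(d)
f_kernel = beta @ (XD @ x)
assert np.isclose(f_kernel, x @ w_primal)
```

The equivalence rests on the identity $(X_D^\top X_D + \gamma I)^{-1} X_D^\top = X_D^\top (X_D X_D^\top + \gamma I)^{-1}$.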

SLIDE 7

Kernel MVCE

  • Can the Mahalanobis distances $x^\top\big(\sum_i \alpha^*_{\gamma,i} x_i x_i^\top + \gamma I\big)^{-1} x$ be written in terms of inner products/kernel evaluations?
  • Let $AKA = V \Lambda V^\top$ (eigenvalue decomposition); then (derivation not shown...):

$$x^\top\Big(\sum_i \alpha^*_{\gamma,i} x_i x_i^\top + \gamma I\Big)^{-1} x \;=\; \frac{1}{\gamma}\Big(x^\top x \;-\; x^\top X^\top A\, V (\Lambda + \gamma I)^{-1} V^\top A\, X x\Big)$$

  • Express this in terms of $k(x,x) = x^\top x$ and $k = Xx$; then

$$x^\top\Big(\sum_i \alpha^*_{\gamma,i} x_i x_i^\top + \gamma I\Big)^{-1} x \;=\; \frac{1}{\gamma}\Big(k(x,x) \;-\; k^\top A\, V (\Lambda + \gamma I)^{-1} V^\top A\, k\Big),$$

completely expressed in terms of kernels.
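The kernelised distance can be checked against the direct input-space computation. A sketch under stated assumptions: it takes $A = \operatorname{diag}(a)$ with $\alpha_i = a_i^2$ (matching the kernel problem's constraint $a^\top a \le 1$ on the summary slide) and uses arbitrary weights in place of the optimal $\alpha^*_\gamma$; both sides agree exactly by the Woodbury identity.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, gamma = 15, 4, 0.3
X = rng.standard_normal((n, d))
a = rng.random(n)
a /= np.linalg.norm(a)                 # a' a = 1, so alpha_i = a_i^2 lies on the simplex
A = np.diag(a)

# Direct computation: x' (sum_i a_i^2 x_i x_i' + gamma I)^{-1} x
M = X.T @ ((a**2)[:, None] * X) + gamma * np.eye(d)
x = rng.standard_normal(d)
direct = x @ np.linalg.solve(M, x)

# Kernel computation: only k(x,x) = x'x and k = Xx are needed.
K = X @ X.T
lam, V = np.linalg.eigh(A @ K @ A)     # AKA = V Lambda V'
k = X @ x
z = V.T @ (A @ k)
kernelised = (x @ x - z @ (z / (lam + gamma))) / gamma

assert np.isclose(direct, kernelised)
```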


SLIDE 8

OED: summary

|             | D-OED | MVCE |
|-------------|-------|------|
| standard    | $\min_\alpha -\log\det\big(\sum_i \alpha_i x_i x_i^\top\big)$ s.t. $\alpha^\top \mathbf{1} = 1,\ \alpha \ge 0$ | $\min_{M,\mu} \log\det(M) + \mu$ s.t. $x_i^\top M^{-1} x_i \le \mu$ |
| regularized | $\min_\alpha -\log\det\big(\sum_i \alpha_i x_i x_i^\top + \gamma I\big)$ s.t. $\alpha^\top \mathbf{1} = 1,\ \alpha \ge 0$ | $\min_{M,\mu} \log\det(M) + \mu + \gamma\,\operatorname{trace}(M^{-1})$ s.t. $x_i^\top M^{-1} x_i \le \mu$ |
| kernel      | $\min_a -\log\det(AKA + \gamma I)$ s.t. $a^\top a \le 1,\ a \ge 0$ | |

SLIDE 9

Experiment

SLIDE 10

Generalised D-optimal Experimental Design

SLIDE 11

Conclusions

  • Two seemingly very different algorithms within one optimization framework
  • A way to perform optimal experimental design in high-dimensional spaces, such as kernel-induced feature spaces
  • A way to perform minimum volume covering ellipsoid estimation in high-dimensional spaces, to perform novelty detection
  • Nice features: convex optimisation and a sparse solution