CS475/CS675 Lecture 24 (July 21, 2016): Open problems

SLIDE 1

CS475/CS675 Lecture 24: July 21, 2016

Open problems

CS475/CS675 (c) 2016 P. Poupart 1

SLIDE 2

Two Open Problems

  • Kernel methods: how to solve linear systems of equations in less than cubic time

  • Markov decision processes: how to evaluate factored policies in less than exponential time

SLIDE 3

Kernel Methods

  • Class of non-parametric machine learning techniques whose complexity scales with the amount of data

  • Examples:

    – Gaussian processes
    – Support vector machines
    – Kernel logistic regression
    – Kernel principal component analysis
    – Kernel perceptron

SLIDE 4

Gaussian Process

  • Quick recall:

    – Non-parametric regression
    – Infinite-dimensional Gaussian

  • Picture: (figure not available)

SLIDE 5

Kernel

  • Covariance function is a kernel function: $k(x, x') = \phi(x)^\top \phi(x')$

  • Where $\phi$ is the feature function that defines the kernel

  • Popular kernels with infinitely many features:

    – Gaussian kernel: $k(x, x') = \exp\left(-\|x - x'\|^2 / (2\sigma^2)\right)$
    – Exponential kernel: $k(x, x') = \exp\left(-\theta \|x - x'\|\right)$
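
As a minimal sketch of the two kernels named on this slide, the Python below evaluates a Gaussian and an exponential kernel on vectors stored as plain lists and builds the resulting Gram matrix. The function names, the parameterizations (`sigma`, `theta`) and the sample points are assumptions chosen for illustration; they are not taken from the slides.

```python
import math

def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 sigma^2)); sigma is an assumed bandwidth parameter."""
    return math.exp(-sq_dist(x, y) / (2.0 * sigma ** 2))

def exponential_kernel(x, y, theta=1.0):
    """exp(-theta * ||x - y||); theta is an assumed rate parameter."""
    return math.exp(-theta * math.sqrt(sq_dist(x, y)))

def gram_matrix(points, kernel):
    """n x n matrix whose (i, j) entry is kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]  # illustrative data
K = gram_matrix(points, gaussian_kernel)
```

Building the matrix already costs one kernel evaluation per pair of points, so the Gram matrix alone takes quadratic time and memory in the number of points.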

SLIDE 6

Common problem

  • In all kernel methods, a linear system of equations must be solved: $(K + \lambda I)\,\alpha = b$

  • $K$ is an instantiation of the kernel function called the Gram matrix, i.e., $K_{ij} = k(x_i, x_j)$

  • $\lambda$ is a constant positive scalar
  • $b$ is a constant vector
  • $\alpha$ is the vector of unknowns
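
To make the cost concrete, here is a minimal sketch of a direct solve. It assumes the system has the regularized form (K + λI)α = b, where K is the Gram matrix, λ the positive scalar, b the constant vector and α the unknowns; these symbol names and the 2 x 2 example are assumptions for illustration. The sketch uses a Cholesky factorization, which is one standard direct method and takes cubic time in the size of K.

```python
import math

def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_regularized(K, lam, b):
    """Solve (K + lam * I) alpha = b; the factorization costs O(n^3)."""
    n = len(K)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(A)
    # Forward substitution: L z = b
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T alpha = z
    alpha = [0.0] * n
    for i in reversed(range(n)):
        alpha[i] = (z[i] - sum(L[k][i] * alpha[k] for k in range(i + 1, n))) / L[i][i]
    return alpha

K = [[1.0, 0.5], [0.5, 1.0]]  # illustrative 2 x 2 Gram matrix
alpha = solve_regularized(K, lam=0.1, b=[1.0, 2.0])
```

The three nested loops in `cholesky` are where the cubic cost comes from; the two substitution passes are only quadratic.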

SLIDE 7

Challenge

  • $K$ is an $n \times n$ matrix where $n$ is the number of data points in the dataset

  • Linear system takes $O(n^3)$ time to solve

  • This does not scale to large datasets, i.e., millions or billions of data points.

  • How can we reduce the time to $O(n^2)$ or less?
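
A rough back-of-the-envelope calculation shows why cubic time is a problem at this scale. The sketch below ignores constant factors and assumes a machine sustaining 10^12 floating-point operations per second and 8 bytes per matrix entry; both figures, and the function name `solve_cost`, are assumptions for illustration only.

```python
def solve_cost(n, flops_per_second=1e12):
    """Rough seconds for n^3 vs n^2 operations, plus bytes to store an n x n float64 matrix."""
    return {
        "cubic_seconds": n ** 3 / flops_per_second,
        "quadratic_seconds": n ** 2 / flops_per_second,
        "gram_bytes": 8 * n ** 2,
    }

cost = solve_cost(10 ** 6)  # one million data points
```

Under these assumptions, a million data points need about 10^6 seconds (roughly 11.6 days) for the cubic solve against about one second for a quadratic method, and the Gram matrix alone occupies 8 x 10^12 bytes.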

SLIDE 8

Properties

  • Gram matrix $K$ is

    – Symmetric
    – Positive semi-definite
    – We also know the feature function $\phi$ that is used to create $K$

  • Can you exploit those properties to reduce the solution complexity to $O(n^2)$ or less?
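
One well-known method that relies on symmetry and positive (semi-)definiteness is conjugate gradient, sketched below. It is offered as an illustration of how such properties can be used, not as an answer to the open problem: each iteration costs one dense matrix-vector product (quadratic time), but the number of iterations depends on the conditioning of the matrix and can reach n, so the worst case is still cubic. The sketch assumes the system is (K + λI)α = b with λ > 0, which makes the matrix positive definite; all names and numbers are illustrative.

```python
def matvec(A, x):
    """Dense matrix-vector product, O(n^2)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive definite matrix A."""
    n = len(b)
    max_iter = 2 * n if max_iter is None else max_iter  # margin for rounding error
    x = [0.0] * n
    r = list(b)          # residual b - A x for the starting point x = 0
    p = list(r)          # current search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        step = rs_old / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

lam = 0.1
K = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.5], [0.2, 0.5, 1.0]]  # illustrative Gram matrix
A = [[K[i][j] + (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
b = [1.0, 2.0, 3.0]
alpha = conjugate_gradient(A, b)
```

Because the method only touches the matrix through `matvec`, any structure that makes that product cheaper than quadratic time would carry over directly to the solve.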

SLIDE 9

Markov Decision Processes

  • Popular model in Operations Research and Artificial Intelligence for decision-theoretic planning

[Figure: agent-environment loop labeled State, Reward, Action, with states s0, s1, s2, rewards r0, r1, r2 and actions a0, a1, a2, …]

SLIDE 10

Markov Decision Processes

Formally:

  • Set of states $S$, set of actions $A$, discount $\gamma \in [0, 1)$
  • Transition function $T(s, a, s') = \Pr(s' \mid s, a)$
  • Reward function $R(s, a) \in \mathbb{R}$

[Figure: trajectory through states s0, …, s4 with actions a0, …, a3 and rewards r1, …, r4]

SLIDE 11

Policy

  • Policy $\pi$ (mapping from states to actions)

  • Let $n$ be the number of states

  • Transition matrix: $T^\pi$ ($n \times n$)

  • Reward vector: $R^\pi$ ($n \times 1$)
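
The sketch below shows how fixing a policy turns an MDP into a single transition matrix and reward vector. It assumes the usual reading that row s of the policy's transition matrix is Pr(s' | s, π(s)) and that entry s of the reward vector is R(s, π(s)). The two-state MDP, the action names and the helper `policy_matrices` are hypothetical and exist only for illustration.

```python
# Hypothetical 2-state, 2-action MDP used only for illustration.
# T[a][s][s2] = Pr(s2 | s, a); R[s][a] = immediate reward.
T = {
    "stay": [[0.9, 0.1], [0.2, 0.8]],
    "move": [[0.3, 0.7], [0.6, 0.4]],
}
R = [{"stay": 0.0, "move": 1.0}, {"stay": 2.0, "move": 0.0}]

policy = ["move", "stay"]  # pi(s): action chosen in state s

def policy_matrices(T, R, policy):
    """Return (T_pi, R_pi) with T_pi[s][s2] = Pr(s2 | s, pi(s)) and R_pi[s] = R(s, pi(s))."""
    T_pi = [list(T[policy[s]][s]) for s in range(len(policy))]
    R_pi = [R[s][policy[s]] for s in range(len(policy))]
    return T_pi, R_pi

T_pi, R_pi = policy_matrices(T, R, policy)
```

Here state 0 takes its row from the "move" matrix and state 1 takes its row from the "stay" matrix, so `T_pi` is an n x n row-stochastic matrix and `R_pi` has one entry per state.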

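SLIDE 12

Value Function

  • Value of a policy $\pi$ at state $s$:

$$V^\pi(s) = R(s, \pi(s)) + \gamma \sum_{s'} \Pr(s' \mid s, \pi(s))\, R(s', \pi(s')) + \gamma^2 \sum_{s'} \Pr(s' \mid s, \pi(s)) \sum_{s''} \Pr(s'' \mid s', \pi(s'))\, R(s'', \pi(s'')) + \dots$$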
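
The value of a policy is the expected discounted sum of rewards along the trajectory it induces. As a minimal sketch, the code below accumulates that sum term by term for a finite horizon, using a policy transition matrix `T_pi` (row s holds Pr(s' | s, π(s))) and reward vector `R_pi` (entry s holds R(s, π(s))). The function name, the two-state numbers and the discount 0.9 are assumptions for illustration.

```python
def truncated_value(T_pi, R_pi, gamma, horizon):
    """Sum of gamma^t * (T_pi^t R_pi) for t = 0 .. horizon - 1."""
    n = len(R_pi)
    value = [0.0] * n
    expected_reward = list(R_pi)  # T_pi^t R_pi, starting at t = 0
    discount = 1.0
    for _ in range(horizon):
        value = [v + discount * e for v, e in zip(value, expected_reward)]
        # Advance one step: expected reward one time step further ahead.
        expected_reward = [
            sum(T_pi[s][s2] * expected_reward[s2] for s2 in range(n))
            for s in range(n)
        ]
        discount *= gamma
    return value

T_pi = [[0.3, 0.7], [0.2, 0.8]]  # illustrative policy transition matrix
R_pi = [1.0, 2.0]                # illustrative policy reward vector
V2 = truncated_value(T_pi, R_pi, gamma=0.9, horizon=2)
V_long = truncated_value(T_pi, R_pi, gamma=0.9, horizon=500)
```

With a discount below 1 the terms shrink geometrically, so a long enough horizon approximates the infinite sum; each additional term costs one matrix-vector product.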

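SLIDE 13

Bellman’s Equation

  • Recursive formula: $V^\pi(s) = R(s, \pi(s)) + \gamma \sum_{s'} \Pr(s' \mid s, \pi(s))\, V^\pi(s')$

  • Matrix form: $V^\pi = R^\pi + \gamma T^\pi V^\pi$
  • Solution: system of linear equations $(I - \gamma T^\pi)\, V^\pi = R^\pi$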
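
Rearranging the matrix form V = R + γTV gives the linear system (I - γT)V = R for a fixed policy. The sketch below solves it with plain Gaussian elimination, which takes cubic time in the number of states. The helper names and the two-state example are assumptions for illustration, and the sketch assumes a discount strictly below 1 so that the system has a unique solution.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting; O(n^3) time."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def evaluate_policy(T_pi, R_pi, gamma):
    """Solve (I - gamma * T_pi) V = R_pi for the value vector V."""
    n = len(R_pi)
    A = [[(1.0 if i == j else 0.0) - gamma * T_pi[i][j] for j in range(n)]
         for i in range(n)]
    return solve_linear(A, R_pi)

T_pi = [[0.3, 0.7], [0.2, 0.8]]  # illustrative policy transition matrix
R_pi = [1.0, 2.0]                # illustrative policy reward vector
V = evaluate_policy(T_pi, R_pi, gamma=0.9)
```

The returned vector can be checked by substituting it back into the recursive formula: each entry should equal its immediate reward plus the discounted expected value of the next state.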

SLIDE 14

Problem

  • Let $n$ be the number of states

  • Transition matrix $T^\pi$ is $n \times n$

  • Time $O(n^3)$, which is prohibitive for large state spaces

SLIDE 15

Factored MDP

  • Let $m$ be the number of binary features

  • Each state corresponds to one joint assignment of the binary features

  • This yields $n = 2^m$ states

  • Time $O(n^3) = O(2^{3m})$, which is exponential in the number of features

  • Challenge: can we reduce the solution time to be polynomial in $m$?
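
A short sketch of the blow-up, writing m for the number of binary features (the symbol and the function name are assumptions for illustration): every joint assignment of the features is one state, so the state count doubles with each added feature.

```python
from itertools import product

def enumerate_states(m):
    """All 2^m joint assignments of m binary features, as tuples of 0/1."""
    return list(product((0, 1), repeat=m))

states = enumerate_states(3)                  # 8 states for 3 features
counts = {m: 2 ** m for m in (10, 20, 30)}    # state counts for larger m
```

With 30 features there are already more than a billion states, so even writing down the full transition matrix is out of reach, let alone solving a cubic-time linear system over it.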

SLIDE 16

Factored MDP

  • Factored transition matrix
  • Additive reward function
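
The formulas for this slide did not survive in the transcript, so the following is only a sketch of one simple special case, not necessarily the factorization the lecture intends. It assumes that, under a fixed policy, each binary feature evolves independently of the others, so the joint transition matrix is a Kronecker product of small per-feature matrices, and that the reward is a sum of per-feature terms. In a general factored MDP each feature may instead depend on a few parent features. All names and numbers below are hypothetical.

```python
from itertools import product

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

# Per-feature 2 x 2 transition matrices (illustrative numbers):
# P[i][x][x2] = Pr(feature i becomes x2 | it was x), under a fixed policy.
P = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.5, 0.5], [0.2, 0.8]],
    [[1.0, 0.0], [0.3, 0.7]],
]
# Additive reward: one small table per feature, indexed by the feature's value.
r = [[0.0, 1.0], [0.0, 2.0], [0.5, 0.0]]

m = len(P)
T_pi = [[1.0]]
for P_i in P:
    T_pi = kron(T_pi, P_i)  # grows to a 2^m x 2^m joint matrix

# product(...) orders states with the first feature most significant,
# which matches the row ordering produced by the Kronecker products above.
states = list(product((0, 1), repeat=m))
R_pi = [sum(r[i][x[i]] for i in range(m)) for x in states]

compact_size = sum(len(P_i) * len(P_i[0]) for P_i in P) + sum(len(r_i) for r_i in r)
explicit_size = len(T_pi) ** 2 + len(R_pi)
```

In this example the compact description needs 18 numbers while the explicit matrix and vector need 72, and the gap widens exponentially as features are added. The open question is whether policy evaluation can work with the compact form without ever expanding it.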

SLIDE 17

Properties

  • Factored MDP

    – Rows of $T^\pi$ sum to 1
    – Largest eigenvalue of $T^\pi$ is 1
    – $T^\pi$ is factored and $R^\pi$ is additive

  • Can you exploit those properties to reduce the time complexity to be polynomial in $m$?
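
The first two properties are linked, and together with a discount below 1 they guarantee that the policy evaluation system has a unique solution. The short derivation below is a sketch that writes $T^\pi$ for the policy's transition matrix, $R^\pi$ for its reward vector, $\mathbf{1}$ for the all-ones vector and $\lambda_i$ for the eigenvalues; it assumes $0 \le \gamma < 1$.

```latex
% Rows summing to 1 means the all-ones vector is an eigenvector with eigenvalue 1,
% and no eigenvalue can exceed the maximum absolute row sum.
T^\pi \mathbf{1} = \mathbf{1}
\;\Rightarrow\; 1 \text{ is an eigenvalue of } T^\pi,
\qquad
|\lambda_i(T^\pi)| \le \|T^\pi\|_\infty = 1 .

% With a discount below 1, the Neumann series converges.
0 \le \gamma < 1
\;\Rightarrow\;
|\gamma \lambda_i(T^\pi)| < 1
\;\Rightarrow\;
(I - \gamma T^\pi)^{-1} = \sum_{t=0}^{\infty} \gamma^t (T^\pi)^t
\;\Rightarrow\;
V^\pi = \sum_{t=0}^{\infty} \gamma^t (T^\pi)^t R^\pi .
```

The series form only needs repeated products with $T^\pi$, which is one place where a factored structure could in principle be exploited.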