
CS475/CS675 Lecture 24: July 21, 2016 Open problems



  1. CS475/CS675 Lecture 24: July 21, 2016 Open problems CS475/CS675 (c) 2016 P. Poupart 1

  2. Two Open Problems
     • Kernel methods: how to solve linear systems of equations in less than cubic time
     • Markov decision processes: how to evaluate factored policies in less than exponential time

  3. Kernel Methods
     • Class of non-parametric machine learning techniques that scale with the amount of data
     • Examples:
       – Gaussian processes
       – Support vector machines
       – Kernel logistic regression
       – Kernel principal component analysis
       – Kernel perceptron

  4. Gaussian Process
     • Quick recall:
       – Non-parametric regression
       – Infinite dimensional Gaussian
     • Picture: (figure not included)

  5. Kernel
     • Covariance function is a kernel function: $k(x, x') = \phi(x)^T \phi(x')$
     • Where $\phi$ is the feature function that defines the kernel
     • Popular kernels with infinitely many features:
       – Gaussian kernel: $k(x, x') = \exp(-\|x - x'\|^2 / (2\sigma^2))$
       – Exponential kernel: $k(x, x') = \exp(-\|x - x'\| / \sigma)$
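Evaluating a kernel on every pair of data points gives a matrix of covariances. The following is a minimal sketch, assuming the usual Gaussian-kernel parameterization with a bandwidth `sigma`; the names `gaussian_kernel`, `gram_matrix`, and the toy points are illustrative and do not come from the slides.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)) for equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def gram_matrix(points, kernel):
    """Return K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

# Hypothetical toy dataset of three 2-D points.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
K = gram_matrix(points, gaussian_kernel)
```

Building the matrix already costs one kernel evaluation per pair of points, so both time and memory grow quadratically with the number of data points.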

  6. Common problem
     • In all kernel methods, a linear system of equations must be solved: $(K + \lambda I)\,\alpha = b$
     • $K$ is an instantiation of the kernel function called the Gram matrix, i.e. $K_{i,j} = k(x_i, x_j)$
     • $\lambda$ is a constant positive scalar
     • $b$ is a constant vector
     • $\alpha$ is the vector of unknowns

  7. Challenge
     • $K$ is an $n \times n$ matrix where $n$ is the number of data points in the dataset
     • Linear system takes $O(n^3)$ time to solve
     • This does not scale to large datasets, i.e., millions or billions of data points.
     • How can we reduce the time to $O(n^2)$ or less?
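To see where the cubic cost comes from, here is a sketch of a direct solver based on Cholesky factorization, which applies when the system matrix is symmetric positive definite. The function names and the 3x3 system are hypothetical; in practice a tuned library routine would be used instead of pure Python.

```python
import math

def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # The three nested loops (i, j, k) are the source of the O(n^3) cost.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_spd(A, b):
    """Solve A x = b by Cholesky factorization plus two triangular solves."""
    n = len(A)
    L = cholesky(A)
    y = [0.0] * n
    for i in range(n):                      # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # back substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Hypothetical symmetric positive definite system standing in for the kernel system.
A = [[2.0, 0.5, 0.1],
     [0.5, 2.0, 0.3],
     [0.1, 0.3, 2.0]]
b = [1.0, 2.0, 3.0]
x = solve_spd(A, b)
```

The two triangular solves are only quadratic; the factorization dominates the running time.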

  8. Properties
     • Gram matrix $K$ is
       – Symmetric
       – Positive semi-definite
       – We also know the feature function $\phi$ that is used to create $K$
     • Can you exploit those properties to reduce the solution complexity to $O(n^2)$ or less?
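One standard way to use symmetry and positive definiteness is the conjugate gradient method, sketched below. It is a swapped-in illustration, not a resolution of the open problem: each iteration costs one dense matrix-vector product, but in exact arithmetic up to n iterations can be needed, which is cubic again in the worst case. The sketch assumes the system matrix has the form K + lam * I with lam > 0; `lam`, the data points, and the function name are hypothetical.

```python
import math

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Approximately solve A x = b for symmetric positive definite A.

    Returns (x, iterations). Each iteration costs one matrix-vector
    product, i.e. O(n^2) operations for a dense n x n matrix.
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # first search direction
    rs_old = sum(v * v for v in r)
    iterations = 0
    while iterations < max_iter and math.sqrt(rs_old) > tol:
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        step = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + step * p[i] for i in range(n)]
        r = [r[i] - step * Ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
        iterations += 1
    return x, iterations

# Hypothetical example: Gaussian-kernel Gram matrix on three 1-D points,
# shifted by an assumed positive scalar lam so that A = K + lam * I.
xs = [0.0, 1.0, 2.0]
lam = 0.1
K = [[math.exp(-0.5 * (u - v) ** 2) for v in xs] for u in xs]
A = [[K[i][j] + (lam if i == j else 0.0) for j in range(3)] for i in range(3)]
b = [1.0, 2.0, 3.0]
x, iterations = conjugate_gradient(A, b)
```

Whether few iterations suffice depends on how the eigenvalues of the matrix are distributed, which is why this does not by itself give a guaranteed quadratic bound.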

  9. Markov Decision Processes
     • Popular model in Operations Research and Artificial Intelligence for decision-theoretic planning
     [Figure: agent-environment loop labeled Agent, Environment, State, Action, Reward, with the sequence s0, a0, r0, s1, a1, r1, s2, a2, r2, …]

  10. Markov Decision Processes
      Formally:
      Set of states $S$, set of actions $A$, discount $\gamma \in [0, 1)$
      Transition function $T(s, a, s') = \Pr(s' \mid s, a)$
      Reward function $R(s, a) \in \mathbb{R}$
      [Figure: transition diagram over states s0 to s4 with actions a0 to a3 and rewards r1 to r4]

  11. Policy
      • Policy $\pi$ (mapping from states to actions)
      • Let $n$ be the number of states
      • Transition matrix: $T^\pi$ ($n \times n$)
      • Reward vector: $R^\pi$ ($n \times 1$)

  12. Value Function
      • Value $V^\pi(s_0)$ of a policy $\pi$ at state $s_0$:
        $V^\pi(s_0) = R(s_0, \pi(s_0)) + \gamma \sum_{s_1} \Pr(s_1 \mid s_0, \pi(s_0)) \, R(s_1, \pi(s_1)) + \gamma^2 \sum_{s_1} \Pr(s_1 \mid s_0, \pi(s_0)) \sum_{s_2} \Pr(s_2 \mid s_1, \pi(s_1)) \, R(s_2, \pi(s_2)) + \cdots$

  13. Bellman’s Equation
      • Recursive formula: $V^\pi(s) = R(s, \pi(s)) + \gamma \sum_{s'} \Pr(s' \mid s, \pi(s)) \, V^\pi(s')$
      • Matrix form: $V^\pi = R^\pi + \gamma T^\pi V^\pi$
      • Solution: system of linear equations $(I - \gamma T^\pi) V^\pi = R^\pi$
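The recursive formula can be applied repeatedly until the values stop changing, which is known as iterative policy evaluation. The sketch below uses that technique rather than a direct linear solve; the two-state transition matrix, reward vector, and discount are hypothetical numbers chosen only for illustration.

```python
def evaluate_policy(T, R, gamma, tol=1e-12, max_sweeps=100000):
    """Iterate V <- R + gamma * T V until the largest change is below tol.

    T[s][t] is the probability of moving from state s to state t under the
    fixed policy, R[s] is the reward in state s under that policy.
    """
    n = len(R)
    V = [0.0] * n
    for _ in range(max_sweeps):
        V_new = [R[s] + gamma * sum(T[s][t] * V[t] for t in range(n))
                 for s in range(n)]
        delta = max(abs(V_new[s] - V[s]) for s in range(n))
        V = V_new
        if delta < tol:
            break
    return V

# Hypothetical two-state chain under a fixed policy.
T = [[0.9, 0.1],
     [0.2, 0.8]]
R = [1.0, 0.0]
gamma = 0.9
V = evaluate_policy(T, R, gamma)
```

Each sweep costs a matrix-vector product, so it is quadratic in the number of states; the number of sweeps grows as the discount approaches 1.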

  14. Problem
      • Let $n$ be the number of states
      • Transition matrix $T^\pi$ is $n \times n$
      • Time $O(n^3)$, which is prohibitive for large state spaces

  15. Factored MDP
      • Let $m$ be the number of binary features
      • States correspond to all combinations of the binary features
      • This yields $2^m$ states
      • Time $O(2^{3m})$, which is exponential in the number of features
      • Challenge: can we reduce the solution to be polynomial in $m$?
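A short sketch of how quickly the state space grows when states are combinations of binary features. The function name and the chosen feature counts are illustrative only.

```python
from itertools import product

def enumerate_states(m):
    """Return every assignment of m binary features as a tuple of 0/1 values."""
    return list(product((0, 1), repeat=m))

states = enumerate_states(3)          # 2**3 = 8 states

# For each m: (number of states, number of entries in a dense transition matrix).
growth = {m: (2 ** m, (2 ** m) ** 2) for m in (10, 20, 30)}
```

With 30 binary features there are already more than a billion states, so even storing a dense transition matrix is out of reach, before any linear system is solved.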

  16. Factored MDP
      • Factored transition matrix: $T^\pi(s, s') = \prod_{i=1}^{m} \Pr(s'_i \mid s, \pi(s))$
      • Additive reward function: $R^\pi(s) = \sum_{i=1}^{m} R_i(s)$
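As an illustration of what a factored transition matrix buys in storage, the sketch below treats a special case: it assumes each binary feature evolves independently of the others, so the full matrix is the Kronecker product of small per-feature matrices. That independence assumption, the helper `kron`, and the numbers are hypothetical; the factorization on the slide may be more general.

```python
from functools import reduce

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

# Hypothetical 2x2 transition matrices, one per binary feature (rows sum to 1).
factors = [
    [[0.9, 0.1], [0.3, 0.7]],
    [[0.5, 0.5], [0.2, 0.8]],
    [[1.0, 0.0], [0.4, 0.6]],
]

T = reduce(kron, factors)                        # explicit 8 x 8 matrix
row_sums = [sum(row) for row in T]               # each should be 1

factored_size = sum(len(F) * len(F[0]) for F in factors)   # 3 * 4 numbers
explicit_size = len(T) * len(T[0])                          # 8 * 8 numbers
```

The factored description grows linearly with the number of features while the explicit matrix grows exponentially; the open question is whether policy evaluation can work with the factors alone.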

  17. Properties
      • Factored MDP
        – Rows of $T^\pi$ sum to 1
        – Largest eigenvalue of $T^\pi$ is 1
        – $T^\pi$ is factored and $R^\pi$ is additive
      • Can you exploit those properties to reduce the time complexity to be polynomial in $m$?
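The first two properties are linked, and together with a discount below 1 they guarantee that the policy-evaluation system has a unique solution. A short derivation, writing $T^\pi$ for the policy's transition matrix, $R^\pi$ for its reward vector, $\gamma < 1$ for the discount, and $\mathbf{1}$ for the all-ones vector (this notation is assumed here):

```latex
\begin{align*}
% Rows summing to 1 means the all-ones vector is a right eigenvector with eigenvalue 1.
T^\pi \mathbf{1} &= \mathbf{1} \\
% No eigenvalue \mu can be larger in modulus than the maximum absolute row sum.
|\mu| &\le \|T^\pi\|_\infty = 1 \\
% Hence every eigenvalue of \gamma T^\pi has modulus at most \gamma < 1,
% so I - \gamma T^\pi is invertible and the Neumann series converges.
(I - \gamma T^\pi)^{-1} &= \sum_{k=0}^{\infty} \gamma^k (T^\pi)^k \\
V^\pi &= \sum_{k=0}^{\infty} \gamma^k (T^\pi)^k R^\pi
\end{align*}
```

The last line is the expected discounted sum of rewards written in matrix form; it shows the solution exists but does not by itself avoid the exponential size of $T^\pi$.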
