A subexponential lower bound for Zadeh's pivoting rule for solving linear programs and games

Games and Policy Iteration: Policies and corresponding values
A policy π is a choice of an action from each state.
The value val_π(i) of a state i ∈ S for a policy π is the expected sum of rewards obtained when moving according to π, starting from i.
An action is an improving switch w.r.t. π if it improves the values. It suffices to check whether an action is improving for one step w.r.t. the current values.
A policy π* is optimal iff there are no improving switches. Optimal policies simultaneously maximize the values of all states.
[Figure: example MDP with node t, annotated with rewards and the current values.]
Oliver Friedmann (LMU) Zadeh Lower Bound 9
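The notions on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the example drawn on the slide: the two-state MDP `MDP`, the sink name `"t"`, and the helper names `evaluate` and `improving_switches` are all hypothetical. Values are computed exactly with fractions, and an action counts as an improving switch when its one-step look-ahead exceeds the current value of its state.

```python
from fractions import Fraction as F

# Hypothetical toy MDP (not the one drawn on the slide). Each state maps to a
# list of actions (reward, {successor: probability}); "t" is a sink with value 0.
MDP = {
    "a": [(F(1), {"t": F(1)}), (F(0), {"b": F(1)})],
    "b": [(F(-1), {"t": F(1)}), (F(3), {"a": F(1, 2), "t": F(1, 2)})],
}

def evaluate(policy):
    """Return val_pi by solving val(i) = r + sum_j p_j * val(j) exactly."""
    states = sorted(MDP)
    n = len(states)
    col = {s: k for k, s in enumerate(states)}
    # Augmented matrix of (I - P_pi) val = r_pi; the sink contributes nothing.
    M = [[F(0)] * (n + 1) for _ in states]
    for s in states:
        reward, succ = MDP[s][policy[s]]
        M[col[s]][col[s]] += 1
        M[col[s]][n] = reward
        for j, p in succ.items():
            if j != "t":
                M[col[s]][col[j]] -= p
    for c in range(n):  # Gauss-Jordan elimination
        piv = next(k for k in range(c, n) if M[k][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [x / d for x in M[c]]
        for k in range(n):
            if k != c:
                f = M[k][c]
                M[k] = [x - f * y for x, y in zip(M[k], M[c])]
    val = {s: M[col[s]][n] for s in states}
    val["t"] = F(0)
    return val

def improving_switches(policy):
    """Actions whose one-step look-ahead beats the current value of their state."""
    val = evaluate(policy)
    return [(s, k)
            for s, actions in MDP.items()
            for k, (reward, succ) in enumerate(actions)
            if reward + sum(p * val[j] for j, p in succ.items()) > val[s]]

# Policy iteration with single switches: stop when no improving switch remains.
policy = {"a": 0, "b": 0}
while improving_switches(policy):
    state, action = improving_switches(policy)[0]
    policy[state] = action
print(policy, evaluate(policy))
```

Which improving switch to apply when several exist is exactly what a pivoting rule decides; the loop above simply takes the first one found.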

Games and Policy Iteration: MDPs and linear programming
No improving switches for an optimal policy $\pi^*$:
$$\forall i \in S:\quad \mathrm{val}_{\pi^*}(i) = \max_{a \in A_i} \Big( r_a + \sum_{j \in S} p_{a,j}\, \mathrm{val}_{\pi^*}(j) \Big)$$
where $A_i$ is the set of actions from state $i$, $r_a$ is the expected reward of using action $a$, and $p_{a,j}$ is the probability of moving to state $j$ when using action $a$.
This can be used to formulate an LP for solving the MDP:
$$\text{minimize } \sum_{i \in S} v_i \quad \text{s.t.} \quad \forall i \in S\ \forall a \in A_i:\ v_i \ge r_a + \sum_{j \in S} p_{a,j}\, v_j$$
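To make the LP constraints concrete, the following sketch checks candidate value vectors against them on a small made-up MDP. The action list `ACTIONS` and the helper names `slack`, `feasible` and `tight_actions` are assumptions for illustration; transitions into the sink are simply omitted because the sink has no variable.

```python
from fractions import Fraction as F

# Hypothetical MDP: (state, reward r_a, {successor j: p_{a,j}}).
# Probability mass not listed goes to a sink that has value 0 and no variable.
ACTIONS = [
    ("a", F(1), {}),
    ("a", F(0), {"b": F(1)}),
    ("b", F(-1), {}),
    ("b", F(3), {"a": F(1, 2)}),
]

def slack(v, action):
    """v_i - (r_a + sum_j p_{a,j} v_j); the LP requires this to be >= 0."""
    i, r, succ = action
    return v[i] - (r + sum(p * v[j] for j, p in succ.items()))

def feasible(v):
    return all(slack(v, a) >= 0 for a in ACTIONS)

def tight_actions(v):
    """Indices of actions whose constraint holds with equality."""
    return [k for k, a in enumerate(ACTIONS) if slack(v, a) == 0]

optimal = {"a": F(6), "b": F(6)}
print(feasible(optimal), tight_actions(optimal), sum(optimal.values()))
print(feasible({"a": F(7), "b": F(7)}))   # feasible, but larger objective
print(feasible({"a": F(5), "b": F(6)}))   # violates the constraint of action 1
```

In this toy instance the tight constraints at the feasible point with objective 12 are those of actions 1 and 3, which together form a policy with no improving switch.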

Games and Policy Iteration: Primal and dual LPs for MDPs
$$\text{minimize } \sum_{i \in S} v_i \quad \text{s.t.} \quad \forall i \in S\ \forall a \in A_i:\ v_i \ge r_a + \sum_{j \in S} p_{a,j}\, v_j$$
$$\text{maximize } \sum_{i \in S} \sum_{a \in A_i} r_a x_a \quad \text{s.t.} \quad \forall i \in S:\ \sum_{a \in A_i} x_a = 1 + \sum_{j \in S} \sum_{a \in A_j} p_{a,i}\, x_a$$
Flow conservation (example from the figure): $x_1 + x_2 = 1 + x_3 + x_4$, satisfied for instance by $x_1 = 1$, $x_2 = 6$, $x_3 = 4$, $x_4 = 2$, and also by $x_1 = 7$, $x_2 = 0$ with $x_3 = 4$, $x_4 = 2$ unchanged.
Every basic feasible solution corresponds to a policy π.

Games and Policy Iteration: Variables of the primal LP
$x_a$ is the expected number of times action $a$ is used, summed over all starting states.
We have:
$$\sum_{i \in S} \mathrm{val}_\pi(i) = \sum_{a \in \pi} r_a\, x_a^\pi$$
[Figure: example MDP with $x_1 = 0$, $x_2 = 1$, $x_3 = 0$, $x_4 = 2$, $x_5 = 0$, $x_6 = 1$.]
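The identity between summed values and reward-weighted flows can be checked numerically. The sketch below assumes a hypothetical two-state policy (rewards `r`, transition matrix `P`, with the missing probability mass leading to a sink); `solve` is a small exact Gauss-Jordan helper written for this example.

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = next(k for k in range(c, n) if M[k][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [x / d for x in M[c]]
        for k in range(n):
            if k != c:
                f = M[k][c]
                M[k] = [x - f * y for x, y in zip(M[k], M[c])]
    return [M[k][n] for k in range(n)]

# Hypothetical fixed policy on two states (index 0 and 1).
# State 0: reward 0, moves to state 1. State 1: reward 3, moves to state 0
# with probability 1/2 and to the sink otherwise.
r = [F(0), F(3)]
P = [[F(0), F(1)],
     [F(1, 2), F(0)]]          # P[i][j] = probability of moving from i to j
n = len(r)
I = [[F(int(i == j)) for j in range(n)] for i in range(n)]

# Values: (I - P) val = r
val = solve([[I[i][j] - P[i][j] for j in range(n)] for i in range(n)], r)
# Flows: x_i = 1 + sum_j P[j][i] * x_j, i.e. (I - P^T) x = 1
x = solve([[I[i][j] - P[j][i] for j in range(n)] for i in range(n)], [F(1)] * n)

print(val, x)
print(sum(val), sum(ri * xi for ri, xi in zip(r, x)))   # both sides of the identity
```

The flow equations are the transposed version of the value equations, which is why the same solver handles both.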

Games and Policy Iteration: From MDP to LP
Basis $\{x_2, x_4, x_6\}$, with $x_1 = 0$, $x_2 = 1$, $x_3 = 0$, $x_4 = 2$, $x_5 = 0$, $x_6 = 1$:
$$\begin{aligned} \max\quad & -1 + 2x_1 - 2x_3 - x_5 \\ \text{s.t.}\quad & x_2 = 1 - \tfrac{1}{3}x_1 + \tfrac{2}{3}x_3 + \tfrac{2}{3}x_5 \\ & x_4 = 2 - x_3 - x_5 \\ & x_6 = 1 - x_5 \\ & x_1, x_2, x_3, x_4, x_5, x_6 \ge 0 \end{aligned}$$
Basis $\{x_1, x_4, x_6\}$, with $x_1 = 3$, $x_2 = 0$, $x_3 = 0$, $x_4 = 2$, $x_5 = 0$, $x_6 = 1$:
$$\begin{aligned} \max\quad & 5 - 6x_2 + 2x_3 + 3x_5 \\ \text{s.t.}\quad & x_1 = 3 - 3x_2 + 2x_3 + 2x_5 \\ & x_4 = 2 - x_3 - x_5 \\ & x_6 = 1 - x_5 \\ & x_1, x_2, x_3, x_4, x_5, x_6 \ge 0 \end{aligned}$$
Basis $\{x_1, x_3, x_6\}$, with $x_1 = 7$, $x_2 = 0$, $x_3 = 2$, $x_4 = 0$, $x_5 = 0$, $x_6 = 1$:
$$\begin{aligned} \max\quad & 9 - 6x_2 - 2x_4 + x_5 \\ \text{s.t.}\quad & x_1 = 7 - 3x_2 - 2x_4 \\ & x_3 = 2 - x_4 - x_5 \\ & x_6 = 1 - x_5 \\ & x_1, x_2, x_3, x_4, x_5, x_6 \ge 0 \end{aligned}$$
Basis $\{x_1, x_3, x_5\}$, with $x_1 = 7$, $x_2 = 0$, $x_3 = 1$, $x_4 = 0$, $x_5 = 1$, $x_6 = 0$:
$$\begin{aligned} \max\quad & 10 - 6x_2 - 2x_4 - x_6 \\ \text{s.t.}\quad & x_1 = 7 - 3x_2 - 2x_4 \\ & x_3 = 1 - x_4 + x_6 \\ & x_5 = 1 - x_6 \\ & x_1, x_2, x_3, x_4, x_5, x_6 \ge 0 \end{aligned}$$
[Figure: the example MDP and the corresponding vertex of the LP polytope for each basis.]
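The sequence of dictionaries on this slide can be reproduced mechanically. The sketch below stores each equation as a mapping from nonbasic variables (plus the key `"const"`) to exact coefficients and performs the three pivots in the order the slide shows them. The representation and the function name `pivot` are choices made for this illustration; the entering and leaving variables are passed in by hand rather than chosen by a pivoting rule.

```python
from fractions import Fraction as F

def pivot(obj, rows, enter, leave):
    """Swap nonbasic `enter` with basic `leave`; return the new dictionary.

    Every expression maps variable names (and "const") to coefficients,
    meaning: basic_var = const + sum(coef * nonbasic_var).
    """
    row = dict(rows[leave])
    c = row.pop(enter)
    # Solve the leaving row for the entering variable.
    new_row = {k: -v / c for k, v in row.items()}
    new_row[leave] = F(1) / c

    def substitute(expr):
        expr = dict(expr)
        coef = expr.pop(enter, F(0))
        for k, v in new_row.items():
            expr[k] = expr.get(k, F(0)) + coef * v
        return {k: v for k, v in expr.items() if v != 0 or k == "const"}

    new_rows = {b: substitute(e) for b, e in rows.items() if b != leave}
    new_rows[enter] = new_row
    return substitute(obj), new_rows

# First dictionary of the slide, basis {x2, x4, x6}.
obj = {"const": F(-1), "x1": F(2), "x3": F(-2), "x5": F(-1)}
rows = {
    "x2": {"const": F(1), "x1": F(-1, 3), "x3": F(2, 3), "x5": F(2, 3)},
    "x4": {"const": F(2), "x3": F(-1), "x5": F(-1)},
    "x6": {"const": F(1), "x5": F(-1)},
}

for enter, leave in [("x1", "x2"), ("x3", "x4"), ("x5", "x6")]:
    obj, rows = pivot(obj, rows, enter, leave)
    print(sorted(rows), "objective value", obj["const"])
```

In the final dictionary every objective coefficient is negative, so no entering variable can improve the objective further.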

Games and Policy Iteration: Diameter
Question: Is it theoretically possible to have polynomially many iterations?
Let G be a Markov decision process and n be the number of nodes.
Definition: The diameter of G is the least number of iterations required to solve G.
Small Diameter Theorem: The diameter of G is less than or equal to n.

Lower Bound for Zadeh's Rule

Lower Bound for Zadeh's Rule: Lower bound construction
We define a family of lower bound MDPs $G_n$ such that the Least-Entered pivoting rule will simulate an n-bit binary counter.
We make use of exponentially growing rewards (and penalties): to get a higher reward, the MDP is willing to sacrifice everything that has been built up so far.
Notation: Integer priority $p$ corresponds to reward $(-N)^p$, where $N = 7n + 1$. The priorities are therefore ordered by reward as
$$\dots < 5 < 3 < 1 < 2 < 4 < 6 < \dots$$
[Figure: a node labelled 5 stands for reward $(-N)^5$.]
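The ordering of priorities by reward follows directly from the sign and magnitude of $(-N)^p$. A small check, assuming an arbitrary counter size `n = 2` chosen only for illustration (the function name `reward` is likewise hypothetical):

```python
def reward(priority, n):
    """Reward (-N)^p attached to an integer priority p, with N = 7n + 1."""
    N = 7 * n + 1
    return (-N) ** priority

n = 2  # hypothetical counter size, so N = 15
ordered = sorted(range(1, 7), key=lambda p: reward(p, n))
print(ordered)

# A higher even priority outweighs the penalties of all lower odd priorities.
print(reward(4, n) + reward(3, n) + reward(1, n) > 0)
```

Odd priorities give penalties that grow with the priority, even priorities give rewards that grow with the priority, and each level dominates the sum of everything below it.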

Lower Bound for Zadeh's Rule: Background
The use of priorities is inspired by parity games.
Friedmann (2009): The strategy iteration algorithm may require exponentially many iterations to solve parity games.
Fearnley (2010): The strategy iteration algorithm may require exponentially many iterations to solve MDPs.
We also first proved a lower bound for parity games and then transferred the result to MDPs and linear programs.

Lower Bound for Zadeh's Rule: Related game-theoretic settings
[Diagram: LP-type problems, with the labels "Abstract" and "Concrete", relating the problem classes below.]
In NP ∩ coNP: Turn-based stochastic games (2½ players), Mean payoff games (2 players), Parity games (2 players).
In P: Linear programming, Markov decision problems (1½ players), Deterministic MDPs (1 player).

Lower Bound for Zadeh's Rule: Zadeh's pivoting rule
Zadeh's Least-Entered rule: Perform the single improving switch that has been applied least often.
[Image omitted; taken from David Avis' paper.]

Lower Bound for Zadeh's Rule: Tie-Breaking Rule
Tie-breaking rule = method of selecting a switch in case of a tie (w.r.t. the occurrence record).
The proof of the Small Diameter Theorem implies:
Corollary: There is a tie-breaking rule such that Zadeh's rule requires only linearly many iterations in the worst case.
Consequence: The lower bound construction is equipped with a particular tie-breaking rule.
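The Least-Entered rule with an explicit tie-breaking rule can be sketched as a selection function over an occurrence record. The names `least_entered`, `occurrence` and the switch labels `"e1"`, `"e2"`, `"e3"` are hypothetical, and the default tie-break (`min`) is only a placeholder, not the particular tie-breaking rule used in the lower bound construction.

```python
from collections import Counter

def least_entered(improving, occurrence, tie_break=min):
    """Pick one improving switch that has been applied least often.

    `occurrence` is the occurrence record (switch -> number of applications);
    `tie_break` chooses among the switches tied for the minimum.
    """
    fewest = min(occurrence[s] for s in improving)
    tied = [s for s in improving if occurrence[s] == fewest]
    return tie_break(tied)

occurrence = Counter()                  # nothing has been applied yet
improving = ["e1", "e2", "e3"]          # hypothetical improving switches

first = least_entered(improving, occurrence)
occurrence[first] += 1                  # record that the switch was applied
second = least_entered(improving, occurrence)
print(first, second)
print(least_entered(improving, occurrence, tie_break=max))
```

Swapping the `tie_break` argument changes which run the algorithm follows, which is why a lower bound for the rule has to fix one.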

Lower Bound for Zadeh's Rule: Binary Counting
[Animation: a 4-bit counter shown together with its occurrence record. Captions of the successive frames:]
Principle: If a bit can be set, then all bits can be set.
Tie-Breaking: We decide to set the first bit.
Set the second bit and reset the first bit.
Set the first bit again.
Continue...
Problem: Occurrence record unbalanced! (The final frame shows the counts 1 2 4 8.)
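Why plain binary counting unbalances the occurrence record can be seen by simulating a counter and recording how often each bit is set. This is a sketch of ordinary binary counting only, not of the MDP construction; the function name `count_with_record` is hypothetical.

```python
def count_with_record(n_bits):
    """Count from 0...0 to 1...1 and return how often each bit was set.

    Index 0 is the first (least significant) bit.
    """
    bits = [0] * n_bits
    record = [0] * n_bits
    while not all(bits):
        k = bits.index(0)       # lowest unset bit
        bits[k] = 1             # set it ...
        record[k] += 1
        for j in range(k):      # ... and reset all lower bits
            bits[j] = 0
    return record

record = count_with_record(4)
print(list(reversed(record)))   # most significant bit first
```

The first bit is set twice as often as the second, which is set twice as often as the third, and so on, so a rule that always prefers the least-used switch cannot follow this schedule unchanged.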

Lower Bound for Zadeh's Rule: Binary Counting (... again!)
[Animation captions, in order:]
Let's do it again and watch the occurrence record this time!
Everything okay so far...
Problem: We have to set one of the higher bits now!
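The problem named in the last caption can be reproduced in a toy model. The sketch assumes, following the principle that if a bit can be set then all bits can be set, that every unset bit is an improving switch; the state and occurrence record below are the ones reached after setting the first bit and then setting the second bit while resetting the first. The function name `zadeh_candidates` is hypothetical.

```python
def zadeh_candidates(bits, record):
    """Unset bits with the minimal occurrence count (index 0 = first bit).

    Toy assumption: every unset bit is an improving switch.
    """
    unset = [k for k, b in enumerate(bits) if b == 0]
    fewest = min(record[k] for k in unset)
    return [k for k in unset if record[k] == fewest]

bits = [0, 1, 0, 0]      # first bit reset, second bit set
record = [1, 1, 0, 0]    # first and second bit have each been set once
print(zadeh_candidates(bits, record))
```

A binary counter would now set the first bit again, but under this assumption the least-entered candidates are only the two higher bits.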

Lower Bound for Zadeh's Rule: Binary Counting with conjunctive bits
[Animation captions, in order:]
Replace each gadget by a two-bit, conjunctive structure.
A gadget is set iff both edges are going in.
Set one improving edge of every gadget.
Set the other improving edge of the first gadget.
The other gadgets have updated to their old setting.
Set one improving edge of every gadget again.
Set the other improving edge of the second gadget.
Reset all other gadgets.
Everything okay so far, continue...

A subexponential lower bound for Zadeh's pivoting rule for solving linear programs and games
Oliver Friedmann, Department of Computer Science, Ludwig-Maximilians-Universität Munich, Germany.
