CMU-Q 15-381
Lecture 8: Optimization I: Optimization for CSP Local Search
Teacher: Gianni A. Di Caro
LOCAL SEARCH FOR CSP
§ Real-life CSPs can be very large and hard to solve …
§ Methods so far: construct a solution by assigning one variable at a time; if an assignment fails because of a constraint violation, backtrack, and keep going until all variables have been assigned feasible values
§ At any point of the construction process we have one partial solution (partial assignment of values to variables)
§ The states of the process are partial states of the problem
Local search methods: work with complete states (i.e., all variables assigned; the assignment can be unfeasible)
1. Start with some unfeasible assignment (i.e., featuring n constraint violations)
2. LS operators reassign variable values (one or more at each search step)
   A. Variable selection: randomly choose a variable in conflict (i.e., involved in constraint violations)
   B. Value selection (min-conflicts heuristic): choose a value such that the new CSP assignment violates the fewest constraints
3. Iterate 2 (A-B) until a feasible solution is found, or only a few constraint violations survive, or … (see the sketch below)
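A minimal Python sketch of this min-conflicts loop, assuming a csp object with hypothetical members variables, domains, and n_violations(var, val, assignment) supplied by the caller:

    import random

    def min_conflicts(csp, assignment, max_steps=10_000):
        # Local search over complete assignments: repeatedly repair
        # one conflicted variable with its least-conflicting value.
        for _ in range(max_steps):
            conflicted = [v for v in csp.variables
                          if csp.n_violations(v, assignment[v], assignment) > 0]
            if not conflicted:
                return assignment                  # feasible: no violations left
            var = random.choice(conflicted)        # (A) random conflicted variable
            assignment[var] = min(                 # (B) min-conflicts value choice
                csp.domains[var],
                key=lambda val: csp.n_violations(var, val, assignment))
        return None                                # too many steps: give up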
[Figure: coloring example, neighbor states of the current assignment (one color change).]
Local search algorithms at each step consider a single “current” state and try to improve it by moving to one of its neighbors ➔ Iterative improvement algorithms
§ Pros and cons: they use very little memory and can find reasonable solutions in large or even infinite (continuous) state spaces, but they are incomplete and can get stuck in local optima
§ Move in the direction of strictly increasing value (up the hill)
§ Steepest ascent / Steepest descent
§ Terminate when no neighbor has a higher value
§ Greedy (myopic) local search
§ We necessarily end in a local optimum or on a plateau
§ Which optimum we reach depends on the starting point (a sketch follows below)
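A minimal sketch of steepest-ascent hill climbing, assuming caller-supplied neighbors() and value() functions (both names are illustrative):

    def hill_climbing(state, neighbors, value):
        # Steepest ascent: jump to the best neighbor until none improves.
        while True:
            succ = neighbors(state)
            if not succ:
                return state
            best = max(succ, key=value)
            if value(best) <= value(state):
                return state          # local optimum or plateau: stop
            state = best              # strictly uphill, greedy move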
Like climbing Everest in thick fog with amnesia
[Figure: 8-queens state with 17 conflicts; each square shows the #conflicts obtained by moving that column’s queen there, with best moves in red.]
Local optimum: a state that has only neighbors with a larger #conflicts
§ Hill-climbing can solve large instances of n-queens (n = 10^6) in a few seconds
§ 8-queens statistics: starting from a random initial state, steepest-ascent hill climbing solves 14% of problem instances; the other 86% of the time it gets stuck
§ It is fast: about 4 steps on average when it succeeds, 3 when it gets stuck
Plateaux, local optima
[Figure: state-space landscape: objective function over the state space, showing the global maximum, a shoulder, a “flat” local maximum, the current state, and its neighborhood.]
§ Sideways moves: if there are no uphill moves, allow moving to a state with the same value as the current one (escape shoulders)
§ Limit the number of consecutive sideways moves (M): with M = 100, 94% of 8-queens instances are solved! 21 steps avg. on success, 64 steps avg. on “failure”
§ Sideways moves: if no uphill moves, allow moving to a state with the same value as the current one (escape shoulders)
§ Stochastic hill-climbing: selection among the available uphill moves is done randomly (uniform, proportional, soft-max, ε-greedy, …) to be “less” greedy
§ First-choice hill-climbing: successors are generated randomly, one at a time, until one better than the current state is found (deals with large neighborhoods)
§ Random-restart hill climbing: probabilistically complete (how do we select the next restart configuration? see the sketch below)
In general, these variants apply to all Local Search algorithms
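As an illustration of the random-restart variant, a small wrapper around any local search routine; random_state and is_goal are hypothetical caller-supplied helpers:

    def random_restart(search, random_state, is_goal, max_restarts=100):
        # Rerun local search from fresh random configurations; this is
        # probabilistically complete if each restart can succeed with p > 0.
        for _ in range(max_restarts):
            result = search(random_state())   # next restart configuration
            if result is not None and is_goal(result):
                return result
        return None                           # no goal within the budget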
§ Diagonal ridges: from each local maximum all the available actions point downhill, yet there is an uphill path! The search zig-zags, giving a very long ascent time
§ Gradient ascent doesn’t have this issue: all state-vector components are (potentially) changed when moving to a successor state, so climbing can follow the direction of the ridge (see the sketch below)
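For contrast, a tiny gradient-ascent sketch in which every coordinate is updated simultaneously; the objective and step size below are made up for illustration:

    import numpy as np

    def gradient_ascent(grad_f, x, step=0.01, iters=2000):
        # All coordinates move together, so the path can track a ridge.
        for _ in range(iters):
            x = x + step * grad_f(x)   # simultaneous update of all components
        return x

    # Illustrative objective f(x, y) = -(x - 1)^2 - 10 * (y - 2)^2
    grad = lambda v: np.array([-2.0 * (v[0] - 1.0), -20.0 * (v[1] - 2.0)])
    print(gradient_ascent(grad, np.array([0.0, 0.0])))  # converges near [1, 2]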
§ The min-conflicts heuristic h chooses a value such that the new CSP assignment violates the fewest constraints
§ Given a random initial state, it can solve n-queens in almost constant time for very large n (see the sketch below)
§ The same appears to be true for any randomly-generated CSP, except in a narrow range of the ratio R = (number of constraints) / (number of variables)
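A concrete min-conflicts sketch for n-queens (one queen per column; state[c] is the row of the queen in column c); the helper names are illustrative, and ties could also be broken randomly:

    import random

    def conflicts(state, col, row):
        # Number of queens in other columns attacking square (col, row).
        return sum(1 for c in range(len(state)) if c != col and
                   (state[c] == row or abs(state[c] - row) == abs(c - col)))

    def min_conflicts_queens(n, max_steps=100_000):
        state = [random.randrange(n) for _ in range(n)]   # random complete state
        for _ in range(max_steps):
            bad = [c for c in range(n) if conflicts(state, c, state[c]) > 0]
            if not bad:
                return state                              # no attacks: solved
            col = random.choice(bad)                      # conflicted variable
            state[col] = min(range(n),                    # least-conflicting row
                             key=lambda r: conflicts(state, col, r))
        return None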
§ Binary literals (true / false)
§ Clause: disjunction of literals
§ Conjunctive Normal Form (CNF) for a logical formula: conjunction of clauses, e.g. (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ x4)
§ 3-SAT: all clauses have exactly 3 literals
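In code, one common representation (an assumption, not from the slides) encodes each clause as a tuple of nonzero integers, k for x_k and -k for ¬x_k:

    # Formula: (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2 OR x4)
    formula = [(1, -2, 3), (-1, 2, 4)]

    def satisfied(formula, assignment):
        # assignment maps variable index -> bool; a CNF formula holds
        # iff every clause contains at least one true literal.
        return all(any(assignment[abs(l)] == (l > 0) for l in clause)
                   for clause in formula)

    print(satisfied(formula, {1: True, 2: True, 3: False, 4: False}))  # True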
§ Random 3-SAT: sample clauses uniformly from the space of all possible 3-clauses over the n variables
§ Which are the hard instances? Those near the critical clauses-to-variables ratio, #clauses / #variables = 4.26
§ Complexity peak is very stable …
  § across problem sizes
  § across solver types
    § systematic
    § stochastic
§ At each step, the randomly chosen clause is satisfied, but other clauses may become unsatisfied
§ The parameter q is called the “mixing probability” and is determined approximately by experiment for a given class of CNF formulas
§ For random, hard 3-SAT problems (those with a ratio of clauses to variables around 4.25), q = 0.5 works well
§ For 3-SAT formulas with more structure, as generated in many applications, slightly more greediness, i.e. q < 0.5, is often better
§ Empirically, restarting after O(n^4) flips, n = number of variables, works well (see the sketch below)
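A minimal sketch of this random-walk SAT procedure (WalkSAT-style) with mixing probability q, reusing the clause encoding from the earlier snippet; all helper names are illustrative:

    import random

    def walksat(formula, n_vars, q=0.5, max_flips=10_000):
        # Flip one variable of a random unsatisfied clause: with prob. q a
        # random one (walk step), else the flip leaving fewest clauses unsat.
        A = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}
        def unsat():
            return [c for c in formula
                    if not any(A[abs(l)] == (l > 0) for l in c)]
        for _ in range(max_flips):
            bad = unsat()
            if not bad:
                return A                        # satisfying assignment found
            clause = random.choice(bad)
            if random.random() < q:             # random-walk move
                v = abs(random.choice(clause))
            else:                               # greedy (min-conflicts) move
                def cost(v):
                    A[v] = not A[v]
                    c = len(unsat())
                    A[v] = not A[v]
                    return c
                v = min((abs(l) for l in clause), key=cost)
            A[v] = not A[v]                     # flip the chosen variable
        return None                             # give up (caller may restart)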
Function Search_by_Iterative_Solution_Modification():
    π = instance of optimization problem of class Π
    S = {set of all feasible solutions of π}
    N = neighborhood structure for Π, can be variable in ξ, t, m
    eval() = evaluation function for candidate solutions in N
    t = iteration, time
    ξt = search state at time t, current feasible solution
    mt = memory structure of search states and values

    t ← 0
    m0 ← ∅
    ξ0 ← initial_feasible_solution(π, S)
    while ¬ terminate(ξt, π, N(ξt, π), t, …):
        (ξ′, mt) ← step(N(ξt, π), mt, eval())
        if accept(ξ′, ξt, t, mt):
            ξt+1 ← ξ′
        mt+1 ← update_solution_best_value(π, ξt+1, t)
        t ← t + 1
    if at_least_one_feasible_solution_has_been_generated(mt, S):
        return best_solution_found(m)
    else:
        return “No feasible solution found!”
§ We only need to be able to compute the evaluation function …
§ No derivatives or other analytical properties are needed