Lifelong Optimisation (FA2386-12-1-4056) PI: Peter Stuckey, Pascal Van Hentenryck, Toby Walsh



SLIDE 1

Lifelong Optimisation

(FA2386-12-1-4056) PI: Peter Stuckey, Pascal Van Hentenryck (NICTA, Melbourne), Toby Walsh (NICTA, Sydney)

Senior Personnel: Dr Andreas Schutt (Melbourne), Dr Nina Narodytska (Melbourne)

AFOSR Program Review:

Mathematical and Computational Cognition Program Computational and Machine Intelligence Program Robust Decision Making in Human-System Interface Program (Jan 28 – Feb 1, 2013, Washington, DC)

SLIDE 2

Lifelong Optimisation (Van Hentenryck, Stuckey, Walsh)

Research Objectives: Treat optimisation not as a one-off process, but as an ongoing one, so that decision support tools actually improve in their performance over time.

DoD Benefits: Robust optimisation methods that, like humans, cope well with uncertainty, explain their answers and get better with repeated use. Potential applications in logistics and related problems.

Technical Approach:

  • 1. Develop optimisation that learns from past problems (e.g. learn useful constraints, improve heuristics, etc.)
  • 2. Develop methods to explain solutions (so we learn to trust solvers over time)

Budget ($k):

2012: $291,200
2014 (option): $145,600

Project Start Date: March 2012
Project End Date: March 2015

SLIDE 3

Project Goals

1. Learning from past problems

   1. Learning constraints (aka “nogood” learning)
   2. Learning heuristics
   3. Robust solving

2. Explaining solutions

   1. Explaining optimality
   2. Explaining unsatisfiability (aka infeasibility)

SLIDE 4

Progress Towards Goals

Learning problem constraints

  • Technical challenge: how do we cope with the new data of today’s problem instance?
  • Solution: learn parameterized constraints that abstract out the data

Learning problem heuristics

  • Focus on online optimisation problems
  • Technical challenge: how do we measure the quality of a heuristic’s decision when the future is uncertain?
  • Solution: elegant combination of machine learning + optimisation post hoc

SLIDE 5


Outline

  • Motivation: Lifelong learning
  • Lifelong learning of constraints
  • Lifelong learning of heuristics
SLIDE 6


Motivation

  • Lifelong optimisation

    – Decision tools treat optimisation as a “one-off” problem
    – But we should view today’s problem in the context of yesterday’s and tomorrow’s

SLIDE 7

Lifelong optimisation

  • New problems are often similar to old ones

– We can learn to solve them better

  • Today’s problem may not be separable

from tomorrow’s

– We want to send our drivers back to the same drop-offs
– Drivers want to do similar jobs every day
– What we deliver today may change what we need to deliver tomorrow
– …

SLIDE 8

Lifelong learning of constraints

  • Learning constraints (specifically

“nogoods”) across related problems (today’s schedule, tomorrow’s schedule, ..)

SLIDE 9

Inter-instance learning

  • Nogood learning allows CP solver to learn

nogoods from each conflict

– Nogoods describe the reason for the failure, i.e., a sufficient set of conditions on the domain to cause failure

  • Propagating these nogoods prevents the

solver from making the same search mistakes again

– Can produce orders of magnitude speedup

SLIDE 10

Inter-instance learning

  • Many applications where we solve multiple, similar instances of a problem

    – e.g. a user may interact with the solving process by adding, removing or altering tasks, resources, …

SLIDE 11

Inter-instance learning

  • Weakness: Current nogood learning

techniques only work within a single instance

  • Nogoods are not valid in another instance
  • Each instance must be solved from

scratch

– Ignores similarities of instances

  • We want to extend nogood learning:

– carry what we learned from one instance to the next

SLIDE 12

Nogood Learning

  • Current state of the art is called Lazy

Clause Generation

– Developed under our previous AOARD grant

  • Each propagator has to explain each of its

inferences

  • e.g. Given constraint x ≤ y, and domain x

∈ [3..10]

– infer y ≥ 3 and explain with: x ≥ 3 -> y ≥ 3
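The explained inference on this slide can be sketched in code. Below is a minimal illustration in plain Python (hypothetical names, not the API of any real LCG solver such as Chuffed): a bounds propagator for x ≤ y that records the explanation x ≥ 3 -> y ≥ 3 alongside the inference y ≥ 3.

```python
# Sketch of an explaining propagator for x <= y (toy code, not a real solver).

class Var:
    """An integer variable with an interval domain [lo..hi]."""
    def __init__(self, name, lo, hi):
        self.name, self.lo, self.hi = name, lo, hi

def propagate_leq(x, y):
    """Propagate x <= y; return (inference, explanation) pairs.

    If x's lower bound exceeds y's, y's lower bound must rise to match,
    and the clause  x >= lo -> y >= lo  explains why."""
    inferences = []
    if x.lo > y.lo:
        inference = (y.name, ">=", x.lo)
        explanation = f"{x.name} >= {x.lo} -> {y.name} >= {x.lo}"
        y.lo = x.lo
        inferences.append((inference, explanation))
    return inferences

# The slide's example: x in [3..10], y in [0..10] infers y >= 3.
x = Var("x", 3, 10)
y = Var("y", 0, 10)
print(propagate_leq(x, y))
```

In an LCG solver these explanation clauses are what conflict analysis resolves together to derive a nogood.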

SLIDE 13

Parameterized Nogoods

  • Standard nogood is only valid in the

instance in which it was derived

  • We generalize LCG to learn

parameterized nogoods

– valid for entire problem class

SLIDE 14

Problem class model

  • Create a problem class model

– (V ∪ Q, D, C, f), where Q is a set of parameter variables

  • Instance created by fixing Q to some set of

values

  • E.g. In graph colouring, we can have V ≡

{v1, ..., vn}, Q ≡ {ai,j | i, j = 1..n}, C ≡ ai,j -> vi ≠ vj. The values of Q specify which edges exist.
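As a concrete sketch of this idea (toy helper names; the project itself uses CP models, not this brute-force code), fixing the parameter variables Q of the graph colouring class model yields an ordinary instance:

```python
# Instantiating a problem class model by fixing its parameters Q.
from itertools import product

def active_edges(Q):
    """Fix Q = {(i, j): bool}; since C is a[i,j] -> vi != vj, only pairs
    with a[i,j] = True actually constrain the colouring."""
    return [e for e, present in Q.items() if present]

def colour(n, Q, k):
    """Brute-force k-colouring of the instance obtained by fixing Q
    (stands in for a real CP search)."""
    edges = active_edges(Q)
    for colours in product(range(1, k + 1), repeat=n):
        v = {i: colours[i - 1] for i in range(1, n + 1)}  # v1..vn
        if all(v[i] != v[j] for (i, j) in edges):
            return v
    return None

# One setting of Q: a triangle, which is 3-colourable but not 2-colourable.
Q = {(1, 2): True, (2, 3): True, (1, 3): True}
print(colour(3, Q, 2))  # no 2-colouring exists
print(colour(3, Q, 3))
```

The point of the class model is that nogoods derived under one setting of Q can be conditioned on the parameter literals they used, and so transfer to other settings.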

SLIDE 15

Example

[Figure: three graph colouring instances]

  • Consider the 3 graph colouring instances above, where we are trying to 2-colour the graph
  • Suppose we tried v1 = 1, v2 = 2 in the first instance. It fails and we learn a1,3 /\ a2,3 /\ v1 = 1 /\ v2 = 2 -> false
  • Reuse: parameter literals a1,3 /\ a2,3

    – Not true in the second instance -> can’t use it
    – Holds in the third instance -> can use it
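The reuse test on this slide is simple to state in code. A sketch (toy representation, not the solver's actual data structures): a parameterized nogood transfers to a new instance exactly when its parameter literals hold under that instance's fixed parameters, in which case its decision part can be posted as an ordinary nogood.

```python
# Checking whether a parameterized nogood applies in a new instance.

def nogood_applies(param_literals, Q):
    """True iff every parameter literal (here: an edge that must exist)
    holds under the new instance's parameter assignment Q."""
    return all(Q.get(edge, False) for edge in param_literals)

# Learned in the first instance: a1,3 /\ a2,3 /\ v1=1 /\ v2=2 -> false
learned_params = [(1, 3), (2, 3)]   # parameter literals a1,3 /\ a2,3
decision_part = {1: 1, 2: 2}        # v1 = 1 /\ v2 = 2 -> false

instance2 = {(1, 2): True, (1, 3): False, (2, 3): True}
instance3 = {(1, 2): True, (1, 3): True, (2, 3): True}

print(nogood_applies(learned_params, instance2))  # False: cannot reuse
print(nogood_applies(learned_params, instance3))  # True: can post the nogood
```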

SLIDE 16

Experimental evaluation

  • Problems: Radiation Therapy Problem,

Minimization of Open Stacks Problem, Graph Colouring Problem, Knapsack Problem

  • For each, create 100 "base" instances.

– Then create modified versions of each base instance with small changes

  • How much speedup do we get if we reuse

nogoods from the base instance compared to solving from scratch?

SLIDE 17

Results

  • More similar -> more reuse -> more speedup
  • Effectiveness depends on problem class
SLIDE 18

Conclusion & Future Work

  • Generalized Lazy Clause Generation to

produce parameterized nogoods

  • They can be reused across multiple

instances of the same problem class

  • Highly effective if new instances to be

solved are very similar to previously solved instances

SLIDE 19

Lifelong learning of heuristics

  • As well as problem constraints

(“nogoods”), we can also learn problem solving heuristics (“rules of thumb”)

– How I solve today’s problem should help me solve tomorrow’s
– Heuristics are often even more re-usable than nogoods!

SLIDE 20

Learning dispatch heuristics


Dispatchers in parcel pickup & delivery firms learn to solve complex vehicle routing problems. How do they do it?

SLIDE 21

Learning dispatch heuristics

  • Elegant combination of machine

learning and optimisation

– Reinforcement learning used to learn good dispatch heuristics
– Optimisation used to compute good answers

  • Was it a good idea or not to dispatch this truck

to collect this parcel?

  • Fix this pickup, find best routing
  • Prohibit this pickup, find best routing
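The post hoc labelling described above can be sketched as follows (toy one-dimensional cost model and hypothetical names; the real system calls a full routing optimiser): solve the routing once with the pickup fixed to this truck and once with it prohibited, and score the decision by the difference in the best achievable cost.

```python
# Post hoc scoring of a dispatch decision: fix vs prohibit the pickup.
from itertools import permutations

def route_cost(truck_pos, stops):
    """Total travel distance visiting stops (1-D positions) in order."""
    cost, pos = 0, truck_pos
    for s in stops:
        cost += abs(s - pos)
        pos = s
    return cost

def best_cost(truck_pos, stops):
    """Best routing over all visit orders (brute force stands in for the optimiser)."""
    if not stops:
        return 0
    return min(route_cost(truck_pos, list(p)) for p in permutations(stops))

def label_decision(truck_pos, pickup, other_stops):
    """Score of dispatching this truck to this pickup: the marginal cost of
    fixing the pickup versus prohibiting it (smaller is better)."""
    fixed = best_cost(truck_pos, other_stops + [pickup])
    prohibited = best_cost(truck_pos, other_stops)
    return fixed - prohibited

print(label_decision(truck_pos=0, pickup=2, other_stops=[1, 5]))
```

These scores then serve as the training signal for the learned dispatch heuristics.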

SLIDE 22

Learning dispatch heuristics

  • Base dispatch heuristics

– Nearest truck
– Cheapest pickup
– Least busy truck
– …

  • Features

– Time of day
– Workload
– Vehicle spread
– …

SLIDE 23

Learning dispatch heuristics

  • Neural network

– Learns when to follow a particular base heuristic
– Trained on a real-world data set from a local Sydney company

  • Performance

– Looks highly promising
– Much better than their rookie dispatchers
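The selection step this learner performs can be sketched as below. This is a minimal stand-in with hand-set weights and hypothetical feature names; the project trains a neural network on company data, whereas this toy just scores each base heuristic linearly on the current state's features and follows the winner.

```python
# Toy selector over base dispatch heuristics (weights would be learned).

def score(weights, features):
    """Linear score of one base heuristic on the current state."""
    return sum(weights[k] * features.get(k, 0.0) for k in weights)

selectors = {
    "nearest_truck":    {"time_of_day": 0.2,  "workload": -0.5, "vehicle_spread": 0.9},
    "cheapest_pickup":  {"time_of_day": 0.6,  "workload": 0.1,  "vehicle_spread": -0.3},
    "least_busy_truck": {"time_of_day": -0.1, "workload": 0.8,  "vehicle_spread": 0.1},
}

def choose_heuristic(features):
    """Follow the highest-scoring base heuristic for this state."""
    return max(selectors, key=lambda h: score(selectors[h], features))

state = {"time_of_day": 0.5, "workload": 0.9, "vehicle_spread": 0.2}
print(choose_heuristic(state))  # heavy workload favours "least_busy_truck"
```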

SLIDE 24

Conclusions

  • Learning across instances of optimisation

problems

– Learning nogoods (constraints)
– Learning heuristics

  • Our research continues to receive

recognition

– Best Paper award at 18th Int. Conf. on Constraint Programming (Oct 2012)
– Outstanding Programme Committee member award @ AI’12 (Dec 2012)

SLIDE 25

List of Publications Attributed to the Grant

  • Heuristics and Policies for Online Pickup and Delivery Problems. Under review for the 25th Conference on Innovative Applications of Artificial Intelligence (IAAI-13).
  • The SEQBIN Constraint Revisited. Principles and Practice of Constraint Programming, CP 2012, 2012.
  • A Hybrid MIP/CP Approach for Multi-activity Shift Scheduling. CP 2012, 2012.
  • Exploiting Subproblem Dominance in Constraint Programming. Constraints 17(1): 1-38 (2012).
  • Conflict Directed Lazy Decomposition. CP 2012: 70-85.
  • Inter-instance Nogood Learning in Constraint Programming. CP 2012: 238-247.
  • Optimisation Modelling for Software Developers. CP 2012: 274-289.