Topics in Computational Sustainability CS 325 Spring 2016 Making - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Topics in Computational Sustainability

CS 325

Spring 2016

Making Choices: Stochastic Optimization

slide-2
SLIDE 2

Introduction

  • Stochastic programming is a modeling framework to deal with optimization problems that involve uncertainty.
  • Real-world problems almost invariably include some unknown and uncertain parameters (e.g., how much energy the wind farm will actually produce, the actual costs of a project, ...).
  • Probability distributions over uncertain parameters are known or can be estimated from data (machine learning).
slide-3
SLIDE 3

Introduction

Example

  • A farmer can plant his land with corn, soy, or beans.
  • For simplicity, assume that the season will be either wet or dry.
  • If it is wet, corn is the most profitable.
  • If it is dry, soy is the most profitable.
slide-4
SLIDE 4

Profit

        All Corn   All Soy   All Beans
Wet        100        70        80
Dry        -10        40        35

  • If it is wet, corn is the most profitable → plant all corn
  • If it is dry, soy is the most profitable → plant all soy
slide-5
SLIDE 5

Profit

        All Corn   All Soy   All Beans
Wet        100        70        80
Dry        -10        40        35

Assume the probability of a wet season is p. The expected profit of planting each crop is:

Corn: 100p + (-10)(1-p) = -10 + 110p
Soy: 70p + 40(1-p) = 40 + 30p
Beans: 80p + 35(1-p) = 35 + 45p
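The expected-profit formulas can be checked numerically. The following is a minimal sketch, not part of the original slides: the dictionary `PROFIT` and the helper names `expected_profit` and `best_crop` are illustrative assumptions, while the profit numbers come from the table above.

```python
# Profit table from the slides: PROFIT[crop] = (wet-season profit, dry-season profit).
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def expected_profit(crop, p):
    """Expected profit of planting all land with `crop` when P(wet) = p."""
    wet, dry = PROFIT[crop]
    return p * wet + (1 - p) * dry


def best_crop(p):
    """Single crop with the highest expected profit for a given P(wet) = p."""
    return max(PROFIT, key=lambda crop: expected_profit(crop, p))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        profits = {crop: expected_profit(crop, p) for crop in PROFIT}
        print(p, profits, "best:", best_crop(p))
```

Which crop wins depends on p: soy for a mostly dry climate, corn for a mostly wet one, and beans in between.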

slide-6
SLIDE 6

What is the answer?

Suppose p = 0.5, can anyone suggest a planting plan?

– If it is wet, corn is the most profitable → plant all corn
– If it is dry, soy is the most profitable → plant all soy
– Plant 1/2 corn, 1/2 soy?

Expected Profit: 0.5 (-10 + 110(0.5)) + 0.5 (40 + 30(0.5)) = 50

Is this optimal?

slide-7
SLIDE 7

Optimal strategy

Suppose p = 0.5, can anyone suggest a planting plan?

Plant all beans!

Expected Profit: 35 + 45(0.5) = 57.5!

slide-8
SLIDE 8

What Did We Learn ?

  • Averaging Solutions Doesn’t Work!
  • The best decision for today, when faced with a number of different outcomes for the future, is not equal to the “average” of the decisions that would have been best for each specific future outcome.

slide-9
SLIDE 9

Discrete random variables

A discrete random variable Z is described by the probability masses of all elementary events:

$$z_1, z_2, \ldots, z_K \quad \text{with probabilities} \quad p_1, p_2, \ldots, p_K$$

such that

$$p_1 + p_2 + \cdots + p_K = 1$$

slide-10
SLIDE 10

Discrete random variables

If the probability measure is discrete, the expected value of Z is the sum:

$$E[Z] = \sum_{i=1}^{K} z_i p_i$$

Example: if Z represents the outcome of a die,

$$E[Z] = 1/6 + 2/6 + 3/6 + 4/6 + 5/6 + 6/6 = 3.5$$

Similarly, given a function f,

$$E[f(Z)] = \sum_{i=1}^{K} f(z_i) p_i$$
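As a small illustration of the two sums on this slide, the sketch below computes E[Z] and E[f(Z)] for a fair die using exact fractions. The function name `expectation` and the choice f(z) = z² are assumptions made for the example only.

```python
from fractions import Fraction


def expectation(values, probs, f=lambda z: z):
    """E[f(Z)] = sum_i f(z_i) * p_i for a discrete random variable Z."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(f(z) * p for z, p in zip(values, probs))


faces = list(range(1, 7))           # outcomes z_1..z_6 of a die
probs = [Fraction(1, 6)] * 6        # fair die: each face has probability 1/6

if __name__ == "__main__":
    print(expectation(faces, probs))                      # E[Z]
    print(expectation(faces, probs, f=lambda z: z * z))   # E[Z^2]
```

Using `Fraction` avoids floating-point rounding, so the probability check and the results are exact.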

slide-11
SLIDE 11

Continuous random variables

The expected value of a random function $f(x, Z)$ is an integral:

$$F(x) = E[f(x, Z)] = \int f(x, z)\, p(z)\, dz$$

where $p(z)$ is the density function of the continuous random variable $Z$, with

$$\int p(z)\, dz = 1$$

slide-12
SLIDE 12

Stochastic Programming

  • Unconstrained stochastic programming problem:

$$\min_{x \in X} F(x) = E[f(x, Z)] = \int f(x, z)\, p(z)\, dz$$

Here the set X specifies which solutions are feasible (e.g., through constraints)

slide-13
SLIDE 13

Example

Crop yield optimization: Z is a binary random variable: wet season (Z=1, with probability p) or dry season (Z=0, with probability 1-p). Let $x_1, x_2, x_3$ be the fractions of land planted with corn, soy, and beans. The profit is

$$f(x_1, x_2, x_3, Z) = x_1 (100 Z - 10 (1 - Z)) + x_2 (70 Z + 40 (1 - Z)) + x_3 (80 Z + 35 (1 - Z))$$

The expected value is

$$F(x_1, x_2, x_3) = E[f(x_1, x_2, x_3, Z)] = p\,(100 x_1 + 70 x_2 + 80 x_3) + (1 - p)\,(-10 x_1 + 40 x_2 + 35 x_3)$$

slide-14
SLIDE 14

Stochastic programming example

Crop yield optimization problem. Given p, maximize

$$F(x_1, x_2, x_3) = x_1 (100 p - 10 (1 - p)) + x_2 (70 p + 40 (1 - p)) + x_3 (80 p + 35 (1 - p))$$

subject to

$$x_1 + x_2 + x_3 = 1, \qquad x_1, x_2, x_3 \ge 0.$$
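Because this objective is linear in the planting fractions, its maximum is attained at a vertex of the feasible set, so a solver is not strictly needed for this toy problem. The sketch below assumes the feasible set is the simplex (fractions of land that are nonnegative and sum to 1); the function names are illustrative, not from the slides.

```python
def objective_coefficients(p):
    """Expected profit per unit of land for each crop, given P(wet) = p."""
    return {
        "corn": 100 * p - 10 * (1 - p),
        "soy": 70 * p + 40 * (1 - p),
        "beans": 80 * p + 35 * (1 - p),
    }


def F(plan, p):
    """Expected profit of a plan {crop: fraction of land}; missing crops count as 0."""
    coeffs = objective_coefficients(p)
    return sum(plan.get(crop, 0.0) * c for crop, c in coeffs.items())


def solve(p):
    """Maximize F over the simplex: put all land on the crop with the largest coefficient."""
    coeffs = objective_coefficients(p)
    best = max(coeffs, key=coeffs.get)
    plan = {crop: (1.0 if crop == best else 0.0) for crop in coeffs}
    return plan, coeffs[best]


if __name__ == "__main__":
    print(F({"corn": 0.5, "soy": 0.5}, 0.5))  # the "average" plan
    print(solve(0.5))                          # the optimal plan
```

Comparing the half-corn, half-soy plan with the output of `solve(0.5)` reproduces the gap between averaging the per-scenario decisions and optimizing the expectation directly.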

slide-15
SLIDE 15

Stochastic Programming

  • Unconstrained stochastic programming problem:

$$\min_{x \in X} F(x) = E[f(x, Z)] = \int f(x, z)\, p(z)\, dz$$

How to solve?

– Might not be possible to evaluate the integral in closed form
– Computationally hard to evaluate

slide-16
SLIDE 16

Sample average approximation

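  • Instead of this:

$$\min_{x \in S} f(x) = E[g(x, Z)] = \int g(x, z)\, p(z)\, dz$$

  • Sample $z_1, z_2, \ldots, z_N$ according to p(z) and solve

$$\min_{x \in S} \hat{f}_N(x) = \frac{1}{N} \sum_{j=1}^{N} g(x, z_j)$$

  • $\hat{f}_N$ is the sample average function

– The expected value of $\hat{f}_N(x)$ is f(x)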
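To make sample average approximation concrete, here is a minimal sketch on the earlier crop example, written as a maximization rather than a minimization. It is an illustration under stated assumptions: seasons are drawn as independent wet/dry outcomes, only the three all-one-crop plans are compared (sufficient here because the objective is linear in the plan), and all function names are hypothetical.

```python
import random

# Profit table from the crop example: (wet-season profit, dry-season profit).
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def sample_seasons(p, n, rng):
    """Draw n i.i.d. seasons: True = wet (probability p), False = dry."""
    return [rng.random() < p for _ in range(n)]


def sample_average_profit(crop, seasons):
    """Sample average of the profit of planting all land with `crop`."""
    wet, dry = PROFIT[crop]
    return sum(wet if is_wet else dry for is_wet in seasons) / len(seasons)


def saa_solve(p, n, seed=0):
    """Replace the expectation by an average over n sampled seasons and optimize it."""
    rng = random.Random(seed)
    seasons = sample_seasons(p, n, rng)
    averages = {crop: sample_average_profit(crop, seasons) for crop in PROFIT}
    best = max(averages, key=averages.get)
    return best, averages[best]


if __name__ == "__main__":
    for n in (5, 50, 5000):
        print(n, saa_solve(0.5, n))
```

With few samples the selected crop can differ from the true optimum; as n grows, the sampled objective approaches the expected profit and the choice stabilizes.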

slide-17
SLIDE 17

Statistics break

slide-18
SLIDE 18

SAA gives a lower bound

slide-19
SLIDE 19

Proof
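A sketch of the standard argument for the bound, assuming the samples are i.i.d. draws from p(z) and writing $\hat f_N$ for the sample average function and $f$ for the true expected objective (this notation is an assumption of the sketch):

```latex
\begin{align*}
\min_{x \in S} \hat f_N(x) &\le \hat f_N(x')
  && \text{for every fixed } x' \in S \\
E\Big[\min_{x \in S} \hat f_N(x)\Big] &\le E\big[\hat f_N(x')\big] = f(x')
  && \text{taking expectations over the samples} \\
E\Big[\min_{x \in S} \hat f_N(x)\Big] &\le \min_{x' \in S} f(x')
  && \text{since the previous line holds for every } x'
\end{align*}
```

So the optimal value of the sampled problem is, on average, a lower bound on the true optimal value. For a maximization problem the inequalities flip and the sampled optimum gives an upper bound on average.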

slide-20
SLIDE 20

Estimating the lower bound
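One common way to estimate the bound is to solve several independent SAA problems and average their optimal values. The sketch below does this for the crop example rewritten as a cost minimization (cost = negative profit), so that the bound is a lower bound. The replication scheme, the parameter names `n` and `m`, and the function names are assumptions of this sketch.

```python
import random
import statistics

# Profit table from the crop example: (wet-season profit, dry-season profit).
PROFIT = {"corn": (100, -10), "soy": (70, 40), "beans": (80, 35)}


def saa_optimal_cost(p, n, rng):
    """Optimal value of one SAA problem: min over crops of the average cost (-profit)."""
    wet_fraction = sum(rng.random() < p for _ in range(n)) / n
    return min(
        -(wet_fraction * wet + (1 - wet_fraction) * dry)
        for wet, dry in PROFIT.values()
    )


def estimate_lower_bound(p, n, m, seed=0):
    """Average the optimal values of m independent SAA problems with n samples each."""
    rng = random.Random(seed)
    values = [saa_optimal_cost(p, n, rng) for _ in range(m)]
    mean = statistics.mean(values)
    stderr = statistics.stdev(values) / m ** 0.5
    return mean, stderr


if __name__ == "__main__":
    # True optimal cost at p = 0.5 is -57.5 (plant all beans).
    print(estimate_lower_bound(0.5, n=10, m=50))
```

The standard error indicates how precisely the bound itself is estimated; increasing `m` tightens that estimate, while increasing `n` moves the bound closer to the true optimum.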

slide-21
SLIDE 21

Application

  • Food security is a global issue

– 7.4 billion people to feed; 21 million newborns in the past year

  • Ways to improve production of crops

– Increase arable land
– Improve yield with technology

  • 2016 Syngenta Yield Prediction challenge: select best (soy) varieties to improve yield:

– Use knowledge of soil/regional data
– Understand the uncertainty due to weather/climate
– Around 34000 data points, 80 varieties

slide-22
SLIDE 22

Our hierarchical model

Li, Zhong, Lobell, Ermon: first prize out of 130 teams

slide-23
SLIDE 23

Dealing with uncertainty

  • Two sources of uncertainty

– Weather
– Errors in variety yield prediction

  • It is hard to fit a parametric model
  • Solution: sample from historical data

– Historical weather distribution at the site of interest
– Errors from our yield prediction (non-Gaussian)

slide-24
SLIDE 24

Hedging risk

  • Which one would you pick?

– 100 dollars with probability 0.5, nothing with probability 0.5
– 50 dollars with probability 1

  • Different criteria. Choose a mix of varieties to

– Maximize expected yield minus variance
– Maximize expected yield, subject to small variance
– Maximize the yield that you can achieve with probability at least 95%
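The criteria above can be compared on the two lotteries from this slide. The following sketch represents a lottery as a list of (payoff, probability) pairs; that representation and the helper names are assumptions for illustration.

```python
def mean(lottery):
    """Expected payoff of a lottery given as [(payoff, probability), ...]."""
    return sum(p * v for v, p in lottery)


def variance(lottery):
    """Variance of the payoff."""
    m = mean(lottery)
    return sum(p * (v - m) ** 2 for v, p in lottery)


def guaranteed_value(lottery, level=0.95):
    """Largest payoff t such that P(payoff >= t) >= level."""
    total = 0.0
    for v, p in sorted(lottery, reverse=True):  # from the highest payoff down
        total += p                              # total = P(payoff >= v)
        if total >= level - 1e-12:
            return v
    raise ValueError("probabilities sum to less than the requested level")


risky = [(100, 0.5), (0, 0.5)]   # 100 dollars or nothing
safe = [(50, 1.0)]               # 50 dollars for sure

if __name__ == "__main__":
    for name, lottery in (("risky", risky), ("safe", safe)):
        print(name, mean(lottery), variance(lottery), guaranteed_value(lottery))
```

Both lotteries have the same expected payoff, so a pure expected-value criterion cannot separate them; the variance and the 95% guaranteed value do.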

slide-25
SLIDE 25

Results

slide-26
SLIDE 26

Results

slide-27
SLIDE 27

Results

slide-28
SLIDE 28

Maximizing the Spread of Cascades Using Network Design

with application to spatial conservation planning

Daniel Sheldon, Bistra Dilkina, Adam Elmachtoub, Ryan Finseth, Ashish Sabharwal, Jon Conrad, Carla P. Gomes, David Shmoys
Institute for Computational Sustainability, Cornell University and Oregon State University

Will Allen, Ole Amundsen, Buck Vaughan
The Conservation Fund

slide-29
SLIDE 29

Spatial Conservation Planning

  • What is the best land acquisition and management strategy to support the recovery of the Red-Cockaded Woodpecker (RCW)?

Federally listed rare and endangered species

slide-30
SLIDE 30

RCW 101

  • Cooperative breeders

– small family groups
– well-defined territories or patches
– centered around cluster of cavity trees

  • Cavities!

– One for each bird
– Live, old-growth pine (80+ years old)
– 2-10 years to excavate
– Extensively reused

  • Habitat requirements in conflict with modern land-use

– 30-year timber rotation
– Development

  • Management

– Habitat restoration and preservation
– Artificial cavities

slide-31
SLIDE 31

Problem Setup

Given a limited budget, which parcels should I conserve to maximize the expected number of occupied territories in 50 years?

[Figure: map with legend: conserved parcels, available parcels, current territories, potential territories]

slide-32
SLIDE 32

Metapopulation Model

  • Model for population dynamics in fragmented landscape

– Territories are occupied or unoccupied in each time step
– Two types of stochastic events:

  • Local extinction: occupied -> unoccupied
  • Colonization: unoccupied -> occupied (from neighbor)

[Figure: territory occupancy at Time 1 and Time 2]

slide-33
SLIDE 33

Network Cascades

  • Models for diffusion in (social) networks

– Spread of information, behavior, disease, etc.
– E.g.: suppose each individual passes rumor to friends independently with probability ½

slide-37
SLIDE 37

Network Cascades

  • Models for diffusion in (social) networks

– Spread of information, behavior, disease, etc.
– E.g.: suppose each individual passes rumor to friends independently with probability ½

Note: “activated” nodes are those reachable by red edges
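The rumor example can be simulated by flipping one coin per edge and then collecting every node reachable through the edges that came up "live" (the red edges in the slide). The sketch below assumes a directed edge list and uses hypothetical function names; for an undirected friendship, both directions would be listed.

```python
import random
from collections import deque


def sample_live_edges(edges, prob, rng):
    """Keep each directed edge independently with probability `prob`."""
    return [(u, v) for (u, v) in edges if rng.random() < prob]


def reachable(sources, live_edges):
    """Nodes reachable from `sources` along live edges (breadth-first search)."""
    adjacency = {}
    for u, v in live_edges:
        adjacency.setdefault(u, []).append(v)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


if __name__ == "__main__":
    rng = random.Random(0)
    edges = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e"), ("c", "e")]
    live = sample_live_edges(edges, 0.5, rng)
    print("live edges:", live)
    print("activated:", reachable({"a"}, live))
```

Repeating the sampling many times and averaging the number of activated nodes gives a Monte Carlo estimate of the expected spread.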

slide-38
SLIDE 38

Metapopulation = Cascade

  • Metapopulation model can be viewed as a cascade in the layered graph representing territories over time

[Figure: layered graph with one copy of the patches i, j, k, l, m per time step (Time: 1 to 5); the initially occupied territories are marked at time 1]

slide-39
SLIDE 39

Metapopulation = Cascade

  • Metapopulation model can be viewed as a cascade in the layered graph representing territories over time

[Figure: layered graph with one copy of the patches i, j, k, l, m per time step (Time: 1 to 5); edges are labeled with probabilities p(i,i) and p(i,j)]

slide-41
SLIDE 41

Metapopulation = Cascade

  • Metapopulation model can be viewed as a cascade in the layered graph representing territories over time

[Figure: layered graph with one copy of the patches i, j, k, l, m per time step (Time: 1 to 5); edge types labeled: non-extinction, colonization]

slide-48
SLIDE 48

Metapopulation = Cascade

  • Metapopulation model can be viewed as a cascade in the layered graph representing territories over time

Key point: after simulation, the occupied territories are given by the nodes that are reachable in the network by live edges.

[Figure: layered graph over patches i, j, k, l, m with the live edges highlighted]
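A minimal sketch of one simulation run in the layered graph, under the assumption that the edge from patch i at time t to patch j at time t+1 is live with probability p(i, j), where p(i, i) plays the role of non-extinction and p(i, j) with i ≠ j the role of colonization. The function name and the dictionary encoding of the probabilities are hypothetical.

```python
import random


def simulate(patches, p, initially_occupied, horizon, rng):
    """Return the set of occupied patches at each of `horizon` time steps.

    p[(i, j)] is the probability that the edge (i, t) -> (j, t + 1) is live;
    missing pairs are treated as probability 0.
    """
    occupied = set(initially_occupied)
    history = [set(occupied)]
    for _ in range(horizon - 1):
        next_occupied = set()
        for i in patches:
            for j in patches:
                # Flip the coin for every edge, then use it only if i is occupied.
                live = rng.random() < p.get((i, j), 0.0)
                if live and i in occupied:
                    next_occupied.add(j)
        occupied = next_occupied
        history.append(set(occupied))
    return history


if __name__ == "__main__":
    patches = ["i", "j", "k", "l", "m"]
    p = {(a, a): 0.8 for a in patches}                       # non-extinction
    p.update({("i", "j"): 0.3, ("j", "k"): 0.3, ("k", "l"): 0.3, ("l", "m"): 0.3})
    for t, occ in enumerate(simulate(patches, p, {"i"}, 5, random.Random(0)), 1):
        print("time", t, sorted(occ))
```

A patch is occupied at time t+1 exactly when some occupied patch at time t has a live edge into it, which is the reachability statement in the key point.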

slide-49
SLIDE 49

Metapopulation = Cascade

  • Metapopulation model can be viewed as a cascade in the layered graph representing territories over time

Target nodes: territories at final time step

[Figure: layered graph over patches i, j, k, l, m with the target nodes in the final layer highlighted]

slide-50
SLIDE 50

Management Actions

  • Conserving parcels adds nodes to the network to create new pathways for the cascade

[Figure: initial network with Parcel 1 and Parcel 2]

slide-53
SLIDE 53

Cascade Optimization Problem

Given:

  • Patch network

– Initially occupied territories – Colonization and extinction probabilities

  • Management actions

– Already-conserved parcels – List of available parcels and their costs

  • Time horizon T
  • Budget B

Find set of parcels with total cost at most B that maximizes the expected number of occupied territories at time T.

slide-54
SLIDE 54

Approach

  • Stochastic problem too hard

– Cannot even calculate objective function exactly (#P-hard)

  • Sample average approximation (SAA):

– Replace stochastic problem by deterministic one
– Draw N outcomes from underlying probability space
– Optimize empirical average

slide-55
SLIDE 55

Sample Average Approximation

  • Sample N training cascades by flipping coins for all edges.
  • Select single set of management actions that works well on average
  • Goal: maximize the number of reachable target nodes
  • A deterministic network design problem

[Figure: sampled cascades 1, 2, ..., N]
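The deterministic network design problem can be illustrated on a tiny instance by brute force: enumerate every affordable set of parcels and keep the one with the best average number of reachable target nodes over the sampled cascades. This enumeration is a stand-in for illustration only, not the solution method of the slides, and every name in the sketch (`parcels`, `free_nodes`, and so on) is an assumption.

```python
from itertools import combinations


def reachable_targets(live_edges, sources, targets, allowed):
    """Count target nodes reachable from sources using only allowed nodes."""
    adjacency = {}
    for u, v in live_edges:
        if u in allowed and v in allowed:
            adjacency.setdefault(u, []).append(v)
    seen = {s for s in sources if s in allowed}
    stack = list(seen)
    while stack:
        u = stack.pop()
        for v in adjacency.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen & set(targets))


def saa_network_design(parcels, costs, budget, cascades, sources, targets, free_nodes):
    """parcels: {name: set of nodes}; cascades: list of live-edge lists (one per sample)."""
    best_set, best_value = (), -1.0
    names = sorted(parcels)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            if sum(costs[name] for name in subset) > budget:
                continue  # violates the budget
            allowed = set(free_nodes)
            for name in subset:
                allowed |= set(parcels[name])
            value = sum(
                reachable_targets(c, sources, targets, allowed) for c in cascades
            ) / len(cascades)
            if value > best_value:
                best_set, best_value = subset, value
    return best_set, best_value


if __name__ == "__main__":
    parcels = {"A": {"a"}, "B": {"b"}}
    cascades = [
        [("s", "a"), ("a", "t")],
        [("s", "a"), ("a", "t")],
        [("s", "b"), ("b", "t")],
    ]
    print(saa_network_design(parcels, {"A": 1, "B": 1}, 1, cascades,
                             {"s"}, {"t"}, {"s", "t"}))
```

Enumeration is exponential in the number of parcels, so it only serves to make the objective and the budget constraint concrete.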

slide-56
SLIDE 56

Network Design

[Figure: two panels labeled Stochastic and Deterministic]

slide-57
SLIDE 57

Mixed Integer Program

NP-hard: solve by branch and bound (CPLEX)

Annotations on the formulation:

– Variable: node v reachable in cascade k?
– Variable: purchase parcel l?
– Constraint: budget
– Constraint: v only reachable if some containing parcel l is purchased
– Constraint: non-source node v only reachable if some predecessor u is reachable
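One way to write a program consistent with these annotations is sketched below. The symbols are assumptions of this sketch rather than the slide's own notation: $x_v^k$ indicates that node v is reachable in cascade k, $y_l$ indicates that parcel l is purchased, $c_l$ is the cost of parcel l, $A(v)$ is the set of parcels containing v, $E_k$ is the set of live edges in cascade k, and $T$ is the set of target nodes.

```latex
\begin{align*}
\max_{x,\,y}\quad & \frac{1}{N}\sum_{k=1}^{N}\sum_{v \in T} x_v^k \\
\text{s.t.}\quad & \sum_{l} c_l\, y_l \le B
  && \text{(budget)} \\
& x_v^k \le \sum_{l \in A(v)} y_l
  && \text{for all } v,\ k \\
& x_v^k \le \sum_{(u,v) \in E_k} x_u^k
  && \text{for all non-source } v,\ \text{all } k \\
& x_v^k \in \{0,1\}, \quad y_l \in \{0,1\}
\end{align*}
```

Already-conserved parcels can be modeled by fixing the corresponding $y_l$ to 1 at zero cost. The predecessor constraint is sufficient to encode reachability here because the layered graph over time has no cycles.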

slide-58
SLIDE 58

Experiments

  • 443 available parcels
  • 2500 territories
  • 63 initially occupied
  • 100 years
  • Population model is parameterized based (loosely) on RCW ecology
  • Short-range colonizations (<3km) within the foraging radius of the RCW are much more likely than long-range colonizations

slide-59
SLIDE 59

Greedy Baselines

  • Adapted from previous work on influence maximization
  • Start with empty set, add actions until the budget is exhausted

– Greedy-uc – choose action that results in biggest immediate increase in objective [Kempe et al. 2003]
– Greedy-cb – use ratio of benefit to cost [Leskovec et al. 2007]

  • No performance guarantees!
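The two baselines differ only in how they score a candidate action. The sketch below implements both on a generic objective function; the `greedy` signature, the `use_ratio` flag, and the toy coverage objective are assumptions for illustration, not the experimental setup of the slides.

```python
def greedy(actions, costs, budget, objective, use_ratio):
    """Add affordable actions one at a time until none fits in the budget.

    use_ratio=False scores by marginal gain (Greedy-uc style);
    use_ratio=True scores by marginal gain divided by cost (Greedy-cb style).
    Costs are assumed to be positive.
    """
    chosen, spent = [], 0.0
    remaining = set(actions)
    while True:
        base = objective(chosen)
        best, best_score = None, float("-inf")
        for action in sorted(remaining):
            if spent + costs[action] > budget:
                continue  # cannot afford this action
            gain = objective(chosen + [action]) - base
            score = gain / costs[action] if use_ratio else gain
            if score > best_score:
                best, best_score = action, score
        if best is None:
            return chosen
        chosen.append(best)
        spent += costs[best]
        remaining.remove(best)


# Toy objective (hypothetical): number of distinct items covered by the chosen actions.
COVER = {"a": {1, 2, 3, 4}, "b": {1, 2}, "c": {3}}


def coverage(chosen):
    return len(set().union(*(COVER[action] for action in chosen)))


if __name__ == "__main__":
    costs = {"a": 3, "b": 1, "c": 1}
    print(greedy(COVER, costs, 3, coverage, use_ratio=False))
    print(greedy(COVER, costs, 3, coverage, use_ratio=True))
```

On this toy instance the two rules pick different sets with different objective values, which illustrates why neither rule comes with a guarantee in general.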
slide-60
SLIDE 60

Results

M = 50, N = 10, Ntest = 500

Upper bound!

slide-62
SLIDE 62

Results

[Figure: map showing the conservation reservoir and the initial population]

M = 50, N = 10, Ntest = 500

Upper bound!

slide-63
SLIDE 63

SAA Convergence

slide-64
SLIDE 64

A Harder Instance

Move the conservation reservoir so it is more remote.

slide-65
SLIDE 65

Conservation Strategies

[Figure: conservation strategies selected at budgets of $150M, $260M, and $320M]

Greedy Baseline: build outward from sources
SAA Optimum (our approach): path-building (goal-setting)