12/17/2019 Department of Veterinary and Animal Sciences




Hierarchical Markov decision processes

Original slides by Anders Ringgaard Kristensen Presented by Dan Børge Jensen

Department of Veterinary and Animal Sciences

Outline

  • Quick summary of Monday
  • The Markov property – revisited
  • Graphical representation of models
  • Hierarchical models
  • Multi-level models
  • Decisions on multiple time scales
  • Markov chain simulation

Advanced Quantitative Methods in Herd Management Slide 2 Department of Veterinary and Animal Sciences

Summary from Monday

The Markov property:

  • Only the current state affects future states!

Optimization goal: find the best policy (decision strategy)

  • Different objective functions:
    • Sum of (discounted) rewards over time
    • Average reward per time unit
    • Average reward per unit of product

Optimization methods:

  • Value iteration
    • Exact for finite time horizons
    • Approximate for infinite time horizons
  • Policy iteration
    • Exact for infinite time horizons
    • Cannot handle very large state spaces
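The finite-horizon case can be sketched as plain backward induction. The toy model below is invented for illustration (3 yield states, keep/replace actions); none of the numbers come from the lecture.

```python
import numpy as np

# Hypothetical 3-state (low/average/high yield), 2-action (keep/replace)
# replacement model; all rewards and probabilities are illustrative.
P = np.array([
    [[0.6, 0.3, 0.1],   # keep, from low: yield tends to persist
     [0.2, 0.6, 0.2],   # keep, from average
     [0.1, 0.3, 0.6]],  # keep, from high
    [[0.2, 0.6, 0.2],   # replace: a new cow starts with average prospects
     [0.2, 0.6, 0.2],
     [0.2, 0.6, 0.2]],
])
R = np.array([
    [1.0, 5.0, 8.0],    # keep: net return grows with yield
    [4.0, 4.0, 4.0],    # replace: salvage value minus heifer cost
])

def value_iteration(P, R, horizon):
    """Backward induction: exact for a finite time horizon."""
    V = np.zeros(P.shape[1])
    policy = []
    for _ in range(horizon):
        Q = R + P @ V            # Q[a, i] = r(i, a) + sum_j p(j | i, a) * V[j]
        policy.append(Q.argmax(axis=0))
        V = Q.max(axis=0)
    return V, policy[::-1]       # policy[t] = best action per state at stage t

V, policy = value_iteration(P, R, horizon=10)
```

Because the horizon is finite, the backward sweep is exact, matching the first optimization method listed above.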


The Markov property

Let i_n be the state at stage n. The Markov property is satisfied if, and only if,

  • P(i_{n+1} | i_n, i_{n-1}, …, i_1) = P(i_{n+1} | i_n)
  • In words: the distribution of the state at the next stage depends only on the present state – previous states are not relevant.

This property is crucial in Markov decision processes.
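The defining equation can be checked empirically: on a simulated chain, conditioning on the previous state in addition to the current one should leave the next-state distribution unchanged. A small sketch with an invented 2-state transition matrix:

```python
import numpy as np

# Simulate a 2-state Markov chain (invented transition matrix) and verify
# that P(next | current, previous) matches P(next | current).
rng = np.random.default_rng(0)
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])

states = [0]
for _ in range(100_000):
    states.append(rng.choice(2, p=P[states[-1]]))
states = np.array(states)

prev, cur, nxt = states[:-2], states[1:-1], states[2:]
est_00 = np.mean(nxt[(cur == 0) & (prev == 0)] == 0)  # condition on both
est_01 = np.mean(nxt[(cur == 0) & (prev == 1)] == 0)
# Both estimates are close to P[0, 0] = 0.7: the previous state is irrelevant.
```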


Markov property: Example

Litter size in sows:

  • Litter size in sows may be represented as a multi-dimensional normal distribution (cf. the previous exercise).

  • We wish to predict the litter size of parity n
  • How should we define the state space in order to fulfill the Markov property?


Markovian prediction of litter size I

Straightforward solution:

  • Include previous litter sizes as part of the current state
  • For a sow in parity 8 this means e.g. 15^8 ≈ 2.5 × 10^9 state combinations.
  • Prohibitive!

Trick most often used in practice:

  • Only include the 2–3 most recent litter size results.
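The size difference between the two state definitions is easy to verify; a quick sketch of the arithmetic, assuming litter size is discretized into 15 classes as in the 15^8 count above:

```python
# State-space size with full history vs. truncated memory (15 classes).
full_history = 15 ** 8   # all 8 previous litter sizes kept in the state
truncated = 15 ** 3      # only the 3 most recent litter sizes kept

print(full_history)      # 2562890625, about 2.5 * 10^9
print(truncated)         # 3375
```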


The model tree

Graphical representation of MDPs

Recall the structure of the simple dairy cow replacement model:

Stage:

  • 1 lactation cycle

State:

  • i=1: Low milk yield
  • i=2: Average milk yield
  • i=3: High milk yield

Action:

  • d=1: Keep the cow
  • d=2: Replace the cow at the end of the stage

The structure may be displayed graphically as a model tree. We will model 10 stages (finite horizon).


The model displayed as a tree

We have a nested structure:

  • The root of the model is the process itself
  • The process holds 10 stages (the time horizon)
  • Each stage holds 3 states (Low, Average, High)
  • Each state holds 2 actions (Keep, Replace)

The parameters:

  • Each action has a set of parameters attached:
    • A reward
    • A probability distribution (over the states at the next stage)

Implemented in the MLHMP software system.
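The nested process → stages → states → actions layout can be sketched as a plain nested data structure. This is illustrative only (it is not the MLHMP file format); every action leaf carries the two parameters named above.

```python
# Model tree for the 10-stage replacement model: each action leaf carries a
# reward and a transition distribution over the 3 states at the next stage.
STATES = ["Low", "Average", "High"]
ACTIONS = ["Keep", "Replace"]

model_tree = {
    "process": [
        {
            "stage": t,
            "states": {
                s: {a: {"reward": 0.0, "transition": [1 / 3] * 3}
                    for a in ACTIONS}
                for s in STATES
            },
        }
        for t in range(10)
    ]
}

# 10 stages x 3 states x 2 actions = 60 parameterized action leaves
n_leaves = sum(len(st["states"]) * len(ACTIONS) for st in model_tree["process"])
```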


The MLHMP software – the model tree window


Summary of the model tree

The nested structure of an MDP is shown directly. Each value (stage, state, and action) is displayed as an icon of a certain type. A label is (optionally) attached to each value to ease the (human) interpretation of the values. Asymmetric models are easily handled (and displayed).


5-10 Minute Break


The curse of dimensionality

Age dependency of milk yield

[Figure: mean milk yield (kg ECM) by parity, 1st–4th parity, roughly 4400–6000 kg ECM]


An extended model, I

State variables

  • Age
  • Parity 1
  • Parity 2
  • Parity 3
  • Parity 4
  • Relative milk yield
  • Low
  • Average
  • High


An extended model, II


An extended model, III


An extended model, IV


Let us take a look at the model tree


Age and genotype dependency

[Figure: milk yield (kg ECM) by parity (1–4) for low, average, and high genetic merit, roughly 1000–7000 kg ECM]


A further extended model

State variables

  • Genetic merit:
  • Low,
  • Average,
  • High
  • Age:
  • Parity 1,
  • Parity 2,
  • Parity 3,
  • Parity 4
  • (Relative) milk yield:
  • Low,
  • Average,
  • High


Rewards and output


Transition probabilities, Keep


Transition probabilities, Replace


We shall again take a look at the graphical display


An example: Houben et al. (1994)

State variables:

  • Age (monthly intervals, 204 levels)
  • Milk yield, present lactation (15 levels)
  • Milk yield, previous lactation (15 levels)
  • Length of calving interval (8 levels)
  • Mastitis, present lactation (4 levels)
  • Mastitis, previous lactation (4 levels)
  • Clinical mastitis (yes/no)

Total state space: 6,821,724 states

Houben, E. P. H., R. B. M. Huirne, A. A. Dijkhuizen & A. R. Kristensen. 1994. Optimal replacement of mastitis cows determined by a hierarchic Markov process. Journal of Dairy Science 77, 2975-2993.
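Multiplying the listed levels gives the full Cartesian product; a quick sketch of the arithmetic. The reported 6,821,724 is smaller than the product, presumably because infeasible combinations (e.g. "previous lactation" variables in parity 1) are excluded in the asymmetric model, as the model-tree summary above suggests.

```python
# Levels per state variable, in the order listed above:
# age, milk yield (present/previous), calving interval,
# mastitis (present/previous), clinical mastitis (yes/no).
levels = [204, 15, 15, 8, 4, 4, 2]
full_product = 1
for n in levels:
    full_product *= n
print(full_product)   # full Cartesian product of all levels
```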


The curse of dimensionality

If

  • state variables are represented at a realistic number of levels
  • all relevant state variables are included in the model

then

  • the state space grows to prohibitive dimensions

Solution:

  • Hierarchical models


5-10 Minute Break

Hierarchical Markov Decision Processes

Important observations, transition matrix

Most elements are zero because

  • Age is included as a state variable
  • Some state variables are constant within an animal
  • Some state variables are constant over several stages

If state numbers are defined appropriately, the non-zero elements are arranged in a certain pattern. This can be utilized for a hierarchical organisation of the state space!


Illustration of the hierarchy for the example

Optimization technique

  • Policy iteration in the founder process (exact)
  • Value iteration in the child processes (exact)

The positive properties of both techniques are combined into a very efficient and exact hierarchic technique
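Policy iteration at the founder level can be sketched in isolation. The minimal version below uses a discounted criterion for simplicity (the lecture's criteria also include average reward, and the MLHMP implementation differs), with an invented 3-state, 2-action example.

```python
import numpy as np

def policy_iteration(P, R, beta=0.95):
    """Exact policy iteration, infinite horizon, discounted criterion.

    P[a, i, j]: transition probabilities; R[a, i]: immediate rewards.
    """
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Evaluation: solve (I - beta * P_pi) V = R_pi exactly
        P_pi = P[policy, np.arange(n_states)]
        R_pi = R[policy, np.arange(n_states)]
        V = np.linalg.solve(np.eye(n_states) - beta * P_pi, R_pi)
        # Improvement: greedy action in every state
        new_policy = (R + beta * P @ V).argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

# Invented founder-level example (3 states, 2 actions)
P = np.array([[[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.1, 0.3, 0.6]],
              [[0.2, 0.6, 0.2]] * 3])
R = np.array([[1.0, 5.0, 8.0], [4.0, 4.0, 4.0]])
policy, V = policy_iteration(P, R)
```

The exact linear-equation evaluation step is what makes policy iteration exact for infinite horizons, at the cost of solving a system whose size equals the state space, which is why the hierarchical construction keeps the founder-level state space small.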

[Diagram: a founder process spanning child processes for Cow 1, Cow 2, and Cow 3; founder states are genetic merit with a dummy action, and each child process runs over lactations 1–4 with relative milk yield states and Keep/Replace actions]


The dairy cow replacement model as a hierarchical process

Founder process:

  • Stage: Life time of a cow
  • State: Genetic merit
  • Action: Dummy

Child process:

  • Stage: A lactation cycle
  • State: Milk yield (relative to genetic merit and lactation)
  • Action: Keep/Replace

Benefits:

  • The age of the cow is known from the child-level stage
  • The size of the transition matrices is reduced to 3 × 3 (compared to 36 × 36 in the original model)


Multi-level processes

The hierarchy may be extended to several levels. The curse of dimensionality is circumvented. Decisions at different levels (time horizons) are optimized simultaneously.


Example: Dimensionality

State variables in original model (van Arendonk 1985):

  • Age (months) (1-144)
  • Milk yield, previous lactation (1-15)
  • Milk yield, present lactation (1-15)

Total number of states: 29,880
Stage length: 1 month
Matrix dimension: 29,880 × 29,880
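The matrix dimension alone shows why the flat formulation is impractical; a quick sketch of the storage cost for one dense transition matrix:

```python
# Dense transition matrix for the flat 29,880-state formulation.
n_states = 29_880
entries = n_states ** 2            # matrix entries per action
gigabytes = entries * 8 / 1e9      # as 64-bit floats
print(entries, round(gigabytes, 1))
```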


As a 2 or 3-level process


Model tree of a hierarchical MDP

In any state of an MDP, the choice of action influences:

  • The immediate reward
  • The probability distribution of the state at next stage

In a hierarchical model the action (of a parent process) is modeled as a separate embedded finite-time MDP (a child process):

  • The reward is the expected sum of total rewards of the child process
  • The transition probability distribution is calculated as the matrix product of all transition matrices of the child process

The action of a parent process is thus an ordinary action whose reward and transition probabilities are calculated in a special way (from the child process). In the model tree we just add a child process to the action!
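The two constructions can be sketched directly: the parent-level transition matrix is the product of the child-stage matrices, and the parent-level reward is the expected total child reward obtained by backward accumulation. All matrices below are invented for illustration.

```python
import numpy as np

# A 4-stage child process: one 3x3 transition matrix and one reward
# vector per child stage (numbers invented for illustration).
P_child = [np.array([[0.6, 0.3, 0.1],
                     [0.2, 0.6, 0.2],
                     [0.1, 0.3, 0.6]]) for _ in range(4)]
r_child = [np.array([1.0, 2.0, 3.0]) for _ in range(4)]

# Parent-level transition: matrix product over all child stages
P_parent = np.linalg.multi_dot(P_child)

# Parent-level reward per initial child state: g_t = r_t + P_t @ g_{t+1}
g = np.zeros(3)
for P_t, r_t in zip(reversed(P_child), reversed(r_child)):
    g = r_t + P_t @ g

# P_parent rows still sum to 1; g[i] is the expected total child reward
# when the child process starts in state i.
```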


Model tree of a hierarchical MDP


Markov chain simulation


As a supplement to the optimal policy, various technical and economic key figures characterizing the optimal policy may be calculated by Markov chain simulation. The MLHMP software implements this (refer to the exercises). It is sometimes considered a separate modeling technique (which it is not). It is done by re-defining rewards and outputs and solving a set of linear equations.
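The "solve a set of linear equations" step can be sketched for a fixed policy: the long-run average of any redefined output is the stationary distribution of the induced Markov chain times the per-state output. All numbers below are invented.

```python
import numpy as np

# Markov chain induced by a fixed policy (invented 3x3 matrix) and a
# redefined per-state output, e.g. kg ECM produced per stage.
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
output = np.array([5000.0, 6000.0, 7000.0])

# Stationary distribution: solve pi @ P = pi together with sum(pi) = 1.
n = P.shape[0]
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.append(np.zeros(n), 1.0)
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

avg_output = pi @ output   # long-run average output per stage
```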


Concluding remarks - Difficulties when modeling

The curse of dimensionality

  • (Multi-level) Hierarchical processes

Decisions on multiple time scales

  • (Multi-level) Hierarchical processes

Assumption of mutual independence (animals, pens, etc.)

  • Inherent problem of the method
  • Parameter iteration

The Markov property

  • Memory variables
  • Bayesian
