Optimal Control and Dynamic Programming - 4SC000 Q2 2017-2018 - Duarte Antunes


SLIDE 1

4SC000 Q2 2017-2018

Optimal Control and Dynamic Programming

Duarte Antunes

SLIDE 2

Introduction

SLIDE 3

Outline

  • Course info
  • Introduction to optimal control and applications
  • Dynamic programming algorithm
SLIDE 4

Course information

Teaching staff

  • Lecturer: Duarte Antunes (d.antunes@tue.nl)
  • Assistants: Eelco van Horssen (e.p.v.horssen@tue.nl), Ruben di Filippo (r.di.filippo@student.tue.nl)

Grading

  • 1 exam and 3 homework assignments, to be submitted via MATLAB Cody Coursework*
  • Assignments can be solved individually or in a group (max 4 people); select your group via Canvas.
  • If solved in a group, presence at BZ 7 and BZ 14 is mandatory to discuss the individual grades of assignments 1 and 2, respectively. Peer assessment of homework 3 should be sent by Feb 4th.
  • Contributions to the final grade: each of the 3 homework assignments, 40/3%; exam, 60%.


*https://coursework.mathworks.com/; students will receive an email to register

Question hours

Monday:    12h30-13h30, Eelco,  GEM-Z 0.55
Tuesday:   17h30-18h30, Ruben,  GEM-Z 0.55
Wednesday: 15h45-17h30, BZ,     Paviljoen U46
Thursday:  12h45-13h45, Duarte, GEM-Z -1.139
Friday:    10h45-12h30, BZ,     Paviljoen U46

Since there are question hours every working day, no further appointments will be scheduled and we try to avoid answering questions via e-mail.

SLIDE 5

Course schedule

[Calendar: lectures L1-L16, guided self-study sessions BZ1-BZ16, and the exam, spread over November (from the 13th), December, January, and February]

Lectures (L): Wednesdays 13h45-15h30 (LUNA 1.050), Fridays 8h45-10h30 (Paviljoen B2).
Guided self-study (BZ): Wednesdays 15h45-17h30 (Paviljoen U46), Fridays 10h45-12h30 (Paviljoen U46).
Deadlines: PSI: Dec. 5th, 23h45; PSII: Jan. 11th, 23h45; PSIII: Feb. 4th, 23h45.
Exam: February 1st, 13h30-16h30; retake: April 12, 18h00-21h00.

* In BZ7 and BZ14, the grade of each member of a group for homework 1 and 2, respectively, will be discussed.

SLIDE 6

Course material

Main course material

  • Slides and Problem sets
  • Dynamic Programming and Optimal Control, Dimitri P. Bertsekas, Athena Scientific, Volume I, 2005, ISBN: 1-886529-08-6, Chapters 1-6*

Further reading

  • Applied Optimal Control, Arthur E. Bryson, Jr. and Yu-Chi Ho, CRC Press, 1975, ISBN-13: 978-0891162285, Chapters 1-3

  • Calculus of Variations and Optimal Control Theory, D. Liberzon, Princeton University Press, 2012, ISBN-13: 978-0-691-15187-8, Chapters 1-4

  • Planning Algorithms, Steven M. LaValle, Cambridge University Press, 2006, ISBN-13: 978-0521862059, Chapters 1-12, 14

  • Other reference books [1]-[10]

*Slides and video lectures available at http://www.athenasc.com/dpbook.html

SLIDE 7

Outline of the course


Topic I: Discrete optimization problems

  • 1. Introduction and the dynamic programming algorithm
  • 2. Stochastic dynamic programming
  • 3. Shortest path problems in graphs
  • 4. Bayes filter and partially observable Markov decision processes

Topic II: Stage decision problems

  • 5. State-feedback controller design for linear systems -LQR
  • 6. Optimal estimation and output feedback- Kalman filter and LQG
  • 7. Discretization
  • 8. Discrete-time Pontryagin’s maximum principle
  • 9. Approximate dynamic programming

Topic III: Continuous-time optimal control problems

  • 10. Hamilton-Jacobi-Bellman equation and deterministic LQR in continuous-time
  • 11. Linear quadratic control in continuous-time - LQR/LQG
  • 12. Frequency-domain properties of LQR/LQG
  • 13. Pontryagin’s maximum principle I
  • 14. Pontryagin’s maximum principle II

  • 15 & 16. Revision/sample exam

SLIDE 8

Position in the MSc programs


Systems and control oriented programs

  • Systems and control, Mechanical and Electrical Engineering with control specialisation
  • Clear track: Q1 System Theory for Control, Q2 Optimal Control, Q3 Model Predictive Control

  • Optimal control is one of the cornerstones of control systems theory

Other programs

  • Optimal control and dynamic programming is a very broad subject and may be useful for you.
  • For example, for Automotive students: optimal control appears in many automotive applications, such as optimization of powertrains, optimal power management in hybrid vehicles, etc.

SLIDE 9

Background


Matlab

  • Nice intro: https://matlabacademy.mathworks.com/.
  • Best way to learn: read Matlab documentation and gain experience.

Optimization

  • Notions of gradient, convex functions, constraints; see Appendix B of Bertsekas' book.
  • Advanced book: Convex Optimization, Boyd and Vandenberghe, available at http://stanford.edu/~boyd/cvxbook/.

System theory

  • Basic knowledge of concepts such as state-space representation, observability, and controllability is useful.

  • The course ‘System Theory for Control’ taught at TU/e is enough.
  • If you have not taken that course, a suggested book is ‘Linear Systems Theory’, João Hespanha, 2009.

Probability theory

  • Basic notions, see Appendix C of Bertsekas’ book.
SLIDE 10

Outline

  • Course info
  • Introduction to optimal control and applications
  • Dynamic programming algorithm
SLIDE 11

Optimal control

Optimality

  • Useful design principle in many engineering contexts (optimize the efficiency of a refrigerator, minimize the fuel consumption of a car, etc.).

  • Nature is described by laws derived from optimality principles.
  • We optimize every day to make decisions (true?).

Optimal control

  • Deals with problems in which optimal decisions or control actions are pursued over a time period in order to reach final and intermediate goals.
  • Arises in the control of physical systems (e.g., mechanical, electrical, biological) and in many other contexts (e.g., economics, computer science, and game theory).

SLIDE 12

Static optimization

  • Determine one optimal decision.
  • Examples: decide on the price of a product, determine the slope of a straight line which best fits data, etc.

Optimal control vs static optimization

[Figure: cost J as a function of the decision u, with minimizer u∗ and minimum value J(u∗)]

SLIDE 13

Optimal control

  • Determine several optimal decisions over time.
  • Decisions are functions of the state, i.e., a control law, to cope with disturbances.
  • Examples: driving a car/bike in a race, positioning the tip of a robot arm in the presence of disturbances, playing chess, etc.

Optimal control vs static optimization

[Figure: trajectory θ(0), θ(1), θ(2), θ(3); a disturbance at time t = 2 changes θ(2) to θ̃(2), so that θ(t) ≠ θ̃(t) for t ≥ 2]

SLIDE 14

Optimal control formulation

Dynamic model

  • Specifies the rules of the problem or the equations of the physical system.
  • State: summarizes relevant information to make future decisions.
  • Control actions: influence the evolution of the state over time.
  • State evolution may be deterministic or stochastic (driven by disturbances).

Cost function

  • Encapsulates the goals to be achieved in the problem.
  • Typically additive over time and by convention should be minimized.

Goal: find a control policy which minimizes the cost

  • Policy: set of functions mapping the state at each instant of time to an action.
  • Related problem: compute an optimal path/trajectory consisting of optimal decisions over time for a given initial state.
SLIDE 15

Optimal control problems


Problem class                               Time         State space
Discrete optimization problems*             discrete     discrete
Stage decision problems                     discrete     general
Continuous-time optimal control problems    continuous   general

Three classes of problems will be considered in the course. Some applications are discussed next, and more applications appear later. However, there are many others - see Appendix B.
SLIDE 16

Applications

Operational research, management, finance

  • inventory control, control of a queue, control of networks (data, traffic, etc.), etc.

Computer Science

  • shortest path in graphs, scheduling, selection problems, among others.

Other fields

  • Computational biology, automotive, games, many others.

The next slides address some applications treated in the course, where we will also consider cases where uncertainty is present.

Aerospace

  • minimum-fuel launch of a satellite, etc.

Traditional process control

  • controlling an inverted pendulum, mass-spring damper, double integrator, quadcopter, etc.
SLIDE 17

Discrete Optimization Problems

[Figure: transition diagram with stages 0, 1, …, h−1, h; circles (states n_k,i) at each stage are connected by arrows carrying costs c^k_ij, with terminal costs c^h_i at stage h]

Specified by a transition diagram with decision stages

  • Dynamic model: circles indicate the states at each of the stages; arrows indicate the actions for each state, which lead to states at the next stage.
  • Costs c^k_ij are associated with the action taking state i at stage k to state j at stage k+1, for k ∈ {0, …, h−1}; for the terminal stage h, the cost c^h_i depends only on the state i.

SLIDE 18

Discrete Optimization Problems

[Figure: example transition diagram with arc costs]

Challenges

(i) Determine an optimal path for a given initial state, which minimizes the sum of the costs incurred at every stage (including the terminal stage).
(ii) Determine an optimal policy, specifying for each state the first decision of the optimal path from that state to the terminal stage.

SLIDE 19

Inventory control

How to manage the supply of products in a shop? Overstock is detrimental (physical space limitations, technological obsolescence, etc.) and understock undermines sales.

SLIDE 20

Shortest paths in graphs

What is the shortest distance from Bucharest to Lugoj?

[Figure: road map of Romania with intercity distances*]

*Source: Artificial Intelligence: A Modern Approach, Stuart J. Russell, Peter Norvig, 3rd edition, 2016

SLIDE 21

Robot path planning

[Figure: grid map with start point A and goal point B]

What is the shortest path for a robot to go from point A to B?

SLIDE 22

Games

How to make a profit in expectation in a game such as blackjack? As portrayed in the movie ‘21’, the MIT blackjack team had an answer to this problem (using optimal control?). The same movie unveils the principle to achieve this, using a famous game show problem: https://www.youtube.com/watch?v=Zr_xWfThjJ0

SLIDE 23

Stage decision problems

Dynamic model

x_{k+1} = f_k(x_k, u_k), k ∈ {0, …, h−1}

Cost function

Σ_{k=0}^{h−1} g_k(x_k, u_k) + g_h(x_h)

Goals

(i) Find a policy π = {μ_0, …, μ_{h−1}}, with u_k = μ_k(x_k), that leads to the minimum cost for every initial condition.
(ii) Find a path {(x_0, u_0), (x_1, u_1), …, (x_{h−1}, u_{h−1})} that leads to the minimum cost for a given initial condition.
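To make the notation concrete, here is a minimal MATLAB sketch (all model functions below are made-up assumptions, not from the slides) that evaluates the cost of a given policy by forward simulation:

```matlab
% Evaluating a policy pi = {mu_0, ..., mu_{h-1}} by forward simulation
% of x_{k+1} = f_k(x_k, u_k). All functions below are illustrative assumptions.
h  = 10;                         % horizon
f  = @(k, x, u) x + u;           % dynamics f_k (assumption)
g  = @(k, x, u) x^2 + u^2;       % stage cost g_k (assumption)
gh = @(x) x^2;                   % terminal cost g_h (assumption)
mu = @(k, x) -0.5*x;             % policy u_k = mu_k(x_k) (assumption)

x = 5;                           % initial condition x_0
J = 0;
for k = 0:h-1
    u = mu(k, x);                % action prescribed by the policy
    J = J + g(k, x, u);          % accumulate the stage cost
    x = f(k, x, u);              % state update
end
J = J + gh(x);                   % add the terminal cost
fprintf('cost of the policy: %.2f\n', J);
```

Finding the minimizing policy, rather than merely evaluating a given one, is what the dynamic programming algorithm later in this lecture does.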

SLIDE 24

Stage decision problems

Generalization of the discrete optimization problem, considering general state and input spaces, e.g., Rn.

[Figure: stages 0 to h with state spaces X_0, …, X_h; a trajectory x_0, x_1, …, x_h incurs stage costs g_0(x_0, u_0), …, g_{h−1}(x_{h−1}, u_{h−1}) and terminal cost g_h(x_h)]

SLIDE 25

Digital control

[Figure: digital control loop: physical system with sensors and actuators, A/D and D/A converters, and a digital controller]

Prime application: how to design a digital controller for a physical system? Several variants: the full state is available or only an output, the system can have disturbances or not, etc.

SLIDE 26

Mixing

How to mix two fluids in minimum time?

[Figure: control loop with camera images as measurements; actuation: 4 possible rotations, decided once every h seconds]

SLIDE 27

Digital control of a unicycle robot

Given a unicycle robot with constraints on speed and rotation rate, controlled digitally, how to perform a curve maneuver in minimum time?

SLIDE 28

Continuous-time optimal control problems

Dynamic model

ẋ(t) = f(t, x(t), u(t)), x(0) = x_0, t ∈ [0, h]

Cost function

∫_0^h g(t, x(t), u(t)) dt + g_h(x(h))

Goals

  • Find a feedback policy u(t) = μ(t, x(t)) minimizing the cost function for every initial condition.
  • Find a control input u(t), t ∈ [0, h], minimizing the cost function for a given initial condition.
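A special case that appears later in the course (lectures 10-12) is the continuous-time LQR problem; the block below instantiates f and g with the standard linear quadratic choices (standard definitions, not spelled out on this slide):

```latex
\dot{x}(t) = A x(t) + B u(t), \qquad
\int_0^h \big( x(t)^{\top} Q x(t) + u(t)^{\top} R u(t) \big)\, dt + x(h)^{\top} Q_h x(h),
```

with weighting matrices Q ⪰ 0, Q_h ⪰ 0 and R ≻ 0.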

SLIDE 29

Continuous-time optimal control problems

[Figure: state trajectory in Rn from x(0) at t = 0 to x(h) at t = h, evolving according to ẋ(t) = f(t, x(t), u(t))]

Most applications in control systems: motion control, aerospace, etc.

SLIDE 30

Minimum energy control

How to move a motion system described by a linear equation from point A to point B with minimum energy?

ẋ(t) = Ax(t) + Bu(t), x(0) = x_0, x(T) = x_desired

min_u ∫_0^T g(u(t)) dt

[Figure: planar positioning system with forces Fx, Fy moving from A to B]
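For reference (a standard result, not derived on this slide): in the special case g(u(t)) = ‖u(t)‖² with (A, B) controllable, the minimizer has a closed form in terms of the controllability Gramian:

```latex
W(T) = \int_0^T e^{A\tau} B B^{\top} e^{A^{\top}\tau}\, d\tau, \qquad
u^{*}(t) = B^{\top} e^{A^{\top}(T-t)}\, W(T)^{-1} \big( x_{\mathrm{desired}} - e^{AT} x_0 \big),
```

and the minimum energy equals (x_desired − e^{AT} x_0)^⊤ W(T)^{−1} (x_desired − e^{AT} x_0).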

SLIDE 31

Minimum time control

How to move a (linearized model of a) quadcopter from one hovering position x_0 to another x_T in minimum time?

ẋ(t) = Ax(t) + Bu(t), x(0) = x_0, x(T) = x_desired, min T

SLIDE 32

Energy management of Hybrid Electric Vehicles

Hybrid electric vehicles have a battery where energy can be stored (e.g., during braking). Given a drive cycle, how to design the power split between the battery and the internal combustion engine to minimize fuel consumption?

SLIDE 33

Outline

  • Course info
  • Introduction to optimal control and applications
  • Dynamic programming algorithm
SLIDE 34

Dynamic programming

  • Dynamic programming is an approach to solve optimal control problems.
  • It allows us to find functions mapping states into actions. These functions are called policies or control laws.
  • One can use these functions to control a system in the presence of disturbances.
  • It also allows us to compute optimal paths/trajectories, although, as we shall see, other methods might be more efficient.

SLIDE 35

The principle of optimality

Example: the shortest route from Eindhoven to Paris passes through Antwerp. Then the piece of the route from Antwerp to Paris is the shortest route between these two cities.

SLIDE 36

The principle of optimality

The tail of an optimal path is also optimal

  • Given an optimal path for a discrete optimization problem from stage 0 to stage h, consider the state x_j at stage j belonging to the optimal path.
  • Then the decisions along the optimal path from stage j to stage h are also optimal for the discrete optimization problem with initial stage j, initial state x_j, and final stage h.

[Figure: an optimal path from stage 0 to stage h and its tail from stage j to stage h]

  • The principle of optimality also holds for stage decision problems and continuous-time optimal control problems, and it is the basis of the dynamic programming algorithm.
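In the notation of the stage decision problems above, the principle can be stated as follows (a paraphrase of the slide, not an additional result):

```latex
\{u_j^{*}, \dots, u_{h-1}^{*}\} \in \arg\min \; \sum_{k=j}^{h-1} g_k(x_k, u_k) + g_h(x_h)
\quad \text{subject to } x_j = x_j^{*},\; x_{k+1} = f_k(x_k, u_k),
```

whenever {u_0*, …, u_{h−1}*} is optimal for the full problem from stage 0 and x_j* is the state this optimal path reaches at stage j.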

SLIDE 37

Proof of the principle of optimality

1. Consider an optimal path from stage 0 to stage h, with cost = cost[0, j) + cost[j, h].
2. Suppose that the piece of the path from stage j to stage h is not optimal.
3. Then there exists a path from stage j to stage h with smaller cost, cost'[j, h] < cost[j, h], so that cost[0, j) + cost'[j, h] < cost[0, j) + cost[j, h] - contradiction!

SLIDE 38

The dynamic programming algorithm

The dynamic programming algorithm for discrete optimization problems:

(1) Start at the final decision stage h and denote the terminal cost by the cost-to-go at stage h: J_h(i) = c^h_i.

(2) For every state i at stage k = h−1, compute the optimal action as follows:

    J_k(i) = min over actions/arrows j of [ c^k_ij + J_{k+1}(state at stage k+1 when j is picked) ]

Denote the minimum J_k(i) by the cost-to-go at stage k.

(3) Repeat (2) for stages k ∈ {h−2, h−3, …, 1, 0}, moving backwards.

Then, the function which maps each state to the action obtained in (2) is an optimal policy.

Main idea

  • First find the optimal policy and paths from stage j to stage h, and then use these to compute the optimal policy and paths from stage j−1 to stage h (principle of optimality).
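The sketch below is a minimal MATLAB rendering of steps (1)-(3), under an assumed encoding of the transition diagram (the cost numbers are made up for illustration; it uses implicit expansion, available from MATLAB R2016b):

```matlab
% DP algorithm on a transition diagram. Assumed encoding: C{k}(i,j) is the
% cost of the arrow from state i at stage k-1 to state j at stage k
% (use Inf for missing arrows); cterm(i) is the terminal cost of state i.
C = {[2 5; 1 3], [4 1; 2 2]};    % two decision stages (made-up numbers)
cterm = [0; 4];                  % terminal costs at stage h
h = numel(C);

J = cterm;                       % (1) cost-to-go at stage h
mu = cell(1, h);                 % optimal arrow choice per stage and state
for k = h:-1:1                   % (2)-(3) move backwards through the stages
    [J, mu{k}] = min(C{k} + J.', [], 2);   % J_k(i) = min_j C_ij + J_{k+1}(j)
end
disp(J.')                        % cost-to-go at stage 0 for each initial state
```

The tables mu{1}, …, mu{h} are precisely the optimal policy: mu{k}(i) is the decision prescribed at state i of stage k−1.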

SLIDE 39

Example

[Figure: example transition diagram with four decision stages and arc costs]

Iteration 1 - Stage 3

State   Cost-to-go
1       min{5 + 0, 2 + 4} = 5
2       min{4 + 0, 2 + 4} = 4
3       min{0 + 4, 5 + 0} = 4

SLIDE 40

Example

[Figure: the same transition diagram with the stage-3 costs-to-go 5, 4, 4 filled in]

Iteration 2 - Stage 2

State   Cost-to-go
1       min{4 + 5, 3 + 4} = 7
2       min{1 + 5, 3 + 4} = 6
3       1 + 4 = 5
4       min{2 + 4, 5 + 4} = 6

SLIDE 41

Example

[Figure: the same transition diagram with the stage-2 costs-to-go 7, 6, 5, 6 filled in]

Iteration 3 - Stage 1

State   Cost-to-go
1       min{4 + 6, 3 + 5} = 8
2       min{0 + 6, 1 + 6} = 6
3       min{1 + 7, 3 + 6, 1 + 6} = 7

SLIDE 42

Example

[Figure: the same transition diagram with the stage-1 costs-to-go 8, 6, 7 filled in]

Iteration 4 - Stage 0

State   Cost-to-go
1       min{2 + 7, 1 + 6} = 7
2       min{1 + 8, 4 + 6} = 9

SLIDE 43

Optimal policy and optimal paths

[Figure: the initial transition diagram annotated with the optimal policy; the costs-to-go at stage 0 are 7 and 9]

Optimal policy

  • While running the DP algorithm, for each state at each stage a decision is made to compute the cost-to-go. That decision is precisely the decision specified by the optimal policy.

Optimal path

  • For a given initial state, follow the arrows leading to the final stage. This is the optimal path.
  • The cost-to-go at stage 0 of that initial state coincides with the cost of the optimal path.

SLIDE 44

Non-uniqueness

The optimal policy and the optimal paths may not be unique.

  • If more than one option has the same cost while running the dynamic programming algorithm, simply pick one of the options. At the end, one optimal policy is obtained (while several may be optimal).
  • Example: at stage 1, state 2, both decisions have the same cost 3.

[Figure: a transition diagram admitting two optimal paths and two optimal policies]

SLIDE 45

Inventory control

Controlling the supply of one product

  • Dynamic model: x_{k+1} = max{x_k + u_k − d_k, 0}, with supply u_k ∈ {0, 1, …, N − x_k}
  • Cost: Σ_{k=0}^{h−1} g_k(x_k, u_k) + g_h(x_h), with stage cost

    g_k(x_k, u_k) = c_1(x_k) + c u_k + c_tr ‖u_k‖_0 − p min{d_k, x_k + u_k},  where ‖u_k‖_0 = 0 if u_k = 0 and 1 if u_k ≠ 0

  • Notation: x_k number of items, u_k supply, d_k demand, N capacity, g_h terminal cost, p selling price, c purchase price, c_1 storage cost, c_tr transportation price

SLIDE 46

Formulation as discrete optimization problem

Transition diagram

  • Circles at each stage indicate the number of items, from 0 to N.
  • Supplies determine the transitions: from state i at stage k, the supply u_k = 0 leads to max{i − d_k, 0}, u_k = d_k keeps the state at i, and u_k = N − i (a full restock) leads to N − d_k.

[Figure: inventory transition diagram from stage 0 to stage h]

SLIDE 47

Inventory control

What are the optimal supplies for a zero initial inventory?

  • Model (as before): x_{k+1} = max{x_k + u_k − d_k, 0}, u_k ∈ {0, 1, …, N − x_k},
    g_k(x_k, u_k) = c_1(x_k) + c u_k + c_tr ‖u_k‖_0 − p min{d_k, x_k + u_k}, total cost Σ_{k=0}^{h−1} g_k(x_k, u_k) + g_h(x_h)
  • Data: selling price p = 10, purchase price c = 5, transportation price c_tr = 0.5, capacity N = 4, number of stages h = 4, demands d_0 = d_1 = 2, d_2 = d_3 = 1, storage cost c_1(i) = 0.2 i for i ∈ {0, …, N}, terminal cost g_4(i) = −r_{i+1} for i ∈ {0, …, 4}, with r = [0 4.8 9.6 14.4 19.2]
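A MATLAB sketch of the DP recursion on this data (variable names are our own; ties between equally good supplies are broken toward the smallest one, cf. the non-uniqueness slide) reproduces the optimal cost −28.4 reported on the next slides:

```matlab
% DP solution of the inventory example above.
p = 10; c = 5; ctr = 0.5; N = 4; h = 4;
d = [2 2 1 1];                         % demands d_0, ..., d_3
r = [0 4.8 9.6 14.4 19.2];
J = -r;                                % terminal cost g_4(i) = -r(i+1), states x = 0..N
mu = zeros(h, N+1);                    % mu(k, x+1): optimal supply at stage k-1, state x
for k = h:-1:1                         % backward recursion over the stages
    Jnew = inf(1, N+1);
    for x = 0:N
        for u = 0:(N - x)              % admissible supplies
            g = 0.2*x + c*u + ctr*(u ~= 0) - p*min(d(k), x + u);
            cost = g + J(max(x + u - d(k), 0) + 1);
            if cost < Jnew(x+1)        % ties resolved toward the smallest u
                Jnew(x+1) = cost;  mu(k, x+1) = u;
            end
        end
    end
    J = Jnew;
end
fprintf('cost-to-go at stage 0 for x0 = 0: %.1f\n', J(1));  % -28.4
x = 0;                                 % forward pass: optimal supplies from x0 = 0
for k = 1:h
    fprintf('u_%d = %d\n', k-1, mu(k, x+1));
    x = max(x + mu(k, x+1) - d(k), 0);
end
```

With this tie-breaking, the forward pass yields the supplies (2, 4, 0, 0), one of the optimal supply sequences of total cost −28.4.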

SLIDE 48

Some iterations

[Figure: DP iterations at stages 3 and 2 of the inventory transition diagram, with the computed costs-to-go for each state]

SLIDE 49

Final policy and optimal path

[Figure: costs-to-go for all stages and states, the final policy, and the optimal path from a zero initial inventory for the demands d_0 = 2, d_1 = 2, d_2 = 1, d_3 = 1]

Cost for a zero initial inventory: −28.4

SLIDE 50

[Photo: Richard E. Bellman]

Historical note

  • Dynamic programming was proposed in the 1940s by Richard E. Bellman.

‘I was intrigued by dynamic programming. It was clear to me that there was a good deal of good analysis there. Furthermore, I could see many applications. It was a clear choice. I could either be a traditional intellectual, or a modern intellectual using the results of my research for the problems of contemporary society.’
  - R. E. Bellman (1920-1984)
SLIDE 51

Concluding remarks

Summary

  • Optimal control: determine several optimal decisions over time and as a function of the state.
  • Optimal control problems: three classes (discrete optimization, stage decision, continuous-time control), many applications.

  • Dynamic programming: Optimal decisions computed from the end to the initial stage.

After this lecture, you should be able to:

  • Apply the dynamic programming algorithm.
  • Solve deterministic inventory control problems.
SLIDE 52

Appendix A

Inventory control with uncertainty

SLIDE 53

Coping with disturbances

In the context of the inventory control example above (SLIDES 45-49):

What if the demand is d_2 = 2 instead of the expected d_2 = 1?

  • The state at stage 3 is then x_3 = 0 instead of x_3 = 1.
  • Open loop: blindly pick u_3 = 0, as initially planned.
  • Closed loop (using the DP policy): pick u_3 = μ_3(0) = 1.

[Figure: portion of the inventory transition diagram around stages 2-4 with the relevant costs-to-go]

SLIDE 54

Open loop vs closed loop

Costs

  • Open loop: g_0(0, 4) + g_1(2, 0) + g_2(0, 2) + g_3(0, 0) + g_4(0) = 0.5 − 19.6 − 9.5 + 0 + 0 = −28.6
  • Closed loop: g_0(0, 4) + g_1(2, 0) + g_2(0, 2) + g_3(0, 1) + g_4(0) = 0.5 − 19.6 − 9.5 − 4.5 + 0 = −33.1

Expected cost if Prob[d_2 = 1] = 0.5, Prob[d_2 = 2] = 0.5

  • Open loop: 0.5 × (−28.4) + 0.5 × (−28.6) = −28.5
  • Closed loop: 0.5 × (−28.4) + 0.5 × (−33.1) = −30.75

SLIDE 55

Appendix B

References and applications

SLIDE 56

References

Textbooks

[1] D. Bertsekas, Dynamic Programming and Optimal Control, 3rd edition, Athena Scientific, Vol. I and II, 2005.
[2] A. E. Bryson and Y. C. Ho, Applied Optimal Control, CRC Press, 1975.
[3] D. Liberzon, Calculus of Variations and Optimal Control, Princeton University Press, 2012.
[4] M. Athans and P. L. Falb, Optimal Control, McGraw Hill, New York, 1966. Reprinted by Dover in 2006.
[5] B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods, Prentice Hall, New Jersey, 1990. Reprinted by Dover in 2007.
[6] F. L. Lewis, D. Vrabie, and V. L. Syrmos, Optimal Control, 3rd edition, John Wiley & Sons, 2012.
[7] D. E. Kirk, Optimal Control Theory: An Introduction, Dover Books on Electrical Engineering, 2004.
[8] R. Bellman, Dynamic Programming, Princeton University Press, 1957.
[9] K. Zhou, J. C. Doyle, and K. Glover, Robust and Optimal Control, Prentice Hall, New Jersey, 1996.
[10] G. Chen, G. Chen, and S.-H. Hsu, Linear Stochastic Control Systems, CRC Press, 1995.
[11] M. H. Davis, Linear Estimation and Stochastic Control, Chapman and Hall, 1977.
[12] P. Whittle, Optimal Control: Basics and Beyond, John Wiley and Sons Ltd, 1996.
[13] D. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete-Time Case, Athena Scientific, 1996.

Numerical methods

[14] J. Betts, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, SIAM, 2010.

SLIDE 57

References

Applications

Aerospace

[A1] A. E. Bryson, Jr., Applications of optimal control theory in aerospace engineering, Journal of Spacecraft and Rockets, Vol. 4, No. 5 (1967), pp. 545-553.
[A2] J. M. Longuski, J. J. Guzmán, and J. E. Prussing, Optimal Control with Aerospace Applications, Springer, 2014.

Biomedicine and sequential alignment of DNA

[A3] G. W. Swan, Applications of Optimal Control Theory in Biomedicine, Marcel Dekker, New York, 1984.
[A4] S. R. Eddy, What is dynamic programming?, Nature Biotechnology, 22, 909-910 (2004).
[A5] M. G. Neubert, Marine reserves and optimal harvesting, Ecology Letters, 6:843-849, 2003.

Power Systems

[A6] G. S. Christensen, M. E. El-Hawary, and S. A. Soliman, Optimal Control Applications in Electric Power Systems, Plenum Press, New York, 1987.

SLIDE 58

References

Applications

Operational research and inventory control

[A7] M. H. Davis, Markov Models and Optimization, Chapman and Hall/CRC, 1993.
[A8] A. Bensoussan, Dynamic Programming and Inventory Control, IOS Press, 2011.

Finance and economics

[A9] S. P. Sethi and G. L. Thompson, Optimal Control Theory: Applications to Management Science and Economics, 2nd edition, Springer, New York, 2005.

Computer science and scheduling problems

[A10] A. Lew and H. Mauch, Dynamic Programming: A Computational Tool, Springer Verlag, 2007.

Automotive

[A11] B. de Jager, T. van Keulen, and J. Kessels, Optimal Control of Hybrid Vehicles, Springer, 2013.

SLIDE 59

References

Seminal papers

[S1] Anonymous, "Letter sent to Charles Montague, President of the Royal Society, where two mathematical problems proposed by the celebrated Johann Bernoulli are solved", Acta Eruditorum Lipsiae, (1697) 223.
[S2] R. Bellman, The theory of dynamic programming, Bull. Amer. Math. Soc., 60 (1954), no. 6, 503-515.
[S3] R. E. Kalman, Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana, 5:102-119, 1960. Reprinted in Control Theory: Twenty-Five Seminal Papers (1931-1981), T. Basar, editor, IEEE Press, New York, 2001, pages 149-166.
[S4] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, Interscience, New York, 1962.
[S5] J. C. Doyle, K. Glover, P. P. Khargonekar, and B. A. Francis, State-space solutions to standard H2 and Hinf control problems, IEEE Trans. Automat. Control, 34:831-847, 1989.

History

[H1] A. E. Bryson, Jr., Optimal control - 1950-1985, IEEE Control Systems Magazine, 1996.
[H2] H. J. Sussmann and J. C. Willems, 300 years of optimal control: from the brachystochrone to the maximum principle, 1997.
[H3] R. E. Bellman, Eye of the Hurricane: An Autobiography, 1984.
[H4] S. Dreyfus, Richard Bellman on the birth of dynamic programming, Operations Research, Vol. 50, No. 1, Jan-Feb 2002, pp. 48-51.