Optimization and Control IIB
Table of Contents
(This is last year's table of contents and will be updated soon.)
PART I. DYNAMIC PROGRAMMING 1
1 Dynamic Programming: The Optimality Equation 1
1.1 Control as optimization over time 1
1.2 The principle of optimality 1
1.3 Example: the shortest path problem 2
1.4 The optimality equation 2
1.5 Markov decision processes 4
2 Some Examples of Dynamic Programming 6
2.1 Example: managing spending and savings 6
2.2 Example: minimizing flow time on a single machine 7
An interchange argument 7
2.3 Example: optimal exercise of a stock option 8
2.4 Example: accepting the best offer 9
3 Dynamic Programming over the Infinite Horizon 11
3.1 Discounted costs 11
3.2 The infinite-horizon case 11
3.3 The optimality equation in the infinite-horizon case 12
3.4 Example: selling an asset 13
4 Positive Programming 15
4.1 Example: possible lack of an optimal policy 15
4.2 Characterisation of the optimal policy 15
4.3 Example: optimal gambling 16
4.4 Value iteration 16
4.5 Example: pharmaceutical trials 18
5 Negative Programming 19
5.1 Stationary policies 19
5.2 Characterisation of the optimal policy 19
5.3 Optimal stopping problems 20
5.4 Example: optimal parking 20
5.5 Optimal stopping over the infinite horizon 21
6 Average-cost Programming 23
6.1 Average-cost optimization 23
6.2 Example: admission control at a queue 24
6.3 Value iteration bounds 25
6.4 Policy improvement 25
PART II. LQG (linear/quadratic/Gaussian) SYSTEMS 29
7 LQ Models 29
7.1 The LQ regulation model 29
7.2 The Riccati recursion 31
7.3 Example: additive white noise 31
7.4 Continuous-time LQ regulation 32
7.5 Linear models as linearization 32
8 Controllability 33
8.1 Disturbances 33
8.2 Tracking 33
8.3 Controllability 34
8.4 Controllability in continuous-time 36
8.5 Example: broom balancing 36
9 Infinite Horizon Limits 37
9.1 Example: satellite in a plane orbit 37
9.2 Stabilizability 37
9.3 Stability in continuous-time 38
9.4 Example: pendulum 38
9.5 Infinite-horizon LQ regulation 39
10 Observability 41
10.1 Observability 41
10.2 Observability in continuous-time 43
10.3 Examples 43
10.4 Imperfect state observation with noise 44
11 Kalman Filtering and Certainty Equivalence 45
11.1 Preliminaries 45
11.2 The Kalman filter 46
11.3 Certainty equivalence 47
11.4 Example: inertialess rocket with noisy position sensing 48
PART III. CONTINUOUS-TIME MODELS 49
12 Dynamic Programming in Continuous Time 49
12.1 The optimality equation 49
12.2 Example: LQ regulation 50
12.3 Example: estate planning 50
12.4 Example: harvesting 51
13 Pontryagin's Maximum Principle 53
13.1 Heuristic derivation 53
13.2 Example: skateboarding 54
13.3 Example: Bush problem 55
13.4 Connection with Lagrangian multipliers 56
14 Applications of the Maximum Principle 57
14.1 Problems with terminal conditions 57
14.2 Example: monopolist 58
14.3 Example: insects as optimizers 58
14.4 Example: rocket thrust optimization 59
15 Controlled Markov Jump Processes 61
15.1 The dynamic programming equation 61
15.2 The case of a discrete state space 61
15.3 Uniformization in the infinite horizon case 62
15.4 Example: admission control at a queue 63
16 Controlled Diffusion Processes 65
16.1 Diffusion processes and controlled diffusion processes 65
16.2 Example: LQG in continuous time 66
16.3 Example: passage to a stopping set 66
16.4 Addendum on PMP: control subject to constraints 67