Optimization and Control IIB

Table of Contents

(This is last year's table of contents and will be updated soon.)

PART I. DYNAMIC PROGRAMMING 1

1 Dynamic Programming: The Optimality Equation 1
  1.1 Control as optimization over time 1
  1.2 The principle of optimality 1
  1.3 Example: the shortest path problem 2
  1.4 The optimality equation 2
  1.5 Markov decision processes 4
2 Some Examples of Dynamic Programming 6
  2.1 Example: managing spending and savings 6
  2.2 Example: minimizing flow time on a single machine 7
    An interchange argument 7
  2.3 Example: optimal exercise of a stock option 8
  2.4 Example: accepting the best offer 9
3 Dynamic Programming over the Infinite Horizon 11
  3.1 Discounted costs 11
  3.2 The infinite-horizon case 11
  3.3 The optimality equation in the infinite-horizon case 12
  3.4 Example: selling an asset 13
4 Positive Programming 15
  4.1 Example: possible lack of an optimal policy 15
  4.2 Characterisation of the optimal policy 15
  4.3 Example: optimal gambling 16
  4.4 Value iteration 16
  4.5 Example: pharmaceutical trials 18
5 Negative Programming 19
  5.1 Stationary policies 19
  5.2 Characterisation of the optimal policy 19
  5.3 Optimal stopping problems 20
  5.4 Example: optimal parking 20
  5.5 Optimal stopping over the infinite horizon 21
6 Average-cost Programming 23
  6.1 Average-cost optimization 23
  6.2 Example: admission control at a queue 24
  6.3 Value iteration bounds 25
  6.4 Policy improvement 25

PART II. LQG (linear/quadratic/Gaussian) SYSTEMS 29

7 LQ Models 29
  7.1 The LQ regulation model 29
  7.2 The Riccati recursion 31
  7.3 Example: additive white noise 31
  7.4 Continuous-time LQ regulation 32
  7.5 Linear models as linearization 32
8 Controllability 33
  8.1 Disturbances 33
  8.2 Tracking 33
  8.3 Controllability 34
  8.4 Controllability in continuous-time 36
  8.5 Example: broom balancing 36
9 Infinite Horizon Limits 37
  9.1 Example: satellite in a plane orbit 37
  9.2 Stabilizability 37
  9.3 Stability in continuous-time 38
  9.4 Example: pendulum 38
  9.5 Infinite-horizon LQ regulation 39
10 Observability 41
  10.1 Observability 41
  10.2 Observability in continuous-time 43
  10.3 Examples 43
  10.4 Imperfect state observation with noise 44
11 Kalman Filtering and Certainty Equivalence 45
  11.1 Preliminaries 45
  11.2 The Kalman filter 46
  11.3 Certainty equivalence 47
  11.4 Example: inertialess rocket with noisy position sensing 48

PART III. CONTINUOUS-TIME MODELS 49

12 Dynamic Programming in Continuous Time 49
  12.1 The optimality equation 49
  12.2 Example: LQ regulation 50
  12.3 Example: estate planning 50
  12.4 Example: harvesting 51
13 Pontryagin's Maximum Principle 53
  13.1 Heuristic derivation 53
  13.2 Example: skateboarding 54
  13.3 Example: Bush problem 55
  13.4 Connection with Lagrangian multipliers 56
14 Applications of the Maximum Principle 57
  14.1 Problems with terminal conditions 57
  14.2 Example: monopolist 58
  14.3 Example: insects as optimizers 58
  14.4 Example: rocket thrust optimization 59
15 Controlled Markov Jump Processes 61
  15.1 The dynamic programming equation 61
  15.2 The case of a discrete state space 61
  15.3 Uniformization in the infinite horizon case 62
  15.4 Example: admission control at a queue 63
16 Controlled Diffusion Processes 65
  16.1 Diffusion processes and controlled diffusion processes 65
  16.2 Example: LQG in continuous time 66
  16.3 Example: passage to a stopping set 66
  16.4 Addendum on PMP: control subject to constraints 67