Hamilton–Jacobi–Bellman equation - dreamjdn - ChinaUnix Blog

Category: Delphi

2012-02-16 10:50:59

From Wikipedia, the free encyclopedia

The Hamilton–Jacobi–Bellman (HJB) equation is a partial differential equation which is central to optimal control theory. Classical variational problems can be solved using this method. The HJB method can be generalized to stochastic systems as well.

The solution of the HJB equation is the 'value function', which gives the optimal cost-to-go for a given dynamical system with an associated cost function. The solution is open loop, but it also permits the solution of the closed loop problem.

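The equation is a result of the theory of dynamic programming, which was pioneered in the 1950s by Richard Bellman and coworkers. The corresponding discrete-time equation is usually referred to as the Bellman equation. In continuous time, the result can be seen as an extension of earlier work in classical physics on the Hamilton–Jacobi equation by William Rowan Hamilton and Carl Gustav Jacob Jacobi.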

Optimal control problems

Consider the following problem in deterministic optimal control over the time interval from 0 to T:

$\min \left\{ \int_0^T C[x(t),u(t)]\,dt + D[x(T)] \right\}$

where C[] is the scalar cost rate function and D[] is a function that gives the economic value or utility at the final state, x(t) is the system state vector, x(0) is assumed given, and u(t) for 0 ≤ t ≤ T is the control vector that we are trying to find.

The system must also be subject to

$\dot{x}(t)=F[x(t),u(t)] \,$

where F[] gives the vector determining physical evolution of the state vector over time.

The partial differential equation

For this simple system, the Hamilton–Jacobi–Bellman partial differential equation is

$\dot{V}(x,t) + \min_u \left\{ \nabla V(x,t) \cdot F(x, u) + C(x,u) \right\} = 0$

subject to the terminal condition

$V(x,T) = D(x),\,$

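where $a \cdot b$ denotes the dot product of the vectors a and b, $\nabla$ is the gradient operator with respect to x, and $\dot{V}$ denotes the partial derivative of V with respect to t.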

The unknown scalar V(x,t) in the above PDE is the Bellman 'value function', which represents the cost incurred from starting in state x at time t and controlling the system optimally from then until time T.
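
To make the equation concrete, here is a small worked example. The problem is an illustrative assumption, not taken from the text: a scalar state with $F(x,u) = u$, $C(x,u) = x^2 + u^2$ and $D(x) = 0$. Writing $V_x$ for the derivative of V with respect to x, the HJB equation becomes

$\dot{V} + \min_u \left\{ V_x \, u + x^2 + u^2 \right\} = 0, \qquad V(x,T) = 0.$

The minimum is attained at $u^* = -V_x / 2$, which gives

$\dot{V} + x^2 - \tfrac{1}{4} V_x^2 = 0.$

Trying $V(x,t) = p(t)\, x^2$ reduces the PDE to an ordinary differential equation for p:

$\dot{p} = p^2 - 1, \qquad p(T) = 0,$

whose solution is $p(t) = \tanh(T - t)$. Under these assumptions the value function is $V(x,t) = \tanh(T-t)\, x^2$ and the optimal feedback control is $u^*(x,t) = -\tanh(T-t)\, x$.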

Deriving the equation

Intuitively, the HJB equation can be "derived" as follows. If V(x(t),t) is the optimal cost-to-go function (also called the 'value function'), then by Richard Bellman's principle of optimality, going from time t to t + dt, we have

$V(x(t), t) = \min_u \left\{ C(x(t), u) \, dt + V(x(t+dt), t+dt) \right\}.$

Note that the Taylor expansion of the last term is

$V(x(t+dt), t+dt) = V(x(t), t) + \dot{V}(x, t) \, dt + \nabla V(x, t) \cdot \dot{x} \, dt + o(dt),$

where o(dt) denotes the terms in the Taylor expansion of higher order than one. Then if we cancel V(x(t), t) on both sides, divide by dt, substitute $\dot{x} = F(x, u)$, and take the limit as dt approaches zero, we obtain the HJB equation defined above.
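
Spelled out, these steps read as follows (a sketch that assumes V is smooth enough for the expansion to hold). Substituting the expansion into the principle-of-optimality relation, with $\dot{x} = F(x, u)$, gives

$V(x, t) = \min_u \left\{ C(x, u) \, dt + V(x, t) + \dot{V}(x, t) \, dt + \nabla V(x, t) \cdot F(x, u) \, dt + o(dt) \right\}.$

The terms $V(x,t)$ and $\dot{V}(x,t)\,dt$ do not depend on u and can be moved outside the minimum. Cancelling $V(x,t)$ and dividing by dt leaves

$0 = \dot{V}(x, t) + \min_u \left\{ C(x, u) + \nabla V(x, t) \cdot F(x, u) \right\} + \frac{o(dt)}{dt},$

and the last term vanishes as dt approaches zero.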

Solving the equation

The HJB equation is usually solved backwards in time, starting from t = T and ending at t = 0.

The HJB equation is a sufficient condition for an optimum. If we can solve for V, then we can find from it a control u that achieves the minimum cost.
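
The backward-in-time idea can be sketched numerically. The example below is only an illustration under stated assumptions: the scalar problem $F(x,u) = u$, $C(x,u) = x^2 + u^2$, $D(x) = 0$ is hypothetical (it does not come from the text), and the scheme is a simple grid-based dynamic programming step with linear interpolation rather than a production PDE solver. The function name `solve_hjb_backward` and all grid parameters are likewise made up for the example. For this problem the exact value function is $V(x,t) = \tanh(T-t)\,x^2$, which can be checked by substitution into the HJB equation, so the numerical result has something to be compared against.

```python
import numpy as np


def solve_hjb_backward(T=1.0, n_steps=100, x_max=2.0, n_x=401, u_max=3.0, n_u=121):
    """Backward dynamic programming for the assumed problem
    x' = u,  C(x, u) = x^2 + u^2,  D(x) = 0.

    Returns the state grid, the approximate value function at t = 0,
    and the minimizing control on the grid at t = 0.
    """
    dt = T / n_steps
    xs = np.linspace(-x_max, x_max, n_x)  # state grid
    us = np.linspace(-u_max, u_max, n_u)  # candidate controls

    V = np.zeros(n_x)  # terminal condition V(x, T) = D(x) = 0

    # Every (state, control) pair: running cost and Euler step of x' = F(x, u) = u.
    X, U = np.meshgrid(xs, us, indexing="ij")
    running_cost = (X**2 + U**2) * dt
    x_next = X + U * dt

    best_u = np.zeros(n_x)
    for _ in range(n_steps):  # march from t = T down to t = 0
        # Cost-to-go at the next time level, interpolated at the successor states.
        # np.interp holds the boundary value outside the grid.
        V_next = np.interp(x_next.ravel(), xs, V).reshape(x_next.shape)
        total = running_cost + V_next
        best = np.argmin(total, axis=1)  # discrete version of min over u
        V = total[np.arange(n_x), best]
        best_u = us[best]
    return xs, V, best_u


xs, V0, u0 = solve_hjb_backward()
i = int(np.argmin(np.abs(xs - 1.0)))  # grid point closest to x = 1
print("numerical V(1, 0):", V0[i])
print("exact     V(1, 0):", np.tanh(1.0))
print("numerical u*(1, 0):", u0[i], " exact:", -np.tanh(1.0))
```

The array `u0` shows the second point made above: once V is known, the minimizing u at each state gives a feedback control. The grid sizes are a design choice that trades accuracy for speed; refining `n_steps`, `n_x` and `n_u` together should bring the result closer to the exact value.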

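In the general case, the HJB equation does not have a classical (smooth) solution. Several notions of generalized solutions have been developed to cover such situations, including viscosity solutions (Pierre-Louis Lions and Michael Crandall), minimax solutions (Andrei Izmailovich Subbotin), and others.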

Extension to stochastic problems

The idea of solving a control problem by applying Bellman's principle of optimality and then working out an optimizing strategy backwards in time can be generalized to stochastic control problems. Consider a problem similar to the one above, but with a stochastic state process and an expected cost to be minimized.

A solution of the resulting stochastic HJB equation does not necessarily solve the primal problem; it is only a candidate, and a further verification argument is required. This technique is widely used in financial mathematics to determine optimal investment strategies in the market.
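
As a sketch of what the stochastic equation can look like, assume (this is not stated in the text above) that the state follows an Itô diffusion with drift F and a hypothetical diffusion coefficient $\sigma$, driven by a Brownian motion W, and that the expected cost is minimized:

$dX_t = F(X_t, u_t) \, dt + \sigma(X_t, u_t) \, dW_t, \qquad \min_u \mathbb{E} \left\{ \int_0^T C(X_t, u_t) \, dt + D(X_T) \right\}.$

Repeating the dynamic programming argument with Itô's formula in place of the ordinary Taylor expansion adds a second-order term to the deterministic equation:

$\dot{V}(x,t) + \min_u \left\{ \nabla V(x,t) \cdot F(x,u) + \tfrac{1}{2} \operatorname{tr}\left( \sigma(x,u) \sigma(x,u)^{\mathsf{T}} \nabla^2 V(x,t) \right) + C(x,u) \right\} = 0, \qquad V(x,T) = D(x),$

where $\nabla^2 V$ is the Hessian of V with respect to x. With $\sigma = 0$ this reduces to the deterministic HJB equation.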

See also

Bellman equation, discrete-time counterpart of the Hamilton–Jacobi–Bellman equation
Pontryagin's minimum principle, necessary but not sufficient condition for optimum, by minimizing a Hamiltonian

Notes

R. E. Bellman. Dynamic Programming. Princeton, NJ, 1957.
Dimitri P. Bertsekas. Dynamic Programming and Optimal Control. Athena Scientific, 2005.

References

F. Gozzi, S. S. Sritharan and A. Swiech, "Bellman equation for the optimal feedback control of stochastic Navier–Stokes equations", Communications on Pure and Applied Mathematics, 58(5), pp. 671–700 (2005).
F. Gozzi, S. S. Sritharan and A. Swiech, "Viscosity solutions of dynamic programming equations for optimal control of Navier–Stokes equations", Archive for Rational Mechanics and Analysis, 163(4), pp. 295–327 (2002).

Further reading

Bertsekas, Dimitri P. (2005). Dynamic Programming and Optimal Control. Athena Scientific.
