Data-driven nonlinear near-optimal regulation based on iterative neural dynamic programming


Bibliographic Details
Published in	Acta Automatica Sinica (自动化学报), Vol. 43, no. 3, pp. 366-375
Main Author	WANG Ding, MU Chao-Xu, LIU De-Rong
Format	Journal Article
Language	Chinese
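Published	2017. Author affiliations: (1) The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190; (2) Tianjin Key Laboratory of Process Measurement and Control, School of Electrical and Information Engineering, Tianjin University, Tianjin 300072; (3) School of Automation, University of Science and Technology Beijing, Beijing 100083
Subjects	adaptive dynamic programming; data-driven control; iterative neural dynamic programming; neural networks; nonlinear near-optimal regulation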
ISSN	0254-4156 1874-1029
DOI	10.16383/j.aas.2017.c160272

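Summary:	Using the idea of data-driven control, an iterative neural dynamic programming approach is established for designing near-optimal regulators of discrete-time nonlinear systems. An iterative adaptive dynamic programming algorithm for general discrete-time nonlinear systems is proposed, and its convergence and optimality are proved. By constructing three neural networks, a globalized dual heuristic programming technique is presented together with its detailed implementation, where the action network is trained under the framework of neural dynamic programming. This novel architecture can approximate the cost function and its derivative while adaptively learning the near-optimal control law without depending on the system dynamics. Notably, by relaxing the requirement for the control matrix or its neural network representation, it markedly improves on existing results for iterative adaptive dynamic programming algorithms and can promote the development of data-based optimization and control design for complex nonlinear systems. Two simulation experiments verify the effectiveness of the proposed data-driven optimal regulation method.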
Bibliography:	WANG Ding (1,2), MU Chao-Xu (2), LIU De-Rong (3). Keywords: adaptive dynamic programming; data-driven control; iterative neural dynamic programming; neural networks; nonlinear near-optimal regulation. Abstract: An iterative neural dynamic programming approach is established to design the near-optimal regulator of discrete-time nonlinear systems using the data-driven control formulation. An iterative adaptive dynamic programming algorithm for discrete-time general nonlinear systems is developed and proved to guarantee convergence and optimality. Then, a globalized dual heuristic programming technique is developed with detailed implementation by constructing three neural networks, where the action network is trained under the framework of neural dynamic programming. This novel architecture can approximate the cost function with its derivative and, simultaneously, adaptively learn the near-optimal control law without depending on the system dynamics. It is significant to observe that it greatly improves the existing results of itera
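
Illustrative note: the summary refers to an iterative adaptive dynamic programming algorithm whose cost iterates converge to the optimum. The sketch below is only a rough tabular illustration of that kind of iteration, V_{i+1}(x) = min_u [U(x, u) + V_i(f(x, u))] starting from V_0 = 0. It is not the paper's method: the paper uses three neural networks and does not rely on the system dynamics, whereas this sketch assumes a known plant. The plant `f`, the quadratic `utility`, the grids, and the linear interpolation are all hypothetical choices made for the example.

```python
import numpy as np


def f(x, u):
    # Hypothetical scalar plant (an assumption, not taken from the paper).
    # Chosen so that states in [-1, 1] stay in [-1, 1] for inputs in [-1, 1].
    return 0.9 * np.sin(x) + 0.1 * u


def utility(x, u):
    # Assumed quadratic stage cost U(x, u) = x^2 + u^2.
    return x**2 + u**2


def value_iteration(xs, us, tol=1e-8, max_iters=500):
    """Tabular iteration V_{i+1}(x) = min_u [U(x, u) + V_i(f(x, u))], V_0 = 0."""
    X, U = np.meshgrid(xs, us, indexing="ij")  # every (state, input) pair
    stage = utility(X, U)
    nxt = f(X, U)

    def q_values(V):
        # Evaluate V at the successor states by linear interpolation on the grid.
        v_next = np.interp(nxt.ravel(), xs, V).reshape(nxt.shape)
        return stage + v_next

    V = np.zeros_like(xs)
    history = [V.copy()]
    for _ in range(max_iters):
        V_new = q_values(V).min(axis=1)
        history.append(V_new.copy())
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy control law with respect to the final cost iterate.
    policy = us[np.argmin(q_values(V), axis=1)]
    return V, policy, history
```

A call such as `value_iteration(np.linspace(-1, 1, 41), np.linspace(-1, 1, 41))` returns the final cost iterate, the greedy control on the state grid, and the list of iterates. Because the stage cost is nonnegative and V_0 = 0, the iterates are nondecreasing, which mirrors the monotone convergence that iterative adaptive dynamic programming analyses typically establish.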