Publication: Online Reinforcement Learning Control of Nonlinear Dynamic Systems: A State-action Value Function Based Solution
Date
2023
Publisher
Elsevier B.V.
Abstract
In this paper, we present an online reinforcement learning-based solution to the optimal control problem for continuous-time nonlinear input-affine systems. The proposed approach includes a concurrent identifier that estimates the time derivatives of the system states at arbitrary points. The identifier is used to simulate the so-called Bellman error at unvisited points. The simulated errors, together with the errors obtained along the system trajectory, are used to estimate the state-action value function, which is then employed to derive the estimated optimal controller. The approach does not explicitly require the input dynamics, which are hard to separate from the drift dynamics in optimal regulation problems. In addition, the simulated Bellman errors relax the restrictive persistence of excitation condition that is needed for convergence in deterministic systems. A Lyapunov-based analysis is conducted to derive convergence conditions, and simulation studies demonstrate the effectiveness of the developed control scheme. © 2023 Elsevier B.V. All rights reserved.
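
The idea of simulating Bellman errors at unvisited points can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a toy scalar system x_dot = -x + u, a fixed zero-input policy, a quadratic running cost, and a state-value critic with basis [x^2, x^4] rather than the state-action value function used in the paper. The function xdot_hat is a hypothetical stand-in for the identifier's state-derivative estimate, and all other names are likewise illustrative.

```python
import numpy as np

def xdot_hat(x, u):
    # Hypothetical stand-in for the identifier's state-derivative estimate.
    # Assumed toy dynamics (not from the paper): x_dot = -x + u.
    return -x + u

def policy(x):
    # Fixed policy under evaluation (assumption: zero input everywhere).
    return 0.0 * x

def cost(x, u):
    # Assumed quadratic running cost r(x, u) = x^2 + u^2.
    return x**2 + u**2

def grad_basis(x):
    # Gradient of the assumed critic basis [x^2, x^4], shape (n, 2).
    return np.stack([2.0 * x, 4.0 * x**3], axis=-1)

def bellman_regressor(points):
    # Bellman error at each point: delta = phi @ w + r,
    # where phi = grad_basis(x) * xdot_hat(x, u) and r = cost(x, u).
    u = policy(points)
    phi = grad_basis(points) * xdot_hat(points, u)[..., None]
    return phi, cost(points, u)

def fit_critic(points):
    # Least-squares weights that minimize the summed squared Bellman errors.
    phi, r = bellman_regressor(points)
    w, *_ = np.linalg.lstsq(phi, -r, rcond=None)
    return w

# One point visited along the trajectory, plus extra unvisited sample points.
on_trajectory = np.array([1.5])
sampled = np.linspace(-2.0, 2.0, 9)

# A single visited point gives a rank-deficient regressor for two weights.
rank_single = np.linalg.matrix_rank(bellman_regressor(on_trajectory)[0])

# Adding simulated errors at unvisited points makes the fit well posed.
w = fit_critic(np.concatenate([on_trajectory, sampled]))
```

In this toy setting the regressor built from the single visited point has rank 1, so two critic weights cannot be identified from it alone. Adding the sampled points restores full rank without exciting the system, which is the role the abstract attributes to the simulated Bellman errors.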
