Robust Neuro-Optimal Control of Underactuated Snake Robots with Experience Replay

Zhengcai Cao, Qing Xiao, Ran Huang, Mengchu Zhou

Research output: Contribution to journal › Article › peer-review

55 Scopus citations


This paper investigates path following for underactuated snake robots using approximate dynamic programming and neural networks (NNs). The lateral undulatory gait of the snake robot is stabilized on a virtual holonomic constraint manifold through a partial feedback linearizing control law. Using a dynamic compensator and a Line-of-Sight guidance law, the path-following problem is transformed into a regulation problem for a nonlinear system with uncertainties. This regulation problem is then solved by an infinite-horizon optimal control scheme that uses a single critic NN. A novel fluctuating learning algorithm is derived to approximate the associated cost function online and to relax the requirement for an initial stabilizing control. The approximate optimal control input is obtained by solving a modified Hamilton-Jacobi-Bellman equation. The conventional persistence of excitation condition is relaxed by using an experience replay technique. A Lyapunov analysis shows that the proposed control scheme keeps all states of the snake robot uniformly ultimately bounded and that the tracking error converges asymptotically to a residual set. Simulation results are presented to verify the effectiveness of the proposed method.
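To illustrate how experience replay can stand in for persistence of excitation, the following is a minimal sketch on a toy problem and is not the algorithm from the paper. It assumes a scalar system dx/dt = -x under a fixed policy, a running cost r = x^2, and a one-weight critic V(x) ≈ w·x^2, for which the exact weight is 0.5. The names `critic_step` and `run`, the learning rate, and the memory size are all hypothetical choices made for this example. The critic follows a normalized gradient descent on the Bellman residual, and the replayed variant adds the same gradient evaluated on a few stored samples.

```python
import numpy as np

def critic_step(w, sigma, r, memory, lr, dt):
    """One Euler step of normalized gradient descent on the Bellman residual.

    sigma is grad(phi)(x) * dx/dt, so the residual is e = w @ sigma + r.
    Stored (sigma_k, r_k) pairs in `memory` contribute the same gradient term.
    """
    def grad(s, cost):
        e = w @ s + cost                      # Bellman residual for this sample
        return s * e / (1.0 + s @ s) ** 2     # normalized gradient direction

    g = grad(sigma, r)
    for s_k, r_k in memory:                   # replayed samples (may be empty)
        g = g + grad(s_k, r_k)
    return w - lr * dt * g

def run(use_replay, steps=2000, dt=0.01, lr=5.0, mem_size=10):
    """Toy setup (assumption): dx/dt = -x, r = x^2, phi(x) = x^2, true w = 0.5."""
    x = 1.0
    w = np.zeros(1)
    memory = []
    for _ in range(steps):
        x_dot = -x
        sigma = np.array([2.0 * x * x_dot])   # d(x^2)/dx * dx/dt
        r = x * x
        w = critic_step(w, sigma, r, memory, lr, dt)
        if use_replay and len(memory) < mem_size:
            memory.append((sigma, r))         # keep early, informative samples
        x = x + dt * x_dot                    # Euler integration of the state
    return float(w[0])

if __name__ == "__main__":
    print("with replay:   ", run(True))
    print("without replay:", run(False))
```

In this toy setting the state decays to zero, so the live regressor `sigma` vanishes and the critic without replay stops learning before reaching 0.5, while the stored samples keep driving the replayed critic toward the exact weight.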

Original language: English (US)
Article number: 8110828
Pages (from-to): 208-217
Number of pages: 10
Journal: IEEE Transactions on Neural Networks and Learning Systems
Issue number: 1
State: Published - Jan 2018

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence


Keywords

  • Experience replay
  • Hamilton-Jacobi-Bellman (HJB) equation
  • neural networks (NNs)
  • snake robot

