Reinforcement Learning Tracking Control for Robotic Manipulator with Kernel-Based Dynamic Model

Yazhou Hu, Wenxue Wang*, Hao Liu, Lianqing Liu

*Corresponding author for this work

Research output: Contribution to journal; Article; peer-review

12 Citations (Scopus)

Abstract

Reinforcement learning (RL) is an efficient learning approach to solving control problems for a robot: the robot interacts with the environment to acquire the optimal control policy. However, RL faces many challenges in executing continuous control tasks. In this article, a kernel-based dynamic model for RL is proposed that does not require knowing or learning the dynamic model of the robotic manipulator. In addition, a new tuple is formed through kernel function sampling to describe a robotic RL control problem. In this algorithm, a reward function is defined according to the features of tracking control in order to speed up the learning process, and an RL tracking controller with a kernel-based transition dynamic model is then proposed. Finally, a critic system is presented to evaluate whether the policy is good or bad for the RL control tasks. The simulation results illustrate that the proposed method can fulfill the robotic tracking tasks effectively and achieve similar or even better tracking performance with much smaller force/torque inputs than other learning algorithms, demonstrating the effectiveness and efficiency of the proposed RL algorithm.
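
To make the two main ingredients of the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a quadratic tracking reward that penalizes both tracking error and force/torque magnitude, and it assumes the kernel-based transition model can be illustrated by a Gaussian-kernel (Nadaraya-Watson) average over sampled transitions. The function names, the weights `w_err` and `w_tau`, and the `bandwidth` parameter are hypothetical and do not come from the paper.

```python
import numpy as np


def tracking_reward(q, q_ref, tau, w_err=1.0, w_tau=0.01):
    """Hypothetical reward: negative weighted sum of squared tracking
    error and squared force/torque input (both assumed, not from the paper)."""
    err = np.asarray(q, dtype=float) - np.asarray(q_ref, dtype=float)
    tau = np.asarray(tau, dtype=float)
    return -(w_err * float(err @ err) + w_tau * float(tau @ tau))


def gaussian_kernel(x, centers, bandwidth=0.5):
    """Gaussian kernel weights between a query x and each sampled point."""
    d2 = np.sum((np.asarray(centers, dtype=float) - np.asarray(x, dtype=float)) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def kernel_predict_next_state(x, samples_x, samples_next, bandwidth=0.5):
    """Nadaraya-Watson estimate of the next state from sampled transitions.

    x            : query (state, action) vector
    samples_x    : array of sampled (state, action) vectors, one per row
    samples_next : array of the observed next states, one per row
    """
    w = gaussian_kernel(x, samples_x, bandwidth)
    w = w / (w.sum() + 1e-12)  # normalize; the small constant guards against division by zero
    return w @ np.asarray(samples_next, dtype=float)
```

In this sketch, a query lying midway between two sampled points receives equal kernel weights, so the predicted next state is the mean of the two observed next states. The small torque weight reflects the abstract's emphasis on achieving tracking with smaller force/torque inputs, but the actual reward used in the article may differ.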
Original language: English
Article number: 8890006
Pages (from-to): 3570-3578
Number of pages: 9
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 31
Issue number: 9
Publication status: Published - Sep 2020

Scopus Subject Areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

User-Defined Keywords

  • Kernel function
  • reinforcement learning (RL)
  • reward function
  • robotics tracking control
