Abstract
To address the complex dynamics and control problems involved in space debris removal, a trajectory planning technique for a space robotic arm is proposed, based on the twin delayed deep deterministic policy gradient (TD3) algorithm from deep reinforcement learning. The technique achieves end-to-end control comparable to a human hand grasping an object. Trajectory planning for space debris capture by a floating-base space robotic arm is implemented on a task simulation platform built on MuJoCo, using trajectory planners, trajectory trackers, and joint and end-effector control strategies formulated with seven different weighted reward functions. The method thereby facilitates in-orbit servicing and maintenance missions for spacecraft. The experimental results show that the capture strategy maintains a capture success rate above 99% and that, when the stability of the floating base is taken into account, debris capture is mostly completed in three stages by continuously modifying the trajectory.