Ph.D. student Hannah Lehman and Dr. John Valasek of VSCL published the paper “Design, Selection, Evaluation of Reinforcement Learning Single Agents for Ground Target Tracking” in the Journal of Aerospace Information Systems.
Previous approaches for small fixed-wing unmanned air systems that carry strapdown rather than gimbaled cameras achieved satisfactory ground object tracking performance using both standard and deep reinforcement learning algorithms. However, these approaches impose significant restrictions and abstractions on the dynamics of the vehicle, such as constant airspeed and constant altitude, because the number of states and actions was necessarily limited. Thus, extensive tuning was required to obtain good tracking performance. The expansion from four state-action degrees of freedom to 15 enabled the agent to exploit previous reward functions, which produced novel yet undesirable emergent behavior. This paper investigates the causes of, and various potential solutions to, undesirable emergent behavior in the ground target tracking problem. A combination of changes to the environment, reward structure, action space simplification, command rate, and controller implementation provides insight into obtaining stable tracking results. Consideration is given to reward structure selection to mitigate undesirable emergent behavior. Results presented in the paper are from a simulated environment in which a single unmanned air system tracks a single, randomly moving ground object. They show that a soft actor-critic algorithm can produce feasible tracking trajectories without limiting the state space and action space, provided the environment is properly posed.
This publication is part of VSCL’s ongoing work in the area of Reinforcement Learning and Control. The early access version of the article can be viewed at https://arc.aiaa.org/journal/