
Reinforcement Learning (RL) is a machine learning paradigm that trains a decision maker, or policy, through interaction with an environment. The power of RL lies in its ability to learn complex strategies without explicit human instruction, which can yield solutions that human designers overlook in domains ranging from robotics to scientific discovery. Despite these successes, applying RL to safety-critical control systems remains a significant challenge due to the fragility of black-box policies. Standard RL controllers are prone to "chattering," or indecisiveness: rapid, detrimental switching between decisions induced by small disturbances. They also lack formal closed-loop safety, stability, and robustness guarantees. Furthermore, existing discrete-time and continuous-time RL paradigms struggle to model hybrid systems, in which continuous state evolution is intertwined with instantaneous discrete updates. Consequently, standard RL approaches cannot be applied effectively to safety-critical hybrid dynamical systems, as they suffer from discretization artifacts, computational inefficiency, and a lack of such closed-loop guarantees.
To bridge the gap between hybrid control theory and RL, this research proposal is organized into four interconnected thrusts. Thrust 1 addresses the fragility of standard RL-based policies by designing RL algorithms that construct robust hybrid supervisors to eliminate chattering. Thrust 2 establishes the theoretical foundation of a native hybrid RL formulation: leveraging insights from discounted model predictive control (MPC), the hybrid RL problem is formulated with intrinsic closed-loop stability, safety, and robustness properties. Thrust 3 extends standard RL components to the hybrid domain, yielding RL algorithms capable of solving the hybrid RL problem defined in Thrust 2. Finally, Thrust 4 provides comprehensive empirical validation, confirming the robustness of the supervisors from Thrust 1 and demonstrating the advantages of the native hybrid RL formulation developed in Thrusts 2 and 3 over a standard RL formulation.
Host: Jan de Priester, Ph.D. Student, Electrical and Computer Engineering
Advisor: Ricardo Sanfelice
Zoom: https://ucsc.zoom.us/j/95229790206?pwd=ICevzd4QdEE7ZAlYALZIYbhU2bCU4W.1
Passcode: 981137