Safe Deep Reinforcement Learning Control of Type 1 Diabetes / Baldisseri, Federico; Lops, Giada; Mahdy Helmy Atanasious, Mohab; Menegatti, Danilo; Becchetti, Valentina; Delli Priscoli, Francesco; Mascolo, Saverio; Racanelli, Vito Andrea; Wrona, Andrea. - (2026). (European Control Conference (ECC) 2026, Reykjavík, Iceland, July 7-10, 2026).
Safe Deep Reinforcement Learning Control of Type 1 Diabetes
Giada Lops; Francesco Delli Priscoli; Saverio Mascolo; Vito Andrea Racanelli
2026
Abstract
Achieving safe and autonomous glycemic regulation for Type-1 Diabetes care is an urgent challenge. Although Deep Reinforcement Learning (DRL) has emerged as a promising paradigm, practical deployment is hindered by the risk of uncontrolled hyperglycemia or hypoglycemia. This work adapts two safe DRL approaches to automated insulin delivery. The first formulates a Lagrangian constrained Markov decision process and solves it with a primal–dual scheme with adaptive multipliers, delivering constraint satisfaction in expectation; the second adopts a Barrier–Lyapunov Actor–Critic framework that embeds discrete-time control-barrier conditions and Lyapunov decrease into the learning updates, ensuring stepwise feasibility and promoting stability by design. Simulations under randomized meal timing and size, benchmarked against a standard clinical practice protocol and an unconstrained DRL baseline, indicate improved time-in-range with reduced hypoglycemic events.
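The two safety mechanisms named in the abstract can be illustrated with a minimal sketch. The first function performs the projected dual-ascent step on the Lagrange multiplier used in generic Lagrangian constrained MDPs; the second checks the standard discrete-time control-barrier condition. Both are generic textbook forms, not the paper's implementation, and all names (`dual_step`, `cost_budget`, `cbf_holds`) are illustrative assumptions.

```python
def dual_step(lmbda: float, est_constraint_cost: float,
              cost_budget: float, lr: float = 0.05) -> float:
    """Projected gradient-ascent step on the Lagrange multiplier.

    For the penalized objective L(pi, lmbda) = J_r(pi) - lmbda * (J_c(pi) - d),
    ascent on lmbda tightens the penalty whenever the estimated constraint
    cost J_c exceeds its budget d; the max(0, .) projection keeps lmbda >= 0.
    """
    return max(0.0, lmbda + lr * (est_constraint_cost - cost_budget))


def cbf_holds(h_next: float, h_curr: float, gamma: float = 0.1) -> bool:
    """Discrete-time control-barrier condition h(x_{k+1}) >= (1 - gamma) h(x_k).

    If it holds at every step with 0 < gamma <= 1 and h(x_0) >= 0, the safe
    set {x : h(x) >= 0} remains forward invariant.
    """
    return h_next >= (1.0 - gamma) * h_curr


# Toy run: a sustained constraint violation (cost above budget) drives the
# multiplier up, increasing the safety penalty seen by the policy gradient.
lmbda = 0.0
for _ in range(10):
    lmbda = dual_step(lmbda, est_constraint_cost=1.5, cost_budget=1.0)
```

In a full training loop, `est_constraint_cost` would be an empirical estimate of the discounted constraint return over recent rollouts, and the barrier check would gate or penalize actions whose predicted next state violates it.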

