Data-Driven Control of Type 2 Diabetes Progression via Personalized Physical Activity / Lops, Giada; De Paola, Pierluigi Francesco; Racanelli, Vito Andrea; Manfredi, Gioacchino; De Cicco, Luca; Mascolo, Saverio. (2026). IFAC World Congress 2026, Busan, South Korea, 23-28 August 2026.
Data-Driven Control of Type 2 Diabetes Progression via Personalized Physical Activity
Giada Lops; Pierluigi Francesco De Paola; Vito Andrea Racanelli; Gioacchino Manfredi; Luca De Cicco; Saverio Mascolo
2026
Abstract
This work investigates long-horizon regulation of Type 2 Diabetes progression through daily physical activity, using a data-driven controller based on Proximal Policy Optimization. A five-state physiological model (comprising glucose, insulin, β-cell mass, insulin sensitivity, and IL-6 dynamics) is embedded in a custom environment that enables closed-loop simulations over a two-year horizon. The framework introduces realistic variability through parameter and initial-condition perturbations (±5%), circadian glucose oscillations (±20 mg/dL), and mid-episode degradation of the insulin-sensitivity target, providing a physiologically consistent and challenging benchmark. The Proximal Policy Optimization agent learns adaptive daily exercise policies that preserve glucose homeostasis and remain robust to uncertainty. Across a 200-patient evaluation cohort, the controller achieves a 66% success rate in keeping final glucose levels below 126 mg/dL, demonstrating the feasibility of reinforcement learning for long-term, personalized physical activity regulation and its potential to support model-based digital therapeutics in Type 2 Diabetes management.
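The variability sources and the success criterion named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the uniform sampling of the ±5% perturbation, the sinusoidal shape and 24-hour period of the circadian term, and the names `perturb`, `circadian_offset`, and `success_rate` are assumptions introduced here for illustration. Only the ±5% range, the ±20 mg/dL amplitude, and the 126 mg/dL threshold come from the abstract.

```python
import math
import random

GLUCOSE_THRESHOLD = 126.0  # mg/dL, success threshold stated in the abstract


def perturb(nominal, rel=0.05, rng=random):
    """Scale each nominal value by a factor in [1 - rel, 1 + rel].

    Uniform sampling is an assumption; the abstract only gives the +/-5% range.
    """
    return {name: value * (1.0 + rng.uniform(-rel, rel))
            for name, value in nominal.items()}


def circadian_offset(t_hours, amplitude=20.0, period=24.0):
    """Glucose offset in mg/dL at time t_hours.

    The sinusoidal shape and 24 h period are assumptions; the abstract
    only gives the +/-20 mg/dL amplitude.
    """
    return amplitude * math.sin(2.0 * math.pi * t_hours / period)


def success_rate(final_glucose, threshold=GLUCOSE_THRESHOLD):
    """Fraction of patients whose final glucose is strictly below threshold."""
    return sum(g < threshold for g in final_glucose) / len(final_glucose)
```

Under this definition, a 66% success rate over a 200-patient cohort corresponds to 132 patients finishing below 126 mg/dL.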

