Bridging Clinical Knowledge and Reinforcement Learning in Automated Insulin Delivery: An LLM-in-the-Loop Approach / Lops, Giada; Ramdan, Taha; Racanelli, Vito Andrea; De Cicco, Luca; Mascolo, Saverio. - (2026). (European Control Conference (ECC) 2026, Reykjavík, Iceland, July 7-10, 2026).
Bridging Clinical Knowledge and Reinforcement Learning in Automated Insulin Delivery: An LLM-in-the-Loop Approach
Giada Lops; Vito Andrea Racanelli; Luca De Cicco; Saverio Mascolo
2026
Abstract
Automated insulin delivery (AID) requires controllers that are both adaptive and safe. This study proposes a hybrid control framework that combines a reinforcement learning (RL) agent with a language-guided advisory layer in the SimGlucose simulator of the FDA-approved UVA/Padova type 1 diabetes model. A Proximal Policy Optimization (PPO) agent learns insulin dosing via an asymmetric, safety-weighted reward, while a fine-tuned Falcon-RW-1B model provides guideline-consistent recommendations. A supervisory fusion rule merges both outputs according to policy uncertainty and medical constraints (suspension < 90 mg/dL, recovery cap > 70 mg/dL, rate limit 0.03 U/min). Across ten virtual adults and twenty stochastic meal scenarios, the hybrid RL+LLM controller achieved a time-in-range of 86% ± 7.3% and reduced exposure to hypoglycemia compared with the considered baselines, while maintaining total insulin delivery within a limited deviation from reference levels. A quasi-counterfactual analysis indicated strong alignment between rule activations and action changes (fidelity ≈ 1.0, validity ≥ 90%). These results suggest that hybrid RL-LLM architectures are a promising direction for safe and adaptive closed-loop insulin control.
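The medical constraints quoted in the abstract can be illustrated with a minimal sketch of a supervisory safety filter. This is an assumption-laden illustration, not the authors' implementation: the function name, signature, and the exact interpretation of the "recovery cap" are hypothetical; only the three numeric thresholds (suspension below 90 mg/dL, recovery reference of 70 mg/dL, rate limit of 0.03 U/min) come from the abstract.

```python
# Hypothetical sketch of a supervisory safety layer enforcing the
# constraints stated in the abstract. Names and structure are
# illustrative assumptions, not the paper's actual fusion rule.

def safety_filter(glucose_mg_dl: float,
                  proposed_rate_u_min: float,
                  prev_rate_u_min: float,
                  suspend_below: float = 90.0,   # suspension threshold (mg/dL)
                  max_delta: float = 0.03) -> float:  # rate limit (U/min)
    """Clip a proposed basal insulin rate to satisfy the safety rules."""
    # Rate limit: bound the change relative to the previous basal rate.
    rate = max(prev_rate_u_min - max_delta,
               min(proposed_rate_u_min, prev_rate_u_min + max_delta))
    # Suspension: deliver no insulin while glucose is below the threshold.
    # (The abstract's "recovery cap > 70 mg/dL" would further gate how
    # dosing resumes after a low; that logic is omitted here.)
    if glucose_mg_dl < suspend_below:
        rate = 0.0
    # Insulin rates cannot be negative.
    return max(rate, 0.0)
```

Used as a wrapper, such a filter would sit between the fused RL+LLM action and the pump command, so the learned policy can never violate the hard clinical limits regardless of its output.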

