POLITECNICO DI BARI - Research Products Catalogue

A switching control strategy for policy selection in stochastic Dynamic Programming problems

Tipaldi, Massimo; Iervolino, Raffaele; Massenio, Paolo Roberto; Naso, David

2024

Abstract

This paper presents a switching control strategy as a criterion for policy selection in stochastic Dynamic Programming problems over an infinite time horizon. In particular, the Bellman operator, applied iteratively to solve such problems, is generalized to the case of stochastic policies, and formulated as a discrete-time switched affine system. Then, a Lyapunov-based policy selection strategy is designed to ensure the practical convergence of the resulting closed-loop system trajectories towards an appropriately chosen reference value function. This way, it is possible to verify how the chosen reference value function can be approached by using a stabilizing switching signal, the latter defined on a given finite set of stationary stochastic policies. Finally, the presented method is applied to the Value Iteration algorithm, and an illustrative example of a recycling robot is provided to demonstrate its effectiveness in terms of convergence.
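As a rough illustration of the affine structure mentioned in the abstract, the sketch below writes the Bellman operator of a stationary stochastic policy as V -> r + GAMMA * P V on a two-state recycling robot and iterates it. This is not the authors' implementation: the model is assumed to resemble the common textbook version of the recycling robot, all numeric values and names (`ALPHA`, `BETA`, `MODEL`, `POLICY`, `OTHER`, the periodic switching signal) are hypothetical, and the paper's Lyapunov-based selection rule is not reproduced here.

```python
# States: 0 = high battery, 1 = low battery.
# All numeric values below are assumptions chosen only for illustration.
ALPHA, BETA = 0.9, 0.4        # prob. of staying in the same state while searching
R_SEARCH, R_WAIT = 2.0, 1.0   # expected rewards for searching / waiting
R_DEPLETED = -3.0             # penalty when the battery runs out while searching
GAMMA = 0.9                   # discount factor

# MODEL[state][action] = list of (probability, next_state, reward).
MODEL = {
    0: {"search": [(ALPHA, 0, R_SEARCH), (1 - ALPHA, 1, R_SEARCH)],
        "wait": [(1.0, 0, R_WAIT)]},
    1: {"search": [(BETA, 1, R_SEARCH), (1 - BETA, 0, R_DEPLETED)],
        "wait": [(1.0, 1, R_WAIT)],
        "recharge": [(1.0, 0, 0.0)]},
}

def affine_form(policy):
    """Return (P, r) so that the policy's Bellman operator is V -> r + GAMMA * P V."""
    P = [[0.0, 0.0], [0.0, 0.0]]
    r = [0.0, 0.0]
    for s, action_probs in policy.items():
        for a, pi_sa in action_probs.items():
            for prob, s_next, reward in MODEL[s][a]:
                P[s][s_next] += pi_sa * prob
                r[s] += pi_sa * prob * reward
    return P, r

def bellman_step(V, P, r):
    """Apply one affine update V_next = r + GAMMA * P V."""
    return [r[s] + GAMMA * sum(P[s][j] * V[j] for j in range(2)) for s in range(2)]

def fixed_point(P, r):
    """Solve (I - GAMMA * P) V = r for the 2x2 case with Cramer's rule."""
    a, b = 1 - GAMMA * P[0][0], -GAMMA * P[0][1]
    c, d = -GAMMA * P[1][0], 1 - GAMMA * P[1][1]
    det = a * d - b * c
    return [(r[0] * d - b * r[1]) / det, (a * r[1] - c * r[0]) / det]

# Two hypothetical stationary stochastic policies (action probabilities per state).
POLICY = {0: {"search": 0.8, "wait": 0.2},
          1: {"search": 0.1, "wait": 0.3, "recharge": 0.6}}
OTHER = {0: {"search": 0.5, "wait": 0.5},
         1: {"wait": 0.5, "recharge": 0.5}}

# Fixed policy: iterating the affine map approaches its unique fixed point.
P_pi, r_pi = affine_form(POLICY)
V = [0.0, 0.0]
for _ in range(500):
    V = bellman_step(V, P_pi, r_pi)
V_star = fixed_point(P_pi, r_pi)

# Switched system: the active (P, r) pair is picked by a switching signal.
# A naive periodic signal stands in for the paper's selection strategy.
modes = [affine_form(POLICY), affine_form(OTHER)]
W = [0.0, 0.0]
for k in range(200):
    P_k, r_k = modes[k % 2]
    W = bellman_step(W, P_k, r_k)
```

With a single fixed policy the iterate `V` agrees with `V_star` up to floating-point error. Under switching there is in general no single fixed point, which is why a selection rule that steers the trajectory toward a chosen reference is of interest.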

Year: 2024
Journal: AUTOMATICA
DOI: https://dx.doi.org/10.1016/j.automatica.2024.111884
Citation: A switching control strategy for policy selection in stochastic Dynamic Programming problems / Tipaldi, Massimo; Iervolino, Raffaele; Massenio, Paolo Roberto; Naso, David. - In: AUTOMATICA. - ISSN 0005-1098. - 171:(2024). [10.1016/j.automatica.2024.111884]
Appears in types: 1.1 Journal article

Files in this product:

There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/274483
