POLITECNICO DI BARI - Catalog of Research Products

CFaiRLLM: Consumer Fairness Evaluation in Large-Language Model Recommender System / Deldjoo, Y., Noia, T.D.. - In: ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY. - ISSN 2157-6904. - 16:6(2025), pp. 142.1-142.31. [10.1145/3725853]

CFaiRLLM: Consumer Fairness Evaluation in Large-Language Model Recommender System

Deldjoo, Yashar; Di Noia, Tommaso

2025

Abstract

This work takes a critical stance on previous studies of fairness evaluation in large language model (LLM)-based recommender systems (RecLLMs), which have primarily assessed consumer fairness by comparing recommendation lists generated with and without sensitive user attributes. Such approaches implicitly treat any discrepancy in recommended items as bias, overlooking whether the change might stem from genuine personalization aligned with users’ true preferences. Moreover, these earlier studies typically address single sensitive attributes in isolation, neglecting the interplay of intersectional identities. In response to these shortcomings, we introduce CFaiRLLM, an enhanced evaluation framework that incorporates true preference alignment and also examines intersectional fairness by considering overlapping sensitive attributes. Additionally, CFaiRLLM introduces three user profile sampling strategies (random, top-rated, and recency-focused) to better understand how the profile fed to the LLM affects the outcome, given the inherent token limits of these systems. Because fairness depends on accurately understanding users’ tastes and preferences, these strategies provide a more realistic assessment of fairness within RecLLMs. To validate CFaiRLLM, we conducted extensive experiments on the MovieLens and LastFM datasets, applying various sampling strategies and sensitive attribute configurations. The evaluation metrics include both item similarity measures and true preference alignment considering both hits and ranking (Jaccard Similarity and PRAG), supporting a multi-faceted analysis of recommendation fairness. The results demonstrate that true preference alignment offers a more personalized and fair assessment than similarity-based measures, revealing significant disparities when sensitive and intersectional attributes are incorporated. Notably, our study finds that intersectional attributes amplify fairness gaps more prominently, especially in less structured domains such as music recommendation on LastFM. These findings suggest that future fairness evaluations in RecLLMs should incorporate true preference alignment to ensure equitable and genuinely personalized recommendations.
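
As a rough illustration of two ideas named in the abstract, the following Python sketch shows how a user profile might be sampled under the three strategies (random, top-rated, recency-focused) and how Jaccard similarity compares two recommendation lists. It is a minimal sketch, not the authors' implementation: the `Interaction` record, the `sample_profile` and `jaccard` function names, the strategy labels, and the profile size `k` are all assumptions made for this example.

```python
import random
from typing import NamedTuple


class Interaction(NamedTuple):
    """Hypothetical record of one user-item interaction (assumed schema)."""
    item: str
    rating: float
    timestamp: int


def sample_profile(history, k, strategy="random", seed=0):
    """Pick at most k items from a user's history to place in an LLM prompt.

    The strategy names mirror the abstract; the selection rules are assumptions.
    """
    if strategy == "random":
        rng = random.Random(seed)  # seeded so the sample is reproducible
        picked = rng.sample(history, min(k, len(history)))
    elif strategy == "top_rated":
        picked = sorted(history, key=lambda x: x.rating, reverse=True)[:k]
    elif strategy == "recent":
        picked = sorted(history, key=lambda x: x.timestamp, reverse=True)[:k]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [x.item for x in picked]


def jaccard(list_a, list_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two item lists."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty lists count as identical
    return len(a & b) / len(a | b)


history = [
    Interaction("Movie A", 5.0, 100),
    Interaction("Movie B", 3.0, 300),
    Interaction("Movie C", 4.0, 200),
]
print(sample_profile(history, 2, "top_rated"))  # highest-rated items first
print(sample_profile(history, 2, "recent"))     # most recent items first

# Lists produced without and with a sensitive attribute in the prompt (made-up data).
neutral = ["A", "B", "C", "D"]
sensitive = ["B", "C", "E", "F"]
print(round(jaccard(neutral, sensitive), 3))
```

A low Jaccard score only says that the two lists differ. As the abstract argues, it cannot tell whether the difference reflects bias or better personalization, which is why the framework also measures alignment with each user's true preferences.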

Year: 2025
Journal: ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY
DOI: https://dx.doi.org/10.1145/3725853
Citation: CFaiRLLM: Consumer Fairness Evaluation in Large-Language Model Recommender System / Deldjoo, Y., Noia, T.D.. - In: ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY. - ISSN 2157-6904. - 16:6(2025), pp. 142.1-142.31. [10.1145/3725853]
Appears in categories: 1.1 Journal article

Files in this product:

No files are associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/302480
