POLITECNICO DI BARI - Catalogue of Research Products

Do LLMs Memorize Recommendation Datasets? A Preliminary Study on MovieLens-1M / Di Palma, Dario; Merra, Felice Antonio; Sfilio, Maurizio; Anelli, Vito Walter; Narducci, Fedelucio; Di Noia, Tommaso. - ELECTRONIC. - (2025), pp. 2582-2586. (48th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2025, Padova, July 13-18, 2025) [10.1145/3726302.3730178].

Do LLMs Memorize Recommendation Datasets? A Preliminary Study on MovieLens-1M

Dario Di Palma; Felice Antonio Merra; Maurizio Sfilio; Vito Walter Anelli; Fedelucio Narducci; Tommaso Di Noia

2025

Abstract

Large Language Models (LLMs) have become increasingly central to recommendation scenarios due to their remarkable natural language understanding and generation capabilities. Although significant research has explored the use of LLMs for various recommendation tasks, little effort has been dedicated to verifying whether they have memorized public recommendation datasets as part of their training data. This is undesirable because memorization reduces the generalizability of research findings: benchmarking on memorized datasets does not guarantee generalization to unseen datasets. Furthermore, memorization can amplify biases; for example, some popular items may be recommended more frequently than others.

Year: 2025
Conference title: 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2025
ISBN: 979-8-4007-1592-1
DOI: https://dx.doi.org/10.1145/3726302.3730178
Appears in types: 4.1 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
2025_Do_LLMs_Memorize_Recommendation_Datasets?_A_Preliminary_Study_on_MovieLens-1M_pdfeditoriale.pdf	Open access	Publisher's version	Creative Commons	960.98 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/292384
