Recommender systems are emerging as an interesting application scenario for Linked Data (LD). In fact, by exploiting the knowledge encoded in LD datasets, a new generation of semantics-aware recommendation engines have been developed in the last years. As Linked Data is often very rich and contains many information that may result irrelevant and noisy for a recommendation task, an initial step of feature selection is always required in order to select the most meaningful portion of the original dataset. Many approaches have been proposed in the literature for feature selection that exploit different statistical dimensions of the original data. In this paper we investigate the role of the semantics encoded in an ontological hierarchy when exploited to select the most relevant properties for a recommendation task. In particular, we compare an approach based on schema summarization with a ``classical'' one, i.e., Information Gain. We evaluated the performance of the two methods in terms of accuracy and aggregate diversity by setting up an experimental testbed relying on the Movielens dataset.

Schema-summarization in Linked-Data-based feature selection for recommender systems

Azzurra Ragone;Corrado Magarelli;Tommaso Di Noia;Eugenio Di Sciascio
2017-01-01

Abstract

Recommender systems are emerging as an interesting application scenario for Linked Data (LD). In fact, by exploiting the knowledge encoded in LD datasets, a new generation of semantics-aware recommendation engines have been developed in the last years. As Linked Data is often very rich and contains many information that may result irrelevant and noisy for a recommendation task, an initial step of feature selection is always required in order to select the most meaningful portion of the original dataset. Many approaches have been proposed in the literature for feature selection that exploit different statistical dimensions of the original data. In this paper we investigate the role of the semantics encoded in an ontological hierarchy when exploited to select the most relevant properties for a recommendation task. In particular, we compare an approach based on schema summarization with a ``classical'' one, i.e., Information Gain. We evaluated the performance of the two methods in terms of accuracy and aggregate diversity by setting up an experimental testbed relying on the Movielens dataset.
32nd Annual ACM Symposium on Applied Computing, SAC 2017
978-1-4503-4486-9
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11589/89463
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • Scopus 20
  • ???jsp.display-item.citation.isi??? ND
social impact