POLITECNICO DI BARI - Research Products Catalog

RDF2Vec: RDF Graph Embeddings and Their Applications

Petar Ristoski; Jessica Rosati (Member of the Collaboration Group); Tommaso Di Noia; Renato De Leone; Heiko Paulheim

2019

Abstract

Linked Open Data has been recognized as a valuable source of background information in many data mining and information retrieval tasks. However, most existing tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that takes language modeling techniques for unsupervised feature extraction from sequences of words and adapts them to RDF graphs. We generate sequences by leveraging local information from graph sub-structures, harvested by Weisfeiler–Lehman Subtree RDF Graph Kernels and graph walks, and learn latent numerical representations of entities in RDF graphs. We evaluate our approach on three different tasks: (i) standard machine learning tasks, (ii) entity and document modeling, and (iii) content-based recommender systems. The evaluation shows that the proposed entity embeddings outperform existing techniques, and that pre-computed feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be easily reused for different tasks.
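The sequence-generation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy graph of made-up triples (the `ex:` names are hypothetical and not taken from the paper) and uses plain uniform random walks; it does not reproduce the Weisfeiler–Lehman variant or the authors' implementation. Each walk becomes a token sequence that could then be passed to a word-embedding model as if it were a sentence.

```python
import random
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not from the paper.
TRIPLES = [
    ("ex:Bari", "ex:locatedIn", "ex:Italy"),
    ("ex:Italy", "ex:partOf", "ex:Europe"),
    ("ex:Bari", "ex:hasUniversity", "ex:PolitecnicoDiBari"),
    ("ex:PolitecnicoDiBari", "ex:locatedIn", "ex:Bari"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adjacency = defaultdict(list)
    for s, p, o in triples:
        adjacency[s].append((p, o))
    return adjacency


def random_walks(adjacency, start, depth, n_walks, seed=0):
    """Generate n_walks random walks of at most `depth` hops from `start`.

    Each walk is a token sequence: entity, predicate, entity, ...
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(n_walks):
        walk = [start]
        node = start
        for _ in range(depth):
            edges = adjacency.get(node)
            if not edges:  # dead end: the node has no outgoing edges
                break
            predicate, node = rng.choice(edges)
            walk.extend([predicate, node])
        walks.append(walk)
    return walks


adjacency = build_adjacency(TRIPLES)
walks = random_walks(adjacency, "ex:Bari", depth=2, n_walks=4)
for walk in walks:
    print(" ".join(walk))
```

In this sketch the entities and predicates play the role of words, and each walk plays the role of a sentence, which is the analogy the abstract draws between RDF graphs and text.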

Year: 2019
Journal: SEMANTIC WEB
DOI: https://dx.doi.org/10.3233/SW-180317
Citation: RDF2Vec: RDF Graph Embeddings and Their Applications / Ristoski, Petar; Rosati, Jessica; Di Noia, Tommaso; De Leone, Renato; Paulheim, Heiko. - In: SEMANTIC WEB. - ISSN 1570-0844. - PRINT. - 10:4(2019), pp. 721-752. [10.3233/SW-180317]
Appears in types: 1.1 Journal article

Files in this product:

File	Size	Format
swj1738-4.pdf (open access; Description: Accepted version; Type: Post-print document; License: All rights reserved)	932.34 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/203371
