A Web Text Mining Flexible Architecture / Castellano, M.; Mastronardi, G.; Aprile, A.; Tarricone, G. - In: WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY. - ISSN 2010-376X. - 1:8(2007), pp. 79.516-79.523.
A Web Text Mining Flexible Architecture
M. Castellano; G. Mastronardi
2007-01-01
Abstract
Text Mining is an important step of the Knowledge Discovery process. It is used to extract hidden information from unstructured or semi-structured data. This is fundamental because much Web information is semi-structured, owing to the nested structure of HTML code, and much of it is also linked and redundant. Web Text Mining supports the whole knowledge mining process in mining, extracting and integrating useful data, information and knowledge from Web page contents. In this paper, we present a Web Text Mining process able to discover knowledge in a distributed and heterogeneous multi-organization environment. The Web Text Mining process is based on a flexible architecture and is implemented in four steps that examine Web content and extract useful hidden information through mining techniques. Our Web Text Mining prototype starts from the retrieval of Web job offers, from which a Text Mining process draws out the information useful for classifying them quickly: essentially, the job offer's place and skills.
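As an illustration only, the extraction of place and skills from a Web job offer might be sketched as follows. This is not the authors' implementation: the abstract does not name the four steps or the mining techniques used, so the sketch assumes a simple vocabulary lookup. The names `KNOWN_PLACES`, `KNOWN_SKILLS`, `TextExtractor` and `extract_place_and_skills`, as well as the sample offer, are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical vocabularies (assumption): the abstract does not list
# the places or skills that the prototype recognizes.
KNOWN_PLACES = {"bari", "milan", "rome"}
KNOWN_SKILLS = {"java", "sql", "python"}


class TextExtractor(HTMLParser):
    """Collects the text nodes of a page, discarding the nested HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by HTMLParser for every run of text between tags.
        self.chunks.append(data)


def extract_place_and_skills(html):
    """Return the known places and skills mentioned in a job-offer page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Split on whitespace and trim surrounding punctuation from each token.
    tokens = {tok.strip(".,;:()") for tok in text.split()}
    return {
        "places": sorted(tokens & KNOWN_PLACES),
        "skills": sorted(tokens & KNOWN_SKILLS),
    }


# Hypothetical job offer used only to exercise the sketch.
offer = "<div><h1>Java developer</h1><p>Location: Bari. Skills: Java, SQL.</p></div>"
print(extract_place_and_skills(offer))
```

A fixed vocabulary keeps the example short; a real system would more plausibly rely on the mining techniques mentioned in the abstract, which are not specified here.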