This paper presents an automatic document processing system that extracts data from medical laboratory results printed on paper. The goal of the research is to automate the collection of medical data and to enable efficient management and dissemination of the information. The following processing steps are described in detail: image preprocessing; layout analysis to identify the tables contained in the document; and extraction and classification of the laboratory results. Notable features of the system include the use of an open source OCR engine as the basis for further processing and the storage of the retrieved data in XML format for ease of sharing. The knowledge base used to guide the data extraction is also explained. The proposed approach has been tested on several document formats, and its performance has been analyzed.
|Title:||An automatic document processing system for medical data extraction|
|Publication date:||2015|
|Digital Object Identifier (DOI):||http://dx.doi.org/10.1016/j.measurement.2014.10.032|
|Appears in categories:||1.1 Journal article|