Electronic case report forms generation from pathology reports by ARGO, automatic record generator for onco-hematology / Zaccaria, Gian Maria; Colella, Vito; Colucci, Simona; Clemente, Felice; Pavone, Fabio; Vegliante, Maria Carmela; Esposito, Flavia; Opinto, Giuseppina; Scattone, Anna; Loseto, Giacomo; Minoia, Carla; Rossini, Bernardo; Quinto, Angela Maria; Angiulli, Vito; Grieco, Luigi Alfredo; Fama, Angelo; Ferrero, Simone; Moia, Riccardo; Di Rocco, Alice; Quaglia, Francesca Maria; Tabanelli, Valentina; Guarini, Attilio; Ciavarella, Sabino. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - ELECTRONIC. - 11:1(2021). [10.1038/s41598-021-03204-z]
Electronic case report forms generation from pathology reports by ARGO, automatic record generator for onco-hematology
Zaccaria, Gian Maria;Colucci, Simona;Grieco, Luigi Alfredo;
2021-01-01
Abstract
The unstructured nature of Real-World (RW) data from onco-hematological patients and the scarce accessibility to integrated systems limit the use of RW information for research purposes. Natural Language Processing (NLP) might help in transposing unstructured reports into standardized electronic health records. We exploited NLP to develop an automated tool, named ARGO (Automatic Record Generator for Onco-hematology), to recognize information from pathology reports and populate electronic case report forms (eCRFs) pre-implemented in REDCap. ARGO was applied to hemo-lymphopathology reports of diffuse large B-cell, follicular, and mantle cell lymphomas, and assessed for accuracy (A), precision (P), recall (R) and F1-score (F) on internal (n = 239) and external (n = 93) report series. In total, 326 reports (98.2%) were converted into the corresponding eCRFs. Overall, ARGO showed high performance in capturing (1) identification report number (all metrics > 90%), (2) biopsy date (all metrics > 90% in both series), (3) specimen type (A of 86.6% and 91.4%, P of 98.5% and 100.0%, F of 92.5% and 95.5%, and R of 87.2% and 91.4% for the internal and external series, respectively), and (4) diagnosis (100% of P with A, R and F of 90% in both series). We developed and validated a generalizable tool that generates structured eCRFs from real-life pathology reports.

File | Size | Format |
---|---|---|
2021_Electronic_case_report_forms_generation_from_pathology_reports_by_ARGO_pdfeditoriale.pdf (open access; type: publisher's version; license: Creative Commons) | 2.27 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
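
The four metrics named in the abstract (accuracy, precision, recall and F1-score) can be illustrated with a minimal Python sketch. It uses the standard binary-classification definitions applied to one extracted field. The `FieldCounts` class, the meaning assigned to each count, and the example numbers are assumptions made here for illustration; the abstract does not state how the authors tallied correct and incorrect extractions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldCounts:
    """Hypothetical per-field tallies for one eCRF variable (not from the paper)."""
    tp: int  # value extracted and correct
    fp: int  # value extracted but wrong
    fn: int  # value present in the report but missed
    tn: int  # value absent in the report and correctly left empty


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when the denominator is zero.
    return num / den if den else 0.0


def evaluate(c: FieldCounts) -> dict[str, float]:
    """Compute accuracy, precision, recall and F1 from the four counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    # F1 is the harmonic mean of precision and recall.
    f1 = _ratio(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts for a single field, chosen only to exercise the formulas.
    example = FieldCounts(tp=80, fp=0, fn=10, tn=10)
    for name, value in evaluate(example).items():
        print(f"{name}: {value:.3f}")
```

With no false positives, precision is 1.0 regardless of how many values are missed, which is why a field can reach 100% precision while its recall and accuracy are lower.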