The unstructured nature of Real-World (RW) data from onco-hematological patients and the limited access to integrated systems restrict the use of RW information for research purposes. Natural Language Processing (NLP) might help convert unstructured reports into standardized electronic health records. We used NLP to develop an automated tool, named ARGO (Automatic Record Generator for Onco-hematology), to recognize information from pathology reports and populate electronic case report forms (eCRFs) pre-implemented in REDCap. ARGO was applied to hemo-lymphopathology reports of diffuse large B-cell, follicular, and mantle cell lymphomas, and assessed for accuracy (A), precision (P), recall (R), and F1-score (F) on internal (n = 239) and external (n = 93) report series. Of the 332 reports, 326 (98.2%) were converted into corresponding eCRFs. Overall, ARGO showed high performance in capturing (1) the report identification number (all metrics > 90%), (2) biopsy date (all metrics > 90% in both series), (3) specimen type (A of 86.6% and 91.4%, P of 98.5% and 100.0%, R of 87.2% and 91.4%, and F of 92.5% and 95.5% for the internal and external series, respectively), and (4) diagnosis (P of 100%, with A, R, and F of 90%, in both series). We developed and validated a generalizable tool that generates structured eCRFs from real-life pathology reports.
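
As a quick consistency check on the figures in the abstract, the specimen-type F1-scores can be recomputed from the reported precision and recall, assuming the standard definition of F1 as their harmonic mean. The sketch below is illustrative only: the function and variable names are hypothetical and are not taken from ARGO.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Specimen-type precision and recall as reported in the abstract.
reported = {
    "internal": {"precision": 0.985, "recall": 0.872},
    "external": {"precision": 1.000, "recall": 0.914},
}

# Recompute F1 per series, expressed as a percentage with one decimal.
f1_percent = {
    series: round(100 * f1_score(m["precision"], m["recall"]), 1)
    for series, m in reported.items()
}

# Share of reports converted into eCRFs: 326 of the 239 + 93 reports.
conversion_percent = round(100 * 326 / (239 + 93), 1)

print(f1_percent)
print(conversion_percent)
```

Under these assumptions the recomputed values are 92.5% and 95.5% for F1 and 98.2% for the conversion rate, which agree with the figures stated in the abstract.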

Electronic case report forms generation from pathology reports by ARGO, automatic record generator for onco-hematology / Zaccaria, Gian Maria; Colella, Vito; Colucci, Simona; Clemente, Felice; Pavone, Fabio; Vegliante, Maria Carmela; Esposito, Flavia; Opinto, Giuseppina; Scattone, Anna; Loseto, Giacomo; Minoia, Carla; Rossini, Bernardo; Quinto, Angela Maria; Angiulli, Vito; Grieco, Luigi Alfredo; Fama, Angelo; Ferrero, Simone; Moia, Riccardo; Di Rocco, Alice; Quaglia, Francesca Maria; Tabanelli, Valentina; Guarini, Attilio; Ciavarella, Sabino. - In: SCIENTIFIC REPORTS. - ISSN 2045-2322. - Electronic. - 11:1(2021). [10.1038/s41598-021-03204-z]

Electronic case report forms generation from pathology reports by ARGO, automatic record generator for onco-hematology

Zaccaria, Gian Maria; Colucci, Simona; Grieco, Luigi Alfredo
2021-01-01

Files in this item:
File: 2021_Electronic_case_report_forms_generation_from_pathology_reports_by_ARGO_pdfeditoriale.pdf
Access: Open access
Type: Publisher's version
License: Creative Commons
Size: 2.27 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11589/238581
Citations
  • Scopus 3
  • ISI 1