Multimodal approaches in healthcare Big Data Analytics for precision medicine / Berloco, Francesco. - Electronic. - (2024).
Multimodal approaches in healthcare Big Data Analytics for precision medicine
Berloco, Francesco
2024-01-01
Abstract
The primary purpose of this thesis is to present several pipelines for developing multimodal Decision Support Systems that leverage omics and healthcare Big Data analytics, contributing to the advancement of the precision medicine field. Healthcare Big Data are analyzed using Machine Learning and Deep Learning models, implemented in prototype form as biomedical Decision Support Systems, across different healthcare domains such as medical image analysis, bioinformatics, natural language processing and survival analysis. Deep Learning models play a crucial role in the medical imaging and bioinformatics fields. In medical imaging, they are used to extract features from medical images and to predict disease status or genetic mutations. In bioinformatics, Deep Learning plays a pivotal role in extracting actionable insights from omics data clusters, facilitating a deeper understanding of biological systems (e.g., a patient). Such data are heterogeneous and generated in large volumes over constant time periods. Concerning survival analysis, Machine Learning and Deep Learning are widely used for assessing and categorizing the severity of pathologies over time, aiding personalized treatment strategies. Notably, most medical and clinical examinations are accompanied by free-text reports; Machine Learning and Deep Learning can be exploited to extract useful information from them through natural language processing. In this context, the objective of this thesis is to develop and validate several pipelines for heterogeneous healthcare Big Data analytics. Specifically, two sets of pipelines are presented: multimodal and unimodal. The former integrate medical imaging data with omics data to study Pancreatic Ductal Adenocarcinoma from different perspectives. The latter cover medical image classification, survival analysis, and natural language processing in different use cases. Technical contributions of this work include designing novel algorithms, improving existing workflows, designing multimodal algorithms for analyzing heterogeneous data from different sources, and incorporating Explainable Artificial Intelligence algorithms for interpreting the decisions of the investigated models. To develop and validate the proposed pipelines, several heterogeneous case studies were examined, using either public or private datasets. For the multimodal pipelines, the proposed applications focus on pancreatic cancer and include: (i) multi-omics analysis (radiomics, genomics and clinical data) for overall survival and recurrence prediction; (ii) multimodal analysis based on pathomics and transcriptomics for gene mutation prediction. For the unimodal pipelines, the proposed applications include: (i) enhancing model selection in survival analysis using time-dependent explainability algorithms for Obstructive Sleep Apnea; (ii) Deep Learning approaches for medical image classification for IgA nephropathy; (iii) shape-based breast lesion classification using digital tomosynthesis images; (iv) diagnosis standardization from free-text reports.

File | Size | Format | |
---|---|---|---|
37 ciclo-BERLOCO Francesco.pdf (open access; type: doctoral thesis; license: all rights reserved) | 60.65 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.