Traditional Machine Learning and Deep Learning techniques (data acquisition, preparation, model training and evaluation) take a lot of computational resources and time to produce even a simple prediction model, especially when implemented on a single machine. Intuitively, the demand for computational requirements is higher in case of management of Big Data and training of complex models. Thus, a paradigm shift from a single machine to a BD-oriented approach is required for making traditional Machine Learning and Deep Learning techniques fit to Big Data. In particular, it emerges the need for developing and deploying Big Data Analytics Infrastructures on cluster of machines. In this context, main features and principles of Distributed Deep Learning frameworks are here discussed. The main contribution of this paper is a systematic review of proposed solutions, aimed at investigating under a unifying lens their foundational elements, functional features and capabilities, despite the inherent literature fragmentation. To this, we conducted a literature search in Scopus and Google Scholar. This review also compares Distributed Deep Learning approaches according to more technical facets: implemented of parallelism techniques, supported hardware, model parameters sharing modalities, computation modalities for stochastic gradient descent and compatibility with other frameworks.
A Systematic Review of Distributed Deep Learning Frameworks for Big Data / Berloco, Francesco; Bevilacqua, Vitoantonio; Colucci, Simona. - STAMPA. - 13395:(2022), pp. 242-256. (Intervento presentato al convegno 18th International Conference on Intelligent Computing, ICIC 2022 tenutosi a Xi'an, China nel August 7-11, 2022) [10.1007/978-3-031-13832-4_21].
A Systematic Review of Distributed Deep Learning Frameworks for Big Data
Francesco Berloco;Vitoantonio Bevilacqua;Simona Colucci
2022-01-01
Abstract
Traditional Machine Learning and Deep Learning techniques (data acquisition, preparation, model training and evaluation) take a lot of computational resources and time to produce even a simple prediction model, especially when implemented on a single machine. Intuitively, the demand for computational requirements is higher in case of management of Big Data and training of complex models. Thus, a paradigm shift from a single machine to a BD-oriented approach is required for making traditional Machine Learning and Deep Learning techniques fit to Big Data. In particular, it emerges the need for developing and deploying Big Data Analytics Infrastructures on cluster of machines. In this context, main features and principles of Distributed Deep Learning frameworks are here discussed. The main contribution of this paper is a systematic review of proposed solutions, aimed at investigating under a unifying lens their foundational elements, functional features and capabilities, despite the inherent literature fragmentation. To this, we conducted a literature search in Scopus and Google Scholar. This review also compares Distributed Deep Learning approaches according to more technical facets: implemented of parallelism techniques, supported hardware, model parameters sharing modalities, computation modalities for stochastic gradient descent and compatibility with other frameworks.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.