A computational model for Mapreduce job flow / Di Noia, Tommaso; Mongiello, Marina; Di Sciascio, Eugenio. - Electronic. - 1195:(2014), pp. 49-54. (Paper presented at the 29th Italian Conference on Computational Logic, CILC 2014, held in Torino, Italy, June 16-18, 2014).
A computational model for Mapreduce job flow
Tommaso Di Noia; Marina Mongiello; Eugenio Di Sciascio
2014-01-01
Abstract
Massive quantities of data are today processed using parallel computing frameworks that distribute computations across large clusters of many machines. Such frameworks are adopted in big data analytics tasks, such as recommender systems, social network analysis, and legal investigation, that involve iterative computations over large datasets. One of the most widely used frameworks is MapReduce, which is scalable and well suited to data-intensive processing, with a computation model characterized by interleaved sequential and parallel processing. Its open-source implementation, Hadoop, is adopted by many cloud infrastructures, such as Google, Yahoo, Amazon, and Facebook. In this paper we propose a formal approach to modeling the MapReduce framework, using model checking and temporal logics to verify reliability and load-balancing properties of the MapReduce job flow.