Κατανεμημένος υπολογισμός πινάκων αφετηρίας - προορισμού με χρήση της τεχνολογίας mapreduce
Distributed computation of origin-destination matrices using the MapReduce technology
The purpose of the present work is to create an experimental distributed algorithm for processing large volumes of spatiotemporal data. The main goal is to generate origin-destination (O-D) matrices using the MapReduce programming paradigm and the Hadoop framework. The ultimate objective is to install and run this algorithm on a small computer cluster of the infoLab laboratory in order to produce performance results and evaluate them. More specifically, the input consists of coordinate samples of transport routes, supplied by users in polar form (geospatial data) and stored unsorted, in raw format, in large text files. The first contribution of our work is the correct representation of the trajectories on an optimal spatiotemporal grid and the subsequent creation of the appropriate O-D matrices. The second contribution is the proper application of MapReduce technology to the management of these O-D matrices. In particular, a three-step procedure is followed which starts from either maps (one mapper per cell) or trajectories (one trajectory per cell) of the grid and produces reduced O-D matrices that contain all the information needed for further processing by a distributed system based on Hadoop.
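As a rough illustration of how O-D matrices can be derived in a map/shuffle/reduce style, the following minimal sketch simulates the phases in plain Python rather than on Hadoop. It is not the three-step procedure of this work. The record layout `(trajectory_id, timestamp, lat, lon)`, the cell size `CELL_DEG`, the purely spatial grid, and all function names are assumptions made for the example, and the origin and destination of a trajectory are simply taken to be the cells of its earliest and latest samples.

```python
from collections import defaultdict
import math

CELL_DEG = 0.01  # hypothetical grid cell size in degrees (assumption)


def to_cell(lat, lon, cell_deg=CELL_DEG):
    """Map a coordinate to an integer (row, col) grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def map_sample(record):
    """Map phase: emit (trajectory_id, (timestamp, cell)) for one raw sample."""
    traj_id, ts, lat, lon = record
    return traj_id, (ts, to_cell(lat, lon))


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_trajectory(samples):
    """Reduce phase 1: sort unsorted samples by time, keep first and last cell."""
    ordered = sorted(samples)
    return ordered[0][1], ordered[-1][1]


def od_matrix(records):
    """Build a sparse O-D matrix {(origin_cell, dest_cell): trip_count}."""
    by_traj = shuffle(map_sample(r) for r in records)
    # Second map step: emit ((origin, destination), 1) per trajectory.
    od_pairs = ((reduce_trajectory(s), 1) for s in by_traj.values())
    # Reduce phase 2: sum the counts for each (origin, destination) pair.
    return {od: sum(ones) for od, ones in shuffle(od_pairs).items()}
```

Calling `od_matrix` on a list of raw sample tuples returns a dictionary keyed by `(origin_cell, destination_cell)`. On a real cluster each phase would run as a separate distributed job, and a sparse representation like this one keeps the matrix small when most cell pairs have no trips.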