Trajectory analysis of moving vehicles in real time
Master Thesis
Author
Γιαννόπουλος, Κωνσταντίνος
Giannopoulos, Konstantinos
Date
2022
Abstract
The ever-increasing production of spatiotemporal data, the increasing velocity at which it is produced, and the development of real-time streaming data analysis systems have raised new challenges in the field of data analysis. The analysis of trajectory data of moving objects is crucial for companies and organizations that own, and therefore need to manage effectively, fleets of vehicles. Efficient fleet management depends on quick and effective decision making. Thus, real-time and effective analysis of GPS-emitted data is of vital importance.
In this Diploma Thesis, we propose a system that processes GPS-emitted data in real time and creates a concise and explanatory report of each vehicle’s trips. One of the major challenges that real-time streaming data analysis systems face is the effective processing of delayed, out-of-order data. Our proposed implementation addresses this issue: it detects and processes data that reaches the system out of chronological order.
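The abstract does not describe how out-of-order records are detected, so the following is only a minimal sketch of one common approach, a reorder buffer driven by a watermark with bounded lateness, written in plain Python rather than against any streaming framework. The names GpsPoint, ReorderBuffer and max_delay are hypothetical and are not taken from the thesis.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class GpsPoint:
    # Only the event timestamp takes part in ordering (hypothetical record layout).
    timestamp: int
    vehicle_id: str = field(compare=False)
    lat: float = field(compare=False)
    lon: float = field(compare=False)


class ReorderBuffer:
    """Emit points in event-time order, tolerating delays up to max_delay seconds."""

    def __init__(self, max_delay):
        self.max_delay = max_delay
        self.heap = []                      # min-heap keyed on timestamp
        self.watermark = float("-inf")      # everything at or before this is final
        self.late = []                      # points that arrived behind the watermark

    def push(self, point):
        """Accept one point; return the points that are now safe to emit, in order."""
        if point.timestamp <= self.watermark:
            # Too late to be reordered: set aside for separate handling.
            self.late.append(point)
            return []
        heapq.heappush(self.heap, point)
        # Assumption: the watermark trails the largest timestamp seen by max_delay.
        self.watermark = max(self.watermark, point.timestamp - self.max_delay)
        ready = []
        while self.heap and self.heap[0].timestamp <= self.watermark:
            ready.append(heapq.heappop(self.heap))
        return ready

    def flush(self):
        """Drain the remaining buffered points in order (end of stream)."""
        ready = []
        while self.heap:
            ready.append(heapq.heappop(self.heap))
        return ready


# Points arrive with timestamps 10, 12, 11, 14, 20, 5; the buffer tolerates 5 s of delay.
buffer = ReorderBuffer(max_delay=5)
emitted = []
for ts in [10, 12, 11, 14, 20, 5]:
    emitted.extend(buffer.push(GpsPoint(ts, "vehicle-1", 37.98, 23.73)))
emitted.extend(buffer.flush())
```

In this sketch, a point that arrives slightly out of order (timestamp 11 after 12) is re-sorted before emission, while a point that arrives after the watermark has passed it (timestamp 5) is collected in late, where a real system could use it to correct a trip report that was already produced.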
The implementation of the proposed system is based on scalable technologies. Apache Kafka was adopted as the storage layer for the GPS-emitted data. The stored data is processed by Apache Flink, which is capable of distributed processing of bounded and unbounded data streams with high throughput, low latency, and fault tolerance. The processed data is then stored in Elasticsearch, which allows fast search and retrieval of the required statistics. The last step of the proposed implementation is the visualization of the stored statistics. Kibana is used to create all the necessary dashboards, providing the end user with a high-level overview of the results.
This thesis is structured as follows. First, the theoretical foundations of big data analysis for both streaming and bounded data are presented. Then, the most popular real-time stream processing frameworks are presented and compared. Next, we explain in detail the proposed architecture, how our application processes data that reaches the system in chronological order, and how it handles delayed data. Finally, we present the performance of the implemented system.