Data pipeline optimization: a case study on the maritime domain
Βελτιστοποίηση αγωγών δεδομένων : μια έρευνα στο ναυτιλιακό τομέα

Bachelor Dissertation
Author
Segkos, Nikolaos
Σέγκος, Νικόλαος
Date
2025-09
Supervisor
Theodoridis, Yannis
Θεοδωρίδης, Ιωάννης
Keywords
Data ingestion; Scalable pipelines; Maritime domain; Apache Kafka; AIS Data
Abstract
Data pipelines play an increasingly important role in the modern computing ecosystem. Regardless of size, complexity, or use case, they form the backbone of any system that processes data. However, the rate at which data is ingested now outpaces the capabilities of vertically scaled architectures, leading to increased computing costs and potential disruptions to business continuity. Frameworks such as Apache Kafka, a distributed data streaming platform built on the publish-subscribe model, have gained popularity for addressing these challenges by providing fault tolerance, high throughput, and horizontal scaling.
In this thesis, we analyze the data ingestion workflow of the AIS Antenna of the University of Piraeus, with the goal of identifying performance bottlenecks and proposing architectural optimizations. In the existing workflow, database polling was found to be the primary cause of high end-to-end latency, and message loss exceeded 70%. To solve these problems, we propose an efficient architecture that employs Apache Kafka as the message transmission backbone and adds a Redis cache layer for rapid data initialization. Our experimental study over large real-world data streams shows that the proposed workflow effectively eliminates message loss and reduces end-to-end message latency by 98.9%, enabling horizontal scalability as well as real-time visualization of maritime traffic.
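The division of labor between the two components named in the abstract can be illustrated with a minimal, self-contained sketch. It is not the implementation used in the thesis and it does not call the Kafka or Redis APIs: `InMemoryTopic`, `LatestPositionCache`, `ingest`, the consumer group names, and the message fields (`mmsi`, `lat`, `lon`) are all hypothetical stand-ins chosen for illustration. The sketch assumes that the log delivers every message to each subscriber independently, while the cache holds only the latest position per vessel so that a newly connected client can initialize its view without replaying the stream.

```python
from collections import defaultdict


class InMemoryTopic:
    """Toy stand-in for a publish-subscribe log (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._log = []                    # append-only list of messages
        self._offsets = defaultdict(int)  # next unread position per consumer group

    def publish(self, message):
        self._log.append(message)

    def subscribe_from_end(self, group):
        # A late joiner skips history and receives only new messages.
        self._offsets[group] = len(self._log)

    def consume(self, group):
        # Return every message the group has not seen yet, then advance its offset.
        start = self._offsets[group]
        batch = self._log[start:]
        self._offsets[group] = len(self._log)
        return batch


class LatestPositionCache:
    """Toy stand-in for a key-value cache (hypothetical, not the Redis API)."""

    def __init__(self):
        self._latest = {}

    def update(self, message):
        # Keep only the most recent message per vessel identifier.
        self._latest[message["mmsi"]] = message

    def snapshot(self):
        return dict(self._latest)


def ingest(topic, cache, message):
    """Publish a decoded AIS message and refresh the latest-position cache."""
    topic.publish(message)
    cache.update(message)


topic = InMemoryTopic()
cache = LatestPositionCache()

ingest(topic, cache, {"mmsi": 111, "lat": 37.94, "lon": 23.64})
ingest(topic, cache, {"mmsi": 222, "lat": 37.90, "lon": 23.60})
ingest(topic, cache, {"mmsi": 111, "lat": 37.95, "lon": 23.65})

# A map client connecting late initializes from the cache, then follows new messages.
initial_view = cache.snapshot()
topic.subscribe_from_end("map")

ingest(topic, cache, {"mmsi": 222, "lat": 37.91, "lon": 23.61})
live_updates = topic.consume("map")

# A second, independent group reads the whole log from the start.
archived = topic.consume("archiver")
```

In this sketch the map client never queries a database on a timer: it receives one snapshot from the cache and afterwards only the messages published since it subscribed, while the archiving group still sees every message.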


