Traffic prediction on road networks
Πρόβλεψη κυκλοφορίας σε οδικά δίκτυα
Master Thesis
Author
Karatrantou, Afroditi
Καρατράντου, Αφροδίτη
Date
2024-11
Advisor
Pelekis, Nikolaos
Πελέκης, Νικόλαος
Keywords
Python ; Urban transit systems ; Machine learning ; Estimated Time of Arrival (ETA) ; Random Forest (RF)
Abstract
Accurate Estimated Time of Arrival (ETA) prediction is essential for optimizing urban transit systems and enhancing user experience. This thesis examines machine learning techniques, including a Neural Network (NN), Random Forest (RF), XGBoost, Long Short-Term Memory (LSTM) networks, and Graph Convolutional Networks (GCNs), for predicting the ETAs of urban transit vehicles from real-time GPS data. A data preprocessing pipeline was developed to handle noise and inconsistencies in the transit data, involving steps such as normalization, imputation of missing values, and k-fold cross-validation. The results show that the Random Forest model, especially when optimized with k-fold cross-validation, achieves the lowest Mean Squared Error (MSE) and Mean Absolute Error (MAE) while balancing prediction accuracy with computational efficiency. The LSTM model also produced promising results, effectively capturing temporal dependencies. In contrast, both the GCN and XGBoost models produced significantly lower accuracy, although XGBoost was computationally efficient. The simple NN model performed well despite its simplicity, with its high number of neurons contributing to its strong predictive accuracy. Additionally, a SHAP (SHapley Additive exPlanations) analysis was performed to identify the features that most influence ETA predictions; temporal features, such as Unix time, were the most influential. This thesis contributes to the field by providing an easy-to-understand method for predicting ETAs in urban transit. Future work could involve expanding the dataset and integrating additional features, such as traffic congestion and weather conditions, to enhance model performance.
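The evaluation scheme named in the abstract, k-fold cross-validation scored with MSE and MAE, can be illustrated with a minimal, standard-library-only sketch. This is not the thesis's code: the one-feature least-squares model is a deliberately simple stand-in for the Random Forest, the data are synthetic, and every name (`k_fold_indices`, `fit_linear`, `cross_validate`, `distance_km`, `travel_min`) is hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def fit_linear(xs, ys):
    """One-feature least squares; a stand-in model, NOT a Random Forest."""
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = cov / var if var else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def cross_validate(xs, ys, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, repeat for each fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        predict = fit_linear([xs[i] for i in train], [ys[i] for i in train])
        preds = [predict(xs[i]) for i in held_out]
        truth = [ys[i] for i in held_out]
        scores.append({"mse": mse(truth, preds), "mae": mae(truth, preds)})
    return scores


# Synthetic, assumed data: remaining distance (km) vs. travel time (minutes).
rng = random.Random(42)
distance_km = [rng.uniform(0.5, 10.0) for _ in range(200)]
travel_min = [3.0 * d + 2.0 + rng.gauss(0.0, 1.0) for d in distance_km]

scores = cross_validate(distance_km, travel_min, k=5)
avg_mse = mean(s["mse"] for s in scores)
avg_mae = mean(s["mae"] for s in scores)
print(f"mean MSE: {avg_mse:.3f}, mean MAE: {avg_mae:.3f}")
```

Averaging the per-fold scores gives a less optimistic error estimate than a single train/test split, because every sample is held out exactly once. Swapping `fit_linear` for another regressor leaves the rest of the loop unchanged.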