Analysis of long-duration public procurement procedures for temporal risk detection using machine learning techniques and business process simulation - An operational early-warning architecture with Streamlit and FastAPI

Keywords
Public procurement ; Machine learning ; Duration prediction ; BPMN simulation ; Early-warning system ; Risk management ; LightGBM ; FastAPI ; Streamlit
Abstract
Title: «Analyzing Long-Duration Public Procurement Procedures for Temporal Risk Detection through Machine Learning and Business Process Simulation: An Operational Early-Warning Architecture with Streamlit and FastAPI»
This Master’s Thesis investigates the temporal behavior of public procurement procedures during the pre-award stage, with a particular focus on the early identification and quantification of the risk of severe time delays. To this end, the study develops an integrated early-warning framework that combines Machine Learning techniques with BPMN-based business process simulation. The research is structured around four core research questions addressing: (i) the factors influencing the duration and delay risk of public procurement procedures at the pre-award stage, (ii) the potential of large-scale data to reliably distinguish between different temporal regimes under conditions of high heterogeneity, (iii) the operational superiority of a two-stage discriminative prediction architecture compared to single-stage regression approaches, and (iv) the transformation of predictive outputs into operationally interpretable scenarios through process simulation.
The empirical analysis is based on a multi-country dataset comprising more than 8.6 million public procurement procedures from the period 2009–2024. The results show that procedure durations exhibit pronounced right skewness, heavy-tailed distributions, and recurring spikes at institutionally significant temporal thresholds. These findings indicate that evaluating predictive models solely with mean-error metrics is insufficient, as such metrics obscure systematic failures near critical time thresholds that are directly relevant to decision-making. To address this limitation, a two-stage discriminative prediction architecture is proposed, combining a first-stage classifier that estimates the probability of exceeding 720 days with two specialized regressors for short and long durations, respectively. This approach is evaluated against single-stage regression models and is shown to be operationally superior in terms of temporal risk calibration. The final model achieves a Mean Absolute Error (MAE) of 229.6 days on a fully unseen test set (2023) and near-zero deviation in predicting the proportion of long procedures (ΔLong≥720 ≈ +0.1 percentage points), effectively eliminating the structural bias observed in single-stage approaches.
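The two-stage idea and the two evaluation metrics named above can be illustrated with a minimal Python sketch. It is not the thesis implementation. The hard routing rule (a 0.5 probability cutoff), the toy stand-in models, the `lots` feature, and the sample durations are all assumptions made for illustration, and ΔLong≥720 is read here as the predicted share of procedures at or above 720 days minus the observed share, in percentage points. In practice the three callables would be trained models, for example LightGBM estimators.

```python
from typing import Callable, Dict, List, Sequence

LONG_THRESHOLD_DAYS = 720  # threshold named in the abstract

Row = Dict[str, float]
Model = Callable[[Row], float]


def two_stage_predict(row: Row, classifier: Model, short_regressor: Model,
                      long_regressor: Model, cutoff: float = 0.5) -> float:
    """Route a procedure to the short or long regressor.

    Assumption: hard routing on the stage-one probability; the cutoff of 0.5
    is illustrative and not taken from the thesis.
    """
    p_long = classifier(row)  # estimated P(duration >= 720 days)
    regressor = long_regressor if p_long >= cutoff else short_regressor
    return regressor(row)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted durations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def long_share(durations: Sequence[float]) -> float:
    """Fraction of durations at or above the 720-day threshold."""
    return sum(d >= LONG_THRESHOLD_DAYS for d in durations) / len(durations)


def delta_long_pp(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Predicted minus observed share of long procedures, in percentage points."""
    return 100.0 * (long_share(predicted) - long_share(actual))


# Hypothetical stand-ins for trained models, used only to make the sketch runnable.
def toy_classifier(row: Row) -> float:
    return 0.9 if row["lots"] > 10 else 0.1


def toy_short_regressor(row: Row) -> float:
    return 150.0


def toy_long_regressor(row: Row) -> float:
    return 850.0


def toy_single_stage(row: Row) -> float:
    return 500.0  # a regressor that collapses towards the overall mean


# Made-up sample: two short and two long procedures.
rows: List[Row] = [{"lots": 2}, {"lots": 15}, {"lots": 3}, {"lots": 20}]
actual = [100.0, 900.0, 200.0, 800.0]

two_stage = [
    two_stage_predict(r, toy_classifier, toy_short_regressor, toy_long_regressor)
    for r in rows
]
single_stage = [toy_single_stage(r) for r in rows]

for name, preds in [("two-stage", two_stage), ("single-stage", single_stage)]:
    print(name, mean_absolute_error(actual, preds), delta_long_pp(actual, preds))
```

On this toy sample the single-stage stand-in never predicts a duration at or above 720 days, so it underestimates the long share regardless of its average error. That is the kind of threshold-level failure a mean-error metric alone does not expose.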
Finally, the predictive results are transformed into early-warning risk indicators, which are embedded in a BPMN-based simulation implemented in Bizagi, enabling the exploration of operationally interpretable scenarios. The analysis highlights significant systemic effects associated with high-risk temporal regimes, including a sharp decline in completion rates and a substantial increase in overall process cycle time. Overall, the thesis reframes duration prediction from a point-estimation mechanism into a functional early-warning tool, supporting evidence-based decision-making in public administration and business process management contexts.


