Comparison of forecasting approaches and algorithms for carbon monoxide

Keywords
Forecasting; Air quality; Single step; Time series; Multi step
Abstract
This MSc dissertation applies forecasting algorithms to air quality data. Specifically, it uses the “UCI Air Quality” dataset, available from the UCI repository, and focuses on forecasting carbon monoxide (CO). Air quality data are difficult to predict, as they often exhibit unstable variability and extreme values at irregular intervals. In addition, this dataset contains a significant number of missing values, so appropriate preprocessing is required before forecasting algorithms can be applied. For this reason, particular emphasis is placed on detecting and handling missing values by applying and evaluating different imputation techniques. Beyond this, the study presents four levels of comparison among the forecasting algorithms applied: statistical models (SARMA, SARIMA) versus machine learning models (Random Forest); “multi-step” versus “single-step” forecasting approaches for these models, which in turn leads to a comparison of “short-term” and “medium-term” forecasting horizons; and “univariate” versus “multivariate” modeling approaches. The forecasting horizon is one week: “single-step” approaches predict one hour ahead iteratively over the one-week period, while “multi-step” approaches directly predict the values for the entire one-week horizon.
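The difference between the two forecasting approaches can be illustrated with a minimal sketch. It is not the dissertation's implementation: the seasonal-naive forecaster, the function names, and the assumption that each single-step forecast sees the true observations up to the previous hour are all illustrative choices. In practice a SARIMA or Random Forest model would take the place of `seasonal_naive`.

```python
from typing import Callable, List, Sequence

# Hypothetical forecaster type: given the history and a horizon,
# return `horizon` predicted values.
Forecaster = Callable[[Sequence[float], int], List[float]]


def seasonal_naive(history: Sequence[float], horizon: int, season: int = 24) -> List[float]:
    """Stand-in model: repeat the last `season` observed values.

    Requires len(history) >= season.
    """
    start = len(history) - season
    return [history[start + (h % season)] for h in range(horizon)]


def single_step_forecasts(train: Sequence[float], test: Sequence[float],
                          forecast: Forecaster) -> List[float]:
    """Predict one hour ahead, then reveal the true value and repeat.

    Assumption: the history is extended with actual observations,
    not with the model's own predictions.
    """
    history = list(train)
    predictions = []
    for actual in test:
        predictions.append(forecast(history, 1)[0])
        history.append(actual)
    return predictions


def multi_step_forecasts(train: Sequence[float], horizon: int,
                         forecast: Forecaster) -> List[float]:
    """Predict the whole horizon in one call, using only the training data."""
    return forecast(list(train), horizon)


# Toy data: one day of hourly training values, two days of test values.
train = [float(i) for i in range(24)]
test = [float(i) + 1.0 for i in range(48)]

single = single_step_forecasts(train, test, seasonal_naive)
multi = multi_step_forecasts(train, len(test), seasonal_naive)
```

With a one-week horizon of hourly data, `len(test)` would be 168. The single-step loop can correct itself every hour because it sees fresh observations, whereas the multi-step call has to commit to all 168 values in advance.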
Model performance is evaluated primarily using the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). In general, the results confirm that single-step approaches outperform multi-step approaches. More specifically, Random Forest models produce the lowest errors. The univariate Random Forest model with a single-step approach achieves the best performance, with an MAE of 0.34 and an RMSE of 0.53.
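For reference, the two error metrics can be computed as in the sketch below. The function names and the toy values are illustrative only and are not taken from the dissertation's data.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error: square root of the mean squared error."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Toy example (made-up numbers).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.0, 4.5]

mae_value = mae(actual, predicted)    # absolute errors 0.5, 0, 1, 0.5
rmse_value = rmse(actual, predicted)  # squared errors 0.25, 0, 1, 0.25
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and penalizes large individual errors more heavily, which is consistent with the reported RMSE (0.53) exceeding the reported MAE (0.34).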


