Predictive analysis of accommodation and transportation costs and clustering of travelers using machine learning on a limited dataset

Keywords
Tourism ; Machine learning ; Cost prediction ; Clustering ; Random Forest ; CatBoost ; K-Modes
Abstract
Tourism is one of the most dynamic sectors of the global economy, with a decisive impact on development,
employment, and strategic planning. The increasing complexity of demand makes it necessary to utilize
modern computational methods for forecasting and understanding critical parameters. In this context,
Machine Learning offers capabilities for cost prediction, pattern detection, and the support of personalized
services.
This study is based on a small dataset (137 samples) from Kaggle’s “Traveler details dataset,” which
includes information such as destination, trip duration, traveler nationality, accommodation type, means
of transportation, and related costs. The limited size of the dataset presents a challenge, requiring careful
processing in order to extract reliable conclusions.
The research focuses on predicting accommodation and transportation costs and clustering travelers
based on demographic and travel-related characteristics. For cost prediction, the Random Forest Regressor
and CatBoost Regressor algorithms were applied, while for clustering the K-Modes technique was used.
The regression models achieved satisfactory accuracy in estimating accommodation and transportation
costs, while clustering revealed distinct traveler profiles based on nationality, age, accommodation type,
and travel seasonality.
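
To make the clustering step concrete, the following is a minimal, illustrative sketch of the K-Modes idea in plain Python: records are assigned to the nearest centre by Hamming distance (the number of mismatching categorical attributes), and each centre is then updated to the per-attribute mode of its members. It is not the implementation used in the study. The toy records, the function names, and the simple "first k distinct records" seeding are assumptions made here for illustration only.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent value of each attribute across the given records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, max_iter=20):
    """Tiny K-Modes loop: assign by Hamming distance, update centres to modes."""
    # Assumption: seed the centres with the first k distinct records.
    # Full implementations typically use more careful initialisations.
    centres = list(dict.fromkeys(records))[:k]
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centres)), key=lambda j: hamming(r, centres[j]))
            for r in records
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(len(centres)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = column_modes(members)
    return labels, centres


# Hypothetical toy records: (nationality, accommodation type, season).
# These values are invented for illustration and are not taken from the dataset.
travelers = [
    ("American", "Hotel", "Summer"),
    ("Korean", "Hostel", "Winter"),
    ("American", "Hotel", "Spring"),
    ("Korean", "Hostel", "Autumn"),
    ("American", "Resort", "Summer"),
    ("Korean", "Airbnb", "Winter"),
]

labels, centres = k_modes(travelers, k=2)
for label, centre in enumerate(centres):
    print(label, centre)
```

Each resulting centre can be read as a traveler profile, since it lists the most common value of every attribute within its cluster. Numeric attributes such as age would need to be binned into categories before being used this way.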
The findings indicate that even small datasets can enable Machine Learning to provide valuable insights
into traveler behavior. Such insights can support the design of targeted marketing strategies and dynamic
pricing, enhancing the ability of tourism enterprises to understand demand and improve visitor
experience. Finally, the study highlights the need for larger and more representative datasets to further
strengthen the reliability and practical applicability of the models in the tourism sector.

