Fraud detection in car insurance using unsupervised machine learning
Ανίχνευση απάτης στις ασφάλειες αυτοκινήτου με χρήση μη-εποπτευόμενης μηχανικής μάθησης
Master Thesis
Author
Michelakis, Charalampos - Panagiotis
Μιχελάκης, Χαράλαμπος - Παναγιώτης
Date
2024-06
Advisor
Bersimis, Sotiris
Μπερσίμης, Σωτήριος
Keywords
Unsupervised machine learning; Fraud detection; Auto insurance
Abstract
The detection of fraud in automobile insurance has significant economic and ethical implications. Studies suggest that fraudulent claims account for 10%-20% of all automobile insurance claims submitted in Central and Eastern Europe. We explore the possibilities of leveraging unsupervised machine learning methods to tackle this problem. This research area remains relatively unexplored within the insurance fraud detection literature, which predominantly focuses on a limited set of unsupervised machine learning methods. Our work takes a much broader approach to the choice of methods, drawing inspiration from the more general and rapidly evolving domain of anomaly/outlier detection. The methods are evaluated by means of a simulation study, because the scarcity of publicly available real-world data sets, a consequence of their confidential nature, poses a significant challenge to research on automobile insurance fraud. The simulation study is our way of circumventing this “roadblock”. Our simulated data sets are the outcome of a “synthetic reconstruction” of a real-world data set, which is used as a “seed” for generating typical (non-fraudulent) data samples; these are then augmented with several different types of parametrically created synthetic outliers. The culmination of our work is a performance comparison of almost thirty different outlier detection algorithms across five different synthetic outlier scenarios, which could provide new insights for combating fraud in automobile insurance using unsupervised machine learning.
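The evaluation loop described in the abstract (generate typical samples, inject parametric synthetic outliers, score every point without labels, then compare detectors per scenario) can be illustrated with a minimal sketch. This is not the thesis code. The two-feature inlier generator, the ring-shaped outlier scenario with its `radius` parameter, the two toy detectors (k-nearest-neighbour distance and maximum absolute z-score), and all function names are assumptions chosen only for illustration; the thesis compares almost thirty algorithms on data derived from a real-world seed data set.

```python
import math
import random


def make_inliers(n, rng):
    # Hypothetical stand-in for "typical" claims: two correlated numeric features.
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 0.8 * x + rng.gauss(0.0, 0.6)
        rows.append((x, y))
    return rows


def make_outliers(n, rng, radius):
    # One assumed parametric scenario: points scattered on a noisy ring
    # of the given radius around the inlier cloud.
    rows = []
    for _ in range(n):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        rows.append((radius * math.cos(angle) + rng.gauss(0.0, 0.3),
                     radius * math.sin(angle) + rng.gauss(0.0, 0.3)))
    return rows


def knn_scores(data, k=5):
    # Anomaly score = distance to the k-th nearest neighbour (larger = more anomalous).
    scores = []
    for i, p in enumerate(data):
        dists = sorted(math.dist(p, q) for j, q in enumerate(data) if j != i)
        scores.append(dists[k - 1])
    return scores


def zscore_scores(data):
    # Anomaly score = largest absolute per-feature z-score.
    cols = list(zip(*data))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [max(abs(v - m) / s for v, m, s in zip(p, means, stds)) for p in data]


def roc_auc(labels, scores):
    # Probability that a random outlier outscores a random inlier (ties count half).
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def run_scenario(radius, seed=0, n_in=300, n_out=15):
    # Labels are used only for evaluation; the detectors never see them.
    rng = random.Random(seed)
    data = make_inliers(n_in, rng) + make_outliers(n_out, rng, radius)
    labels = [0] * n_in + [1] * n_out
    return {"knn": roc_auc(labels, knn_scores(data)),
            "zscore": roc_auc(labels, zscore_scores(data))}


# Compare both detectors across two outlier scenarios of differing difficulty.
results = {radius: run_scenario(radius) for radius in (2.0, 4.0)}
for radius, aucs in results.items():
    print(radius, aucs)
```

The point of the sketch is the design: because the outliers are synthetic, their labels are known and can be held back for scoring, which is what makes a quantitative comparison of unsupervised detectors possible when labelled real-world fraud data is unavailable.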