Ανίχνευση επιθέσεων με αντίπαλη τεχνητή νοημοσύνη
Adversarial AI attack detection

Keywords
Adversarial AI attack detection ; FGSM ; PGD ; Deep learning detectors ; ANN CNN RNN detectors ; Unknown adversarial attacks ; Semi-supervised learning ; Lifelong learning ; Image-based adversarial attacks

Abstract
This thesis investigates the detection of adversarial attacks against deep learning systems used for image classification. Two datasets were examined: the LISA Traffic Light Dataset (traffic light recognition) and the FruitNet: Indian Fruits Quality Dataset (fruit quality recognition), allowing model performance to be evaluated both in-distribution and out-of-distribution. First, ANN, CNN, and RNN classifiers were trained, and FGSM and PGD adversarial attacks were then generated against them. From these data, adversarial-input detectors with corresponding architectures were developed, along with a semi-supervised and a lifelong-learning CNN-based detector. Evaluations were conducted on clean and perturbed images, as well as on unknown attacks. The results showed that CNN models were the most stable and accurate, both as classifiers and as detectors, maintaining high performance across both datasets. The semi-supervised detector improved generalization without additional labeling, while the lifelong-learning detector achieved near-complete detection of unknown attacks, at the cost of only a small increase in false alarms. The consistency of the findings across the two datasets suggests that adversarial patterns share common characteristics regardless of domain, and that the proposed detectors can operate reliably in changing environments.
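For readers unfamiliar with FGSM, the attack mentioned above perturbs each input pixel by a small step ε in the direction of the sign of the loss gradient with respect to the input. The following is a minimal, hedged sketch of that idea using a toy logistic-regression "model" in NumPy (the thesis itself attacks ANN/CNN/RNN classifiers; the model, function name, and parameters here are illustrative assumptions, not the author's code):

```python
import numpy as np

def fgsm_attack(x, w, b, y, eps):
    """One FGSM step against a toy logistic-regression model (hypothetical
    stand-in for the thesis classifiers). For binary cross-entropy loss,
    the gradient of the loss w.r.t. the input x is (sigmoid(w.x + b) - y) * w.
    """
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # model's predicted probability
    grad_x = (p - y) * w                     # dLoss/dx for BCE loss
    x_adv = x + eps * np.sign(grad_x)        # FGSM: step in sign of gradient
    return np.clip(x_adv, 0.0, 1.0)          # keep perturbed pixels in [0, 1]

# Example: a uniform "image" of 4 pixels, true label y = 1
x = np.full(4, 0.5)
w = np.array([1.0, -1.0, 1.0, -1.0])
x_adv = fgsm_attack(x, w, b=0.0, y=1.0, eps=0.1)
```

PGD, the other attack evaluated in the thesis, can be viewed as this same step applied iteratively with a smaller step size, projecting `x_adv` back into the ε-ball around `x` after each iteration.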


