AI adversarial attack detection and mitigation for AI-based systems

Ziras, Georgios; Ζήρας, Γεώργιος

dc.contributor.advisor	Xenakis, Christos
dc.contributor.advisor	Ξενάκης, Χρήστος
dc.contributor.author	Ziras, Georgios
dc.contributor.author	Ζήρας, Γεώργιος
dc.date.accessioned	2025-04-08T08:42:22Z
dc.date.available	2025-04-08T08:42:22Z
dc.date.issued	2025-04-04
dc.identifier.uri	https://dione.lib.unipi.gr/xmlui/handle/unipi/17622
dc.description.abstract	The growing integration of Artificial Intelligence (AI) systems into critical infrastructure such as cybersecurity, healthcare, finance, and national defense has raised significant challenges in ensuring model robustness against adversarial attacks. This thesis examines the vulnerability of various machine learning (ML) models to such attacks and investigates effective detection and mitigation techniques to strengthen their robustness. Using the CIC-IDS2017 dataset, several ML models were trained, including Decision Tree, Random Forest, Logistic Regression, XGBoost, and a neural network implemented in PyTorch, and were subjected to a set of adversarial attacks such as FGSM, PGD, DeepFool, Decision Tree Attack, and Carlini-Wagner. A central focus of the study is the evaluation of both direct and transfer attacks, revealing that traditional models suffered significant performance degradation, while deep neural networks showed greater resilience. To improve robustness, adversarial training was applied, which led to a significant increase in model accuracy under attack, with the PyTorch model retaining accuracy above 98% in most cases. In addition, advanced detection techniques were integrated using the Adversarial Robustness Toolbox (ART), such as the Binary Input and Binary Activation Detectors. These detectors showed high recall and precision in identifying adversarial inputs, although their moderate performance on clean samples indicates a trade-off between security and usability. The implementation of a two-layer detection architecture demonstrates a practical defense-in-depth approach, capable of blocking or flagging dangerous inputs before they reach the classifier. This work offers a comprehensive analysis of model robustness against adversarial attacks in the field of intrusion detection systems and proposes a scalable architecture that combines adversarial training with real-time detection. Future work can focus on improving detection accuracy for clean samples, incorporating more diverse datasets, and developing adaptive defense mechanisms to counter evolving attacks.	el
dc.format.extent	81	el
dc.language.iso	en	el
dc.publisher	University of Piraeus	el
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Greece	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/gr/	*
dc.title	AI adversarial attack detection and mitigation for AI-based systems	el
dc.type	Master Thesis	el
dc.contributor.department	School of Information and Communication Technologies. Department of Digital Systems	el
dc.description.abstractEN	The increasing integration of Artificial Intelligence (AI) systems into critical infrastructure such as cybersecurity, healthcare, and finance has introduced significant challenges in ensuring model robustness against adversarial attacks. This thesis investigates the susceptibility of various machine learning (ML) models to adversarial manipulations and explores effective detection and mitigation strategies to enhance resilience. Using the CIC-IDS2017 dataset, a suite of ML classifiers (Decision Tree, Random Forest, Logistic Regression, XGBoost, and a custom PyTorch-based neural network) was trained and subjected to a range of adversarial evasion attacks including FGSM, PGD, DeepFool, and Carlini-Wagner. A key focus of this research is the evaluation of both direct and transfer adversarial attacks, revealing that while traditional models suffered severe performance degradation, deep learning models exhibited stronger resilience. To improve robustness, adversarial training was employed, significantly enhancing model accuracy under attack, particularly for the PyTorch model, which retained over 98% accuracy in most cases. Furthermore, this study integrates advanced detection mechanisms using the Adversarial Robustness Toolbox (ART), including Binary Input and Binary Activation Detectors. These detectors demonstrated high recall and precision in identifying adversarial inputs, although moderate performance on clean samples suggests a trade-off between security and usability. The implementation of a dual-layer detection pipeline within a machine learning system illustrates a practical defense-in-depth approach, capable of blocking or flagging adversarial inputs before they reach the core classifier. This research contributes a comprehensive analysis of adversarial attack resilience in intrusion detection systems and proposes a scalable architecture for integrating robust training and real-time adversarial detection. Future work will focus on enhancing detection precision for clean samples, incorporating more diverse datasets, and exploring adaptive defenses to counter evolving attack strategies.	el
dc.contributor.master	Digital Systems Security	el
dc.subject.keyword	Artificial intelligence	el
dc.subject.keyword	Adversarial attacks	el
dc.subject.keyword	AI-based systems	el
dc.subject.keyword	Mitigation of adversarial attacks	el
dc.subject.keyword	Detection of adversarial attacks	el
dc.date.defense	2025-04-07

Files in this item

Name:: AI Adversarial Attack Detection ...
Size:: 1.436Mb
Type:: PDF
Description:: Master thesis

This item appears in the following collections

Department of Digital Systems

Except where otherwise noted, this item is distributed under the following license:
Attribution-NonCommercial-NoDerivs 3.0 Greece