AI adversarial attacks
Keywords
Adversarial attacks; AI models; Machine learning; Datasets; Extraction attacks; Poisoning attacks; Evasion attacks; Inference attacks

Abstract
Machine learning models, and in particular deep neural networks, are now widely deployed in applications that demand high levels of accuracy and reliability. However, over the past decade, researchers have shown that these systems are not inherently robust: they are vulnerable to adversarial interventions that can manipulate their behavior in subtle but highly effective ways. This growing body of research, known as adversarial machine learning, has revealed a broad range of attack strategies that call into question both the trustworthiness and the security of modern AI systems.
This thesis presents a comprehensive exploration of adversarial threats by examining forty distinct attack methods that cover the principal categories in the field: model extraction, data poisoning and backdoors, evasion through adversarial examples, and privacy or inference attacks. The study systematically evaluates the impact of these attacks on neural networks and goes beyond traditional accuracy measurements by employing richer indicators of performance, including precision, recall, F1-score, attack success rate, and confusion matrix analysis.
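To make the evasion category and the attack success rate metric concrete, the sketch below shows a minimal Fast Gradient Sign Method (FGSM) perturbation and a success-rate computation. It is an illustrative assumption, not code from the thesis: the model, the epsilon value, and the tensors are placeholders for whatever classifier and dataset are under evaluation.

```python
# Minimal FGSM evasion sketch with an attack-success-rate metric.
# Assumes a PyTorch classifier `model`, image tensors in [0, 1], and labels.
import torch
import torch.nn.functional as F


def fgsm_attack(model, images, labels, epsilon=0.03):
    """Perturb images one signed-gradient step in the loss-increasing direction."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv_images = images + epsilon * images.grad.sign()
    # Clamp back to the valid pixel range so the perturbation stays plausible.
    return adv_images.clamp(0.0, 1.0).detach()


def attack_success_rate(model, images, adv_images, labels):
    """Fraction of originally correct predictions that the attack flips."""
    with torch.no_grad():
        clean_pred = model(images).argmax(dim=1)
        adv_pred = model(adv_images).argmax(dim=1)
    correct = clean_pred == labels
    flipped = correct & (adv_pred != labels)
    return flipped.sum().item() / max(correct.sum().item(), 1)
```

Complementary metrics such as precision, recall, F1-score, and the confusion matrix can then be computed on the adversarial predictions with any standard evaluation library and compared against the clean baseline.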
The results highlight not only the diversity of adversarial techniques but also the common vulnerabilities they exploit. By presenting these attacks within a unified and reproducible framework, the thesis provides both a benchmark for future research and a practical resource for those seeking to understand the risks posed by adversarial machine learning. In doing so, it contributes to a more comprehensive view of the challenges involved in building secure and trustworthy AI systems.


