Comparative study of generative adversarial networks for class balancing in medical images

Keywords
Artificial intelligence ; Adversarial learning ; Deep learning ; Data augmentation ; Generative adversarial networks ; Alzheimer's
Abstract
Machine learning has advanced rapidly in recent years, and techniques such as deep learning and computer vision are now applied to problems across diverse fields. Data collection remains a significant challenge in many sectors, because obtaining reliable and sufficient data demands time and effort from domain experts. In medicine in particular, health-related data from negative samples can be easier to collect than data from positive samples, especially for rare diseases. The resulting class imbalance in the dataset complicates the use of machine learning techniques.
Various techniques, such as data augmentation, have been proposed to address class imbalance. This thesis proposes tackling the problem with Generative Adversarial Networks (GANs). GANs are machine learning models that generate realistic artificial images. They consist of two neural networks, the Generator and the Discriminator. The Generator aims to produce artificial images that resemble real ones, while the Discriminator tries to distinguish artificial images from real ones. The two networks are trained in competition with each other and improve iteratively until the artificial images closely resemble real ones.
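For reference, the competition between the two networks is commonly written as a minimax game. The block below shows the original GAN objective only as a point of orientation; the architectures tested in the thesis may use different loss functions. The symbols are not taken from the abstract: $G$ is the Generator, $D$ the Discriminator, $p_{\text{data}}$ the distribution of real images, and $p_z$ the noise distribution from which the Generator's input $z$ is drawn.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The Discriminator maximises $V$ by assigning high probability to real images and low probability to generated ones, while the Generator minimises $V$ by producing images that the Discriminator scores as real.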
The study uses the OASIS-3 dataset, which includes Magnetic Resonance Imaging (MRI) brain images from healthy individuals and Alzheimer's patients. The experimental process begins with data acquisition and processing. Various GAN architectures are then tested for generating artificial patient images, with the goal of selecting the most suitable ones. Artificial images generated by the selected architectures are then added to the original dataset in five scenarios that differ in the percentage of artificial images inserted. Finally, a ResNet18 classifier is trained on each enriched dataset, and the results are compared with those obtained on the original dataset.
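The enrichment step can be illustrated with a minimal sketch. It assumes that the insertion percentage is measured relative to the number of real minority-class images, which the abstract does not specify. The function name `augment_minority`, the placeholder file names, and the example fractions are all hypothetical and are not the thesis's actual scenarios.

```python
import random


def augment_minority(real_minority, synthetic_pool, fraction, seed=0):
    """Return the real minority samples plus a random subset of synthetic ones.

    Assumption: `fraction` is the number of synthetic samples to add,
    expressed relative to the number of real minority samples.
    """
    if fraction < 0:
        raise ValueError("fraction must be non-negative")
    n_synthetic = round(len(real_minority) * fraction)
    if n_synthetic > len(synthetic_pool):
        raise ValueError("not enough synthetic samples in the pool")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    chosen = rng.sample(synthetic_pool, n_synthetic)
    return list(real_minority) + chosen


# Hypothetical placeholders standing in for image paths.
real_ad = [f"real_ad_{i}" for i in range(50)]
synthetic_ad = [f"gan_ad_{i}" for i in range(100)]

# Illustrative fractions only; the five scenarios in the thesis are not listed here.
for fraction in (0.1, 0.2, 0.5):
    enriched = augment_minority(real_ad, synthetic_ad, fraction)
    print(fraction, len(enriched))
```

Each enriched list would then replace the minority class in the training set before the classifier is trained, while the evaluation set stays limited to real images.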
The results show that adding artificial images has only a minor effect on the classifier's performance: the metrics improve minimally when a small percentage of artificial images (10%-20%) is added to the class of Alzheimer's disease (AD) patients. At higher percentages, the metrics deteriorate gradually and the classifier fails to identify instances of the minority class.
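A small sketch shows why a classifier that misses the minority class has to be judged by per-class metrics rather than accuracy alone. The labels below are invented for illustration and are not results from the thesis; the function name `accuracy_and_recall` is hypothetical.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Compute overall accuracy and recall for the `positive` class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Invented imbalanced labels: 90 majority (0) and 10 minority (1) samples.
y_true = [0] * 90 + [1] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, minority recall={recall:.2f}")
```

In this invented case the accuracy is 0.90 even though no minority sample is detected, which is why recall on the minority class is the more informative figure on imbalanced data.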