Adaptation and evaluation of a CLIP model on medical data for dermatological diagnosis

Keywords
CLIP model; Natural language processing; Artificial intelligence; Medical data

Abstract
This thesis focuses on the development and evaluation of a CLIP (Contrastive
Language-Image Pre-training) model adapted to dermatological medical data, aiming to
enable automatic classification and diagnosis of medical images based on their textual
descriptions. The ultimate goal is to create a system capable of processing a wide range
of medical images and assisting in the diagnostic process through the alignment of
images with relevant medical text.
Two datasets were used: ISIC, where the low proportion of pathological cases and
limited descriptive diversity hindered the model's ability to learn effectively, and SkinCAP,
where better results were achieved. For the SkinCAP dataset, the text descriptions were
condensed using the T5-small model, and a variety of image and text processing
algorithms were tested. With the learning rate set to 10⁻⁶ and the temperature to 0.06, the
model demonstrated improved performance after 25 training epochs, achieving
satisfactory alignment between images and accurate diagnoses, with small deviations at
the level of detailed description.
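
To illustrate the role of the temperature mentioned above, the following is a minimal sketch of the symmetric contrastive objective commonly used to train CLIP-style models. It is not the thesis code: the function name `clip_contrastive_loss`, the NumPy implementation, and the random embeddings are assumptions made for illustration. Only the temperature value of 0.06 comes from the abstract; the learning rate of 10⁻⁶ would be passed to the optimizer, which is not shown here.

```python
import numpy as np


def clip_contrastive_loss(image_emb, text_emb, temperature=0.06):
    """Symmetric cross-entropy over cosine similarities of N matched pairs.

    Row i of image_emb is assumed to be paired with row i of text_emb.
    """
    # L2-normalise so that the dot product equals cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # (N, N) similarity matrix; a smaller temperature sharpens the softmax.
    logits = image_emb @ text_emb.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The correct match for row i is column i.
        return -np.mean(np.diag(log_probs))

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


# Hypothetical embeddings: 4 pairs in an 8-dimensional space.
rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))

aligned = clip_contrastive_loss(img, img)         # every pair matches
shuffled = clip_contrastive_loss(img, img[::-1])  # every pair is mismatched
```

In this sketch the loss is lower when each image embedding is closest to its own text embedding than when the pairs are shuffled, which is the behaviour that training is meant to encourage.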
The results indicate that CLIP can be successfully adapted to medical data, offering a
promising approach to enhancing computer-assisted diagnostic systems. This work
highlights the importance of combining visual and textual information in the field of
medical informatics.

