dc.contributor.advisor: Ξενάκης, Χρήστος
dc.contributor.advisor: Xenakis, Christos
dc.contributor.author: Κουτρουμπούχος, Κωνσταντίνος
dc.contributor.author: Koutroumpouchos, Konstantinos
dc.date.accessioned: 2020-02-11T12:43:42Z
dc.date.available: 2020-02-11T12:43:42Z
dc.date.issued: 2020-02
dc.identifier.uri: https://dione.lib.unipi.gr/xmlui/handle/unipi/12613
dc.identifier.uri: http://dx.doi.org/10.26267/unipi_dione/36
dc.format.extent: 135 [el]
dc.language.iso: en [el]
dc.publisher: University of Piraeus [el]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.title: Hybrid machine learning architecture for phishing email classification [el]
dc.type: Master Thesis [el]
dc.contributor.department: School of Information and Communication Technologies. Department of Digital Systems [el]
dc.description.abstractEN: As internet usage has grown worldwide, phishing incidents have increased along with it. One common way of initiating a phishing attack is to send phishing emails. This thesis presents an attempt at a novel technique for detecting phishing emails, based on machine learning algorithms. To design a machine learning model that correctly identifies emails as phishing or benign, the following directions were studied: (a) the most widely used machine learning algorithms, with their representative parameters, advantages and disadvantages; (b) an algorithm for testing different combinations of machine learning feature inputs; (c) different performance evaluation metrics for the resulting models, and an algorithm for computing them; and (d) the structure of emails, their characteristics, and anything else that can serve as a feature input to help the models detect the pattern of phishing emails. To that end, two fundamental feature categories were created: (i) properties-based features, which are derived from characteristics of the email such as the number of URLs or attachments it contains, and (ii) text-based features, which are derived from the text part of the email that is meant to be read by the recipient. With the increase in computer processing power, efficient machine learning solutions are being developed for a growing number of problems. Although there is related work on phishing email classification using machine learning techniques, this work proposes a novel hybrid technique that uses both of the aforementioned feature types.
Two architectures were proposed: (a) a hybrid “assembled” architecture, which combines the properties-based and text-based features into a single feature vector that is then used as input to the classification process, and (b) a hybrid “stacked” architecture, which has two classifiers: the first classifies the email using the text-based features, and the second, which outputs the final classification, uses the properties-based features plus one additional feature, namely the first classifier's output for the email being classified. The most widely used tools for developing machine learning programs were explored, and Apache Spark with its MLlib library was chosen. All the algorithms were written in Python. After developing an algorithm that tests every combination of classifier, hyperparameter values and architecture, it was found that, compared with classification using only one type of feature, the hybrid “assembled” architecture performs better, whereas the hybrid “stacked” architecture performs slightly worse. [el]
dc.contributor.master: Digital Systems Security [el]
dc.subject.keyword: Machine learning [el]
dc.subject.keyword: Classification [el]
dc.subject.keyword: Phishing [el]
dc.subject.keyword: Email [el]
dc.date.defense: 2020-02-04
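
The two hybrid architectures described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the thesis reports using Apache Spark with MLlib, whereas the sketch below uses only the Python standard library so that it runs anywhere. The toy emails, the vocabulary, the `NearestCentroid` stand-in classifier, and the helper names `properties_features`, `text_features`, `train_assembled` and `train_stacked` are all assumptions introduced for illustration.

```python
from collections import Counter

# Toy data (hypothetical, illustration only): label 1 = phishing, 0 = benign.
EMAILS = [
    {"text": "verify your account password now", "n_urls": 3, "n_attachments": 0, "label": 1},
    {"text": "urgent verify password click link", "n_urls": 4, "n_attachments": 1, "label": 1},
    {"text": "meeting agenda for monday attached", "n_urls": 0, "n_attachments": 1, "label": 0},
    {"text": "lunch on friday with the team", "n_urls": 0, "n_attachments": 0, "label": 0},
]
VOCAB = ["verify", "password", "urgent", "meeting", "lunch", "team"]  # assumed vocabulary


def properties_features(email):
    """Properties-based features: counts describing the email itself."""
    return [float(email["n_urls"]), float(email["n_attachments"])]


def text_features(email):
    """Text-based features: bag-of-words counts over the fixed vocabulary."""
    counts = Counter(email["text"].lower().split())
    return [float(counts[word]) for word in VOCAB]


class NearestCentroid:
    """Stand-in classifier: predicts the class whose mean vector is closest."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, target in zip(X, y) if target == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, x):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def train_assembled(emails):
    """'Assembled': one classifier over the concatenated feature vector."""
    def assemble(email):
        return properties_features(email) + text_features(email)

    labels = [e["label"] for e in emails]
    clf = NearestCentroid().fit([assemble(e) for e in emails], labels)
    return lambda email: clf.predict(assemble(email))


def train_stacked(emails):
    """'Stacked': a text classifier whose output feeds a second classifier."""
    labels = [e["label"] for e in emails]
    text_clf = NearestCentroid().fit([text_features(e) for e in emails], labels)

    def second_stage_input(email):
        # Properties-based features plus the first classifier's prediction.
        first_output = float(text_clf.predict(text_features(email)))
        return properties_features(email) + [first_output]

    final_clf = NearestCentroid().fit([second_stage_input(e) for e in emails], labels)
    return lambda email: final_clf.predict(second_stage_input(email))
```

Each training function returns a callable that maps an email dictionary to a predicted label, so the two architectures can be compared on the same input, for example `train_assembled(EMAILS)(new_email)` against `train_stacked(EMAILS)(new_email)`. The sketch only shows how features flow through each architecture; it says nothing about the relative performance reported in the thesis.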

Except where otherwise noted, this item is distributed under the following license:
Attribution-NonCommercial-NoDerivatives 4.0 International