Ανακάλυψη ψευδών (μποτ/σπαμ) ιστοσελίδων
On fake (bot/spam) website detection

Keywords
Malicious URL detection ; Machine learning ; Deep learning ; Fake website attacks ; Website classification
Abstract
This thesis examines several types of fake websites, the structure of URLs,
existing detection techniques, and the use of algorithms to detect fake (bot/spam) websites,
a significant problem in cybersecurity. The internet has become an integral part of
modern life, facilitating communication, commerce, education and entertainment. However, this
interconnected digital world is also vulnerable to exploitation by malicious actors. Fake websites
(bot/spam websites) represent a persistent and evolving threat. These websites can mimic
legitimate platforms, distribute malware and engage in other illegal activities. They take
several forms, including phishing sites, malware distribution pages, and spam websites
created solely to mislead search engines and users. Detecting these malicious websites is a
critical challenge in protecting individuals, businesses and organizations. Because URLs are
text strings, their classification can be treated as a Natural Language Processing (NLP)
problem, so Machine Learning and Deep Learning techniques such as Random Forest and LSTM
can analyze website patterns and characteristics to classify them more efficiently. In this
thesis, various algorithms and hybrid forms of them (such as CNN-LSTM) are trained on a
dataset of URL-based features of dangerous and legitimate websites.
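
To make "URL-based features" concrete, the following is a minimal sketch of how a few lexical features could be computed from a URL string before it is passed to a classifier. The function name `url_features` and the particular features chosen are illustrative assumptions, not the feature set used in the thesis.

```python
from urllib.parse import urlparse
import ipaddress


def url_features(url: str) -> dict:
    """Compute a few simple lexical features from a URL string.

    The feature list is a hypothetical example, not the thesis's own.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a commonly cited warning sign.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
        # Number of non-empty path segments, e.g. /login/verify has depth 2.
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }
```

A dictionary like this, computed for every URL in a labeled dataset, could form one row of the feature table that a model such as Random Forest is trained on. Sequence models such as LSTM or CNN-LSTM are more often fed the URL as a character or token sequence instead.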


