Clustering algorithm selection by meta-learning

Παναγιωτόπουλος, Γεώργιος

Master Thesis

Author

Παναγιωτόπουλος, Γεώργιος

Date

2020-02

Abstract

Data clustering groups the objects of a dataset according to the similarities between them. Because clustering is unsupervised, finding a good-quality solution can be a complex process. A wide range of clustering algorithms exists, and selecting the best one for a given problem can be slow and expensive: for each new dataset, a data scientist must first test every candidate algorithm to find the most suitable one. A system that recommends a clustering algorithm and guides the user toward the right choice would therefore be of significant benefit to the scientific community. Rice formulated the Algorithm Selection Problem (ASP) in 1976, which postulates that the performance of an algorithm can be predicted from the structural features of the problem. Meta-learning has been used successfully for such algorithm recommendation tasks. It uses machine learning to induce meta-models capable of predicting the best algorithm for a new dataset from its meta-attributes (meta-features). Experimental results show that the recommendation improves when these meta-attributes are used. They also indicate that a system can recommend a clustering algorithm for an “unknown” dataset with significant accuracy by first examining only its meta-attributes. Finally, this Master Thesis discusses the relevance of each meta-feature to the recommendation.
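
The meta-learning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the three meta-features, the 1-nearest-neighbour meta-model, the function names `meta_features` and `recommend`, and the labels `algorithm_A` and `algorithm_B` are all hypothetical choices made only for illustration. In practice, the labels in the meta-dataset would come from actually running and evaluating each clustering algorithm on each known dataset.

```python
import math
import statistics


def meta_features(dataset):
    """Describe a dataset (a list of equal-length numeric rows) by a few
    simple, illustrative meta-features (an assumption, not the thesis's set)."""
    n_samples = len(dataset)
    n_features = len(dataset[0])
    # Euclidean distances between every pair of objects.
    dists = [math.dist(a, b)
             for i, a in enumerate(dataset)
             for b in dataset[i + 1:]]
    mean_d = statistics.mean(dists)
    # Coefficient of variation of the distances: a rough hint at how
    # strongly the objects separate into groups.
    cv_d = statistics.pstdev(dists) / mean_d if mean_d > 0 else 0.0
    return [math.log10(n_samples), float(n_features), cv_d]


def recommend(meta_dataset, new_features):
    """1-nearest-neighbour meta-model: return the label attached to the
    known dataset whose meta-features are closest to the new ones."""
    _, best_label = min(
        meta_dataset,
        key=lambda entry: math.dist(entry[0], new_features),
    )
    return best_label


# Hypothetical meta-dataset: each entry pairs the meta-features of a known
# dataset with a placeholder label for the algorithm assumed to suit it best.
known = [
    (meta_features([[0, 0], [0, 1], [10, 10], [10, 11]]), "algorithm_A"),
    (meta_features([[0, 0], [1, 1], [2, 2], [3, 3]]), "algorithm_B"),
]

# An "unknown" dataset is characterised only by its meta-features.
new_dataset = [[0, 0], [1, 0], [20, 20], [21, 20]]
print(recommend(known, meta_features(new_dataset)))
```

The recommendation step never runs a clustering algorithm on the new dataset; it only compares meta-features, which is where the expected saving in time comes from. A real meta-model would likely use a richer set of meta-features and a stronger learner than nearest neighbour.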

Postgraduate Studies Programme

Information Systems and Services

Department

School of Information and Communication Technologies. Department of Digital Systems

Language

English

URI

https://dione.lib.unipi.gr/xmlui/handle/unipi/12630
http://dx.doi.org/10.26267/unipi_dione/53

Collections

Department of Digital Systems
