Link discovery on the web of data: time and space efficient large scale link discovery using string similarities

Καράμπελας, Ανδρέας; Karampelas, Andreas

Master Thesis

Author

Καράμπελας, Ανδρέας

Karampelas, Andreas

Date

2017-11-02

Abstract

This work proposes and evaluates a time- and space-efficient approach for computing links between a source dataset and a target dataset by exploiting string similarities among entities' properties. The approach builds on a basic indexing method that prunes dissimilar pairs and supports effective verification of candidate pairs. A blocking method organizes the target dataset so that it can be queried for matches to a specific string. A filtering stage applies three filters that leave a relatively small number of candidate strings requiring verification. Finally, each candidate string is verified with an optimized algorithm for computing the edit distance between two strings. Evaluation results show the time and space efficiency of the proposed method compared with state-of-the-art approaches for link discovery.
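
The filter-then-verify idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name its three filters or its optimized edit-distance algorithm, so the length filter, the banded dynamic program, the function name `within_edit_distance`, and the threshold parameter `k` below are all assumptions chosen as common techniques for thresholded string matching.

```python
def within_edit_distance(a: str, b: str, k: int) -> bool:
    """Return True if the Levenshtein distance between a and b is at most k.

    Illustrative only: the length filter and the banded verification are
    assumed techniques, not the ones described in the thesis.
    """
    # Length filter: every insertion or deletion changes the length by one,
    # so strings whose lengths differ by more than k cannot match.
    if abs(len(a) - len(b)) > k:
        return False

    inf = k + 1  # any value above k means "too far"; exact size is irrelevant
    # Row 0: distance from the empty prefix of a to each prefix of b.
    prev = [j if j <= k else inf for j in range(len(b) + 1)]

    for i in range(1, len(a) + 1):
        # Only cells with |i - j| <= k can hold a distance of at most k.
        lo = max(1, i - k)
        hi = min(len(b), i + k)
        curr = [inf] * (len(b) + 1)
        if i <= k:
            curr[0] = i
        for j in range(lo, hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j - 1] + cost,  # substitution or match
                          prev[j] + 1,         # deletion from a
                          curr[j - 1] + 1,     # insertion into a
                          inf)
        # Early termination: if the whole band exceeds k, no later row can recover.
        if min(curr[lo - 1:hi + 1]) > k:
            return False
        prev = curr

    return prev[len(b)] <= k
```

Under these assumptions, a candidate pair costs roughly O(k · min(|a|, |b|)) time to verify instead of O(|a| · |b|), and pairs rejected by the length filter cost nothing beyond a length comparison. That is the kind of saving a filtering stage aims for before verification.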

Postgraduate Program Title

Digital Systems and Services

Department

School of Information and Communication Technologies. Department of Digital Systems

Number of pages

Language

English

URI

https://dione.lib.unipi.gr/xmlui/handle/unipi/10681

Collection

Department of Digital Systems



Except where otherwise noted, this item is distributed under the following license:
Attribution-NonCommercial-NoDerivatives 4.0 International