Content-based information retrieval & anonymisation in data & multimedia streams
Keywords: Data anonymisation ; Privacy-preserving data publishing ; K-anonymity ; Km-anonymity ; Multimedia privacy ; Watermarking ; Mobile application security
Personal data is any information that can be used to identify a person. It can take many forms, such as a field in a database, a unique number, a photograph or a network packet. Personal data about individuals is collected and managed by organisations in both the private and the public sector, and businesses and the scientific community alike are hungry for it. Advanced data mining algorithms can reveal new knowledge from this data, so it cannot be disseminated carelessly: the danger of breaching individuals' privacy is always present. The scientific field of data anonymisation emerged to protect privacy, and several methods have been proposed to guarantee privacy in published datasets. Applying anonymisation algorithms involves a trade-off: the anonymisation process must balance the protection of individuals' privacy against the usefulness of the released dataset. This thesis analyses these anonymisation techniques and highlights their strengths and weaknesses. Its contribution to the field of data publishing is twofold. First, it introduces a new attack on anonymised data, called the inference of QIs (quasi-identifiers) attack, which shows that an automated anonymisation solution, especially for medical records, is difficult to achieve without taking into account the semantics of the data and without consulting domain experts. Second, it develops an algorithm that implements km-anonymisation by taking into account the properties of continuous attributes, without requiring a predefined generalisation hierarchy. Our experiments show that this algorithm preserves more information in the published dataset than other anonymisation algorithms that use generalisation hierarchy trees. Multimedia is another type of personal data examined in this research. Multimedia shared on Online Social Networks gives rise to multiple privacy risks.
An analytic survey of these risks is presented, and a solution based on digital watermarking is proposed to eliminate many of them. Smartphones' increasing computing capabilities, paired with sensors such as GPS, offer new opportunities to develop mobile applications with new potential. To demonstrate how a user's privacy can be breached, we focus on a privacy-sensitive category of apps: dating apps. The research is based on an analysis of the transmitted network packets, and the results are worrying.