Information retrieval and evaluation of the privacy risk on Twitter
Abstract
In recent years, a growing number of information retrieval applications have become available that aim to trace users’ online behavior and activities. Twitter, one of the most popular social networks, is a valuable source of information for such applications. Aggregated Twitter data can reveal a great deal about users’ interests and preferences. Correlating this information can produce comprehensive user profiles that disclose detailed personal information and thereby challenge users’ privacy. Extracted behavioral patterns can be valuable for developing personalization services, but inevitably at the expense of users’ privacy. Although a number of privacy-enhancing technologies strive to mitigate these concerns, significant gaps remain in the protection of users’ privacy. In addition, it is essential to provide a comprehensive view of the metrics used to quantify privacy. Most efforts to devise privacy metrics are quite limited, as they apply only to specific systems. The lack of suitable metrics hinders proper privacy evaluation. Therefore, even though the proposed approaches have made meaningful contributions to the challenging privacy landscape, there is still some ambiguity about their effectiveness and their adaptability to different contexts. In this work, we present an effort towards the construction of user profiles through the development of an information retrieval application. We also address the privacy issues related to user profiling, which arise because the personal information contained in user profiles is disclosed. The last part of this thesis addresses the quantification of user privacy by applying information-theoretic notions as measures of the privacy of user profiles.
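As a rough illustration of what an information-theoretic privacy measure over a user profile might look like, the sketch below models a profile as a relative-frequency histogram over interest categories and computes its Shannon entropy and its Kullback-Leibler divergence from a population profile. The abstract does not say which notions the thesis applies, so this modeling choice is an assumption, and the category labels and function names are hypothetical.

```python
import math
from collections import Counter


def profile_distribution(categories):
    """Turn a list of per-tweet category labels into a relative-frequency profile."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def shannon_entropy(p):
    """Entropy in bits; a lower value means interests concentrate on fewer categories."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def kl_divergence(p, q):
    """D(p || q) in bits; a higher value means the user stands out more from q."""
    total = 0.0
    for c, pc in p.items():
        if pc == 0:
            continue
        qc = q.get(c, 0.0)
        if qc == 0:
            return math.inf  # p puts mass where q has none
        total += pc * math.log2(pc / qc)
    return total


# Hypothetical data: four tweets labelled by topic, and a uniform population profile.
user = profile_distribution(["sports", "sports", "politics", "music"])
population = {"sports": 0.25, "politics": 0.25, "music": 0.25, "travel": 0.25}

user_entropy = shannon_entropy(user)           # 1.5 bits
user_kl = kl_divergence(user, population)      # 0.5 bits
```

Under this reading, a profile with low entropy or high divergence from the population would be easier to single out, which is one way such measures could serve as indicators of privacy risk.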