Online learning algorithms with application in ranked recommendations
Master Thesis
Author
Panagiotopoulou, Evgenia
Date
2019
Advisor
Telelis, Orestis
Keywords
Online learning; Multi-armed bandit problem; Contextual bandits; Ranked recommendations; Linear rewards; LinUCB; RBA; IBA
Abstract
In this work we study online learning and its application to ranked recommendation systems that use context. Modern platforms, websites, and applications create a growing need for recommendation systems that offer useful content suggestions. Online learning is well suited to this purpose: it can keep the customer, or user, satisfied while requiring minimal computational resources, it demands no training phase or historical data, and it adapts quickly to new data. Furthermore, by introducing relevant context (side information) into an online learning recommendation system, we can expect to produce content suggestions that are appealing and tailored to each user's needs and interests.
Specifically, over the course of this study we survey the literature on the multi-armed bandit (MAB) problem, contextual bandits, rankings of recommendations, and the corresponding algorithms. To delve deeper into online learning recommendations, we design and conduct experiments on artificially generated datasets of our own, using the algorithms we found most interesting. Our idea was to combine the recommendation meta-algorithms RBA and IBA with instances of LinUCB, a contextual algorithm with linear rewards. Accordingly, the two cases we compare are RBA-LinUCB, a single-click, diverse-rankings algorithm that has been tested experimentally before, and IBA-LinUCB, a multiple-clicks algorithm that, to our knowledge, is tested for the first time in this work.
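To make the building block concrete, the following is a minimal sketch of the disjoint-models variant of LinUCB: one ridge-regression estimate per arm, with an upper-confidence bonus scaled by an exploration parameter alpha. The class name, method names, and the shared-context simplification are illustrative choices, not the thesis's exact implementation.

```python
import numpy as np

class LinUCB:
    """Disjoint LinUCB sketch: one linear model per arm.

    For each arm a it maintains A_a = I + sum(x x^T) and b_a = sum(r x),
    estimates theta_a = A_a^{-1} b_a, and scores an arm by
    theta_a . x + alpha * sqrt(x^T A_a^{-1} x).
    """

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm design matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward vectors

    def select(self, context):
        # context: the round's feature vector x_t (shared by all arms here)
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b
            ucb = theta @ context + self.alpha * np.sqrt(context @ A_inv @ context)
            scores.append(ucb)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        # Rank-one update of the chosen arm's sufficient statistics
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

In the meta-algorithms studied here, one such LinUCB instance would drive each recommendation slot.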
Our experimental results show that RBA-LinUCB increasingly outperforms IBA-LinUCB as the standard deviation of the arm rewards (SDR) of the MAB grows: a higher SDR leads to a higher cumulative average regret for IBA-LinUCB, while RBA-LinUCB remains unaffected. From another viewpoint, however, IBA-LinUCB yields increasingly more clicks than RBA-LinUCB as the average rate of rewards (ARR) of the arms increases. Finally, by monitoring how the bandit instances in the recommendation slots learn, we find that the IBA-LinUCB slots learn much faster and more accurately than those of RBA-LinUCB. These observations suggest that IBA-LinUCB can be expected to offer more substantial results and yield more clicks than RBA-LinUCB, and thus constitutes a more effective solution for online contextual recommendation systems.
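The slot-learning behaviour discussed above can be illustrated with a sketch of one RBA round. Each ranking slot is driven by its own bandit; if a lower slot proposes an item already shown above, an arbitrary unused item is displayed instead, and a slot is credited only when its own (non-substituted) choice is the clicked item. The `SlotBandit` stub below (empirical means with a crude optimism bonus, context ignored) is a hypothetical stand-in for a per-slot LinUCB instance, not the thesis's implementation.

```python
import numpy as np

class SlotBandit:
    """Illustrative per-slot bandit stub: empirical means + optimism bonus."""

    def __init__(self, n_arms):
        self.n_arms = n_arms
        self.counts = np.zeros(n_arms)
        self.sums = np.zeros(n_arms)

    def select(self, context):
        means = self.sums / np.maximum(self.counts, 1)
        bonus = 1.0 / np.maximum(self.counts, 1)  # untried arms look attractive
        return int(np.argmax(means + bonus))

    def update(self, arm, context, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward

def rba_round(slot_bandits, context, clicked_item):
    """One round of the RBA meta-algorithm (sketch).

    Slot i's bandit proposes an item; duplicates of higher slots are
    replaced by an arbitrary unused item. A slot's bandit is rewarded
    only when its own, non-substituted proposal was the clicked item.
    """
    n_arms = slot_bandits[0].n_arms
    proposals, displayed, shown = [], [], set()
    for bandit in slot_bandits:
        arm = bandit.select(context)
        substituted = arm in shown
        display = arm if not substituted else next(
            a for a in range(n_arms) if a not in shown)
        shown.add(display)
        proposals.append((bandit, arm, substituted))
        displayed.append(display)
    for bandit, arm, substituted in proposals:
        reward = 1.0 if (not substituted and arm == clicked_item) else 0.0
        bandit.update(arm, context, reward)
    return displayed
```

Under IBA, by contrast, each slot's bandit runs fully independently and receives its own click feedback, so several slots can be credited in the same round; this is the multiple-clicks setting compared against RBA-LinUCB in our experiments.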