Personalized K-NN search over bimodal vectors

Master Thesis
Author
Petrou, Maria
Πέτρου, Μαρία Ελευθερία
Date
2025-03
Keywords
Vector search; Bimodal vector search; FAISS
Abstract
Consider the following question:
"Which movies are most similar to a given movie when the user can choose the characteristic by which similarity is measured, for example the plot or the movie poster?"
In this work, we address the problem of K-nearest neighbor (K-NN) search in high-dimensional data consisting of images and text (bimodal vector search). The distance metric is adjusted dynamically through a user-defined weight parameter λ ∈ [0,1]. Introducing λ lets the user balance the importance of image and text, yielding personalized results that match their preferences.
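To make the role of λ concrete, the following is a minimal sketch of an exact (brute-force) bimodal K-NN search. The abstract does not state the precise form of the distance, so the sketch assumes a convex combination of squared Euclidean distances, with λ weighting the image part and 1 − λ the text part. The names `sq_l2`, `bimodal_distance`, and `knn_exact` are hypothetical and are not taken from the thesis.

```python
import heapq


def sq_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bimodal_distance(q_img, q_txt, x_img, x_txt, lam):
    """Lambda-weighted distance between a query and one item.

    ASSUMPTION: lam weights the image distance and (1 - lam) the text
    distance; the thesis may define the combination differently.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * sq_l2(q_img, x_img) + (1.0 - lam) * sq_l2(q_txt, x_txt)


def knn_exact(query, items, lam, k):
    """Return the positions of the k items closest to the query.

    query is an (image_vector, text_vector) pair and items is a list of
    such pairs. Every item is scanned, so this is only a reference
    baseline, not an indexed search.
    """
    q_img, q_txt = query
    scored = (
        (bimodal_distance(q_img, q_txt, img, txt, lam), pos)
        for pos, (img, txt) in enumerate(items)
    )
    return [pos for _, pos in heapq.nsmallest(k, scored)]
```

With λ = 1 only the image embeddings influence the ranking, and with λ = 0 only the text embeddings do. Intermediate values blend the two.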
To solve this problem, we propose an algorithm that:
• Transforms the initial image and text embeddings into a unified vector space through a transformation step that depends on λ. This step is designed so that distances in the transformed space closely approximate the actual distances in the original space.
• Selects the best available index, ensuring high accuracy while minimizing query response time.
• Defines a minimal set of precomputed indexes, ensuring that for every value of λ, the accuracy remains above 80%.
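The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation. It assumes that the transformation scales the image part by √λ and the text part by √(1 − λ) before concatenation, that the index is chosen by the closest precomputed λ, and that "accuracy" means recall against exact results. The function names and the grid of λ values in the usage note are hypothetical.

```python
import math


def transform(img, txt, lam):
    """Map an (image, text) embedding pair to one unified vector.

    ASSUMPTION: scaling by sqrt(lam) and sqrt(1 - lam) before
    concatenation. Under this choice the squared Euclidean distance in
    the unified space equals
    lam * ||img diff||^2 + (1 - lam) * ||txt diff||^2.
    """
    a, b = math.sqrt(lam), math.sqrt(1.0 - lam)
    return [a * x for x in img] + [b * y for y in txt]


def pick_index_lambda(lam, prebuilt_lambdas):
    """Choose which precomputed index serves a query with weight lam.

    ASSUMPTION: the index built for the closest lambda is used. Serving
    a query from an index built for a different lambda is one place
    where approximation error can enter.
    """
    return min(prebuilt_lambdas, key=lambda p: abs(p - lam))


def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    exact = set(exact_ids)
    return len(exact & set(approx_ids)) / len(exact)
```

For instance, with a hypothetical grid `[0.0, 0.25, 0.5, 0.75, 1.0]`, a query with λ = 0.62 would be routed to the index built for 0.5. Comparing its results with the exact neighbors through `recall_at_k` indicates whether the grid is dense enough to stay above the 80% threshold.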
Our approach explores the capabilities of FAISS, a highly efficient similarity search library, on bimodal image-text data, an area that has not been studied extensively. The data are transformed into unified multimodal embeddings that are adjusted dynamically according to λ. Our goal is to develop a system that delivers personalized, accurate results while remaining computationally efficient.