Photography style analysis using Convolutional Neural Networks
This thesis studies the artistic nature of photography and proposes a framework for defining the term “photography style”. It surveys the history of photography and the range of aesthetics that have developed over time, collects the most important rules of aesthetics, and groups them into specific categories. It then uses deep learning and computer vision to train models that predict those categories. First, the reader is introduced to the world of photography: we present its historical background and then focus on its rapid rise in the era of social media. We then cover some basics of photography, as well as some of the best-known rules of aesthetics. Finally, we emphasize how difficult it is to turn those rules into a well-defined problem with specific tasks, given the subjectivity of photography and of art in general. We then present a novel dataset of photographs annotated with respect to their image aesthetics, and examine the ability of Convolutional Neural Networks (CNNs) to distinguish between the adopted photography style classes. In particular, we define five photography style classification tasks, related to the following aesthetic attributes: Color, Depth of Field (DoF), Palette, Composition and Type. We annotated a set of 1832 photos selected from the Unsplash Full dataset, using multiple annotators in order to measure inter-annotator agreement. Once the dataset was compiled, we trained and evaluated a Residual Neural Network (ResNet50). The experimental results show that, despite the imbalanced dataset, our model achieves acceptable classification results. The dataset is openly provided, along with the trained models and Python code to use them.
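As an illustration of how agreement between two annotators can be quantified, the following minimal sketch computes Cohen's kappa for a single classification task. The abstract does not state which agreement measure the thesis uses, so the choice of Cohen's kappa is an assumption, and the function name `cohen_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: this is one common agreement measure, not necessarily
    the one used in the thesis.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for a Depth of Field task (class names are illustrative).
annotator_1 = ["shallow", "deep", "deep", "shallow"]
annotator_2 = ["shallow", "deep", "shallow", "shallow"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance; with more than two annotators, a measure such as Fleiss' kappa would be a more natural fit.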