Emotion recognition on scenes of films based on the speech and the image
Master Thesis
Author
Tzagkarakis, Eleftherios
Τζαγκαράκης, Ελευθέριος
Date
2023-12
Keywords
Artificial Intelligence ; Machine learning
Abstract
This thesis experiments with and evaluates a range of machine learning models for emotion recognition in both the auditory and visual domains, using public datasets of photographs and speech excerpts. The best-performing model for each modality is then applied to film scenes featuring monologues, and the outputs of the two models are compared to examine how consistent and how correlated their predictions are.
The ultimate objective is to build an intelligent director powered by machine learning, one that decides whether a scene should be reshot, particularly when the results of the two models disagree. The implementation combines the training of open-source neural networks with classical machine learning algorithms.
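The retake decision described above can be illustrated with a minimal sketch. It assumes each model emits one emotion label per scene segment and that a scene is flagged when the share of matching labels falls below a threshold. The function names, the label set, and the 0.6 threshold are hypothetical choices for illustration and are not taken from the thesis.

```python
from typing import Sequence


def agreement_rate(speech_labels: Sequence[str], image_labels: Sequence[str]) -> float:
    """Fraction of scene segments on which both models predict the same emotion."""
    if len(speech_labels) != len(image_labels):
        raise ValueError("both models must label the same number of segments")
    if not speech_labels:
        raise ValueError("at least one segment is required")
    matches = sum(s == i for s, i in zip(speech_labels, image_labels))
    return matches / len(speech_labels)


def needs_retake(
    speech_labels: Sequence[str],
    image_labels: Sequence[str],
    min_agreement: float = 0.6,  # hypothetical threshold, not from the thesis
) -> bool:
    """Flag a scene for a retake when the two models disagree too often."""
    return agreement_rate(speech_labels, image_labels) < min_agreement


# Hypothetical per-segment predictions for one monologue scene.
speech = ["happy", "sad", "happy", "angry"]
image = ["happy", "sad", "neutral", "angry"]

rate = agreement_rate(speech, image)
retake = needs_retake(speech, image)
```

A simple match rate is only one way to quantify disagreement; a chance-corrected measure such as Cohen's kappa could be substituted without changing the structure of the decision.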
The work thus combines recent and traditional techniques into a framework for intelligent cinematic direction. The pairing of open-source neural networks with classical machine learning algorithms is intended to support film production workflows and to explore the intersection of artificial intelligence and artistic expression.