Ανάλυση συναισθημάτων σε πλατφόρμες κοινωνικών δικτύων μέσω τεχνικών επεξεργασίας φυσικής γλώσσας
Sentiment analysis in social media platforms through NLP techniques

Keywords
Sentiment analysis ; Transformers ; BERT ; Social media ; Multilabel classification ; Machine learning ; Deep learning ; Natural language processing
Abstract
The analysis of sentiment and emotion in social media content has become an essential tool for gauging
public opinion and societal trends across diverse domains such as business, healthcare, and politics. This
dissertation examines the effectiveness of transformer-based deep learning models, with a particular
focus on BERT (Bidirectional Encoder Representations from Transformers), for the task of fine-grained
emotion classification in social media text. Using the GoEmotions dataset, which consists of Reddit
comments annotated with 28 labels (27 emotion categories plus neutral), the study develops and
evaluates a comprehensive multi-label emotion detection pipeline.
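As an illustration of what "multi-label" means in this setting, the sketch below turns per-label model scores (logits) into a set of predicted emotions by applying an independent sigmoid to each label and keeping those above a cutoff. The four-label subset, the 0.5 threshold, and the function names are assumptions made for this example; the abstract does not state the decision rule actually used in the dissertation.

```python
import math
from typing import List

# Hypothetical subset of the GoEmotions label set, for illustration only.
LABELS = ["admiration", "amusement", "anger", "neutral"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode(logits: List[float], threshold: float = 0.5) -> List[str]:
    """Return every label whose independent probability reaches the threshold.

    Unlike softmax-based single-label classification, each label is scored
    on its own, so zero, one, or several emotions can be predicted.
    """
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# One comment can carry several emotions at once.
print(decode([2.0, 0.3, -1.5, -3.0]))
```

In practice the threshold is often tuned per label on a validation set rather than fixed at 0.5, which is one common way to help rare emotion categories.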
The methodology encompasses rigorous data preprocessing, strategies for addressing class imbalance,
and systematic hyperparameter optimization to fine-tune BERT for multi-label emotion classification.
Experimental results indicate that model performance, as measured by micro and macro F1 scores and
overall accuracy, improves substantially with increased training data and extended training duration. Error
analysis further highlights persistent challenges, including class imbalance and the detection of
ambiguous or overlapping emotional expressions.
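The difference between the micro and macro F1 scores mentioned above can be made concrete with a small sketch. Micro F1 pools true positives, false positives, and false negatives over all labels, while macro F1 averages the per-label F1 scores, so rare labels weigh as much as frequent ones. The toy data and the convention of scoring an undefined F1 as 0.0 are assumptions of this example, not results or settings from the dissertation.

```python
from typing import List, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from counts; defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: List[List[int]], y_pred: List[List[int]]
) -> Tuple[float, float]:
    """Compute (micro F1, macro F1) for multi-hot label matrices."""
    n_labels = len(y_true[0])
    tp = [0] * n_labels
    fp = [0] * n_labels
    fn = [0] * n_labels
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                tp[j] += 1
            elif p:
                fp[j] += 1
            elif t:
                fn[j] += 1
    micro = f1(sum(tp), sum(fp), sum(fn))
    macro = sum(f1(tp[j], fp[j], fn[j]) for j in range(n_labels)) / n_labels
    return micro, macro


# Toy data: label 0 is frequent and always predicted correctly,
# labels 1 and 2 are rare and always missed.
y_true = [[1, 0, 0], [1, 0, 0], [1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0]]

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

On this toy data the micro score is 0.75 while the macro score is about 0.33, which shows how a model that ignores rare emotions can still look strong on micro F1. This is why reporting both is informative under class imbalance.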
To assess real-world applicability, the trained model is deployed on contemporary data from X (formerly
known as Twitter), enabling emotion monitoring within live social media streams. The findings underscore
both the potential and current limitations of deep learning approaches for emotion analysis in noisy,
user-generated text. Recommendations are provided for future research, including advanced techniques for
managing data imbalance, enhancing domain adaptation, and addressing ethical considerations
associated with large-scale emotion recognition.