Sentiment analysis for financial news

Master's Thesis
Author
Savvas, Spyros
Date
2025
Supervisor
Filippakis, Michael
Keywords
Sentiment analysis; Transformer models; Natural Language Processing (NLP); Large Language Models (LLMs); BERT; Parameter-Efficient Fine-Tuning (PEFT)
Abstract
This thesis explores the development and application of sentiment analysis techniques tailored to financial news headlines, using a dataset of 4,838 unique instances. The primary objective is to systematically evaluate and compare diverse methodologies, ranging from traditional lexicon-based approaches to state-of-the-art transformer architectures, for accurate sentiment classification (positive, negative, neutral) in the financial domain.

Methodologically, the study implements and contrasts three categories of models: (1) lexicon-based classifiers using the general-purpose VADER and the domain-specific Loughran-McDonald financial sentiment dictionary; (2) fine-tuned Bidirectional Encoder Representations from Transformers (BERT) models, exploring variations in sentence representation through CLS, Mean, and Max pooling strategies; and (3) the Gemma 7B-IT large language model, fine-tuned as a sequence classifier using parameter-efficient techniques (4-bit quantization and Low-Rank Adaptation, LoRA). All models are evaluated on the same imbalanced test set, and performance is assessed with standard classification metrics, with particular focus on the Macro F1-score to account for the observed class imbalance in the dataset.

Experimental results demonstrate a clear performance advantage of transformer-based models (BERT and Gemma) over lexicon-based approaches, which struggle with nuance and domain specificity. Fine-tuned BERT models achieve strong results: the CLS pooling strategy with non-stratified training data reached up to 88.2% accuracy and a 0.87 Macro F1-score. The Gemma 7B-IT model, fine-tuned as a sequence classifier with stratified training data, demonstrated comparable top-tier performance, achieving 87.2% accuracy and a 0.87 Macro F1-score. This research contributes a valuable comparative benchmark for financial sentiment analysis, underscoring the effectiveness of modern, efficiently fine-tuned large language models and established transformer architectures for domain-specific natural language processing tasks.
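The lexicon-based baseline can be reproduced with off-the-shelf tooling. Below is a minimal sketch of a VADER headline classifier; the ±0.05 compound-score cutoffs are VADER's conventional defaults and are assumed here, since the abstract does not state the thesis's exact mapping.

```python
# Lexicon-based baseline with VADER (pip install vaderSentiment).
# The +/-0.05 compound-score cutoffs are conventional defaults,
# assumed here; the thesis's exact thresholds are not specified.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

def vader_label(headline: str) -> str:
    """Map a headline to positive/negative/neutral via the compound score."""
    compound = analyzer.polarity_scores(headline)["compound"]
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

print(vader_label("Company X posts record quarterly profit"))   # likely "positive"
print(vader_label("Regulator fines Company X for misconduct"))  # likely "negative"
```

A Loughran-McDonald variant would swap the scorer for counts over that dictionary's positive and negative word lists (packages such as pysentiment2 expose it), though the abstract does not state which implementation the thesis used.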
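The three BERT variants differ only in how the encoder's final hidden states are aggregated into a sentence vector before the classification head. The sketch below illustrates CLS, mean, and max pooling in PyTorch with Hugging Face transformers; the base checkpoint (bert-base-uncased), the class name, and the plain linear head are illustrative assumptions, not the thesis's reported configuration.

```python
# Sketch of CLS / Mean / Max pooling over BERT's last hidden states
# (pip install torch transformers). "bert-base-uncased" and the simple
# linear head are assumptions; the abstract does not fix these details.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertSentimentClassifier(nn.Module):
    def __init__(self, pooling: str = "cls", num_labels: int = 3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-uncased")
        self.pooling = pooling
        self.head = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()           # (B, T, 1)
        if self.pooling == "cls":
            pooled = hidden[:, 0]                             # [CLS] token state
        elif self.pooling == "mean":
            # average only over real (non-padding) tokens
            pooled = (hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        elif self.pooling == "max":
            # exclude padding positions from the element-wise max
            pooled = hidden.masked_fill(mask == 0, -1e9).max(dim=1).values
        else:
            raise ValueError(f"unknown pooling: {self.pooling}")
        return self.head(pooled)                              # (B, num_labels) logits
```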
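Loading Gemma 7B-IT as a three-way sequence classifier under 4-bit quantization with LoRA adapters follows the standard transformers/peft/bitsandbytes recipe, sketched below. The NF4 quantization settings, the LoRA rank and alpha, and the attention-projection target modules are illustrative choices, not the thesis's reported hyperparameters.

```python
# PEFT setup for Gemma 7B-IT as a sequence classifier
# (pip install transformers peft bitsandbytes accelerate; needs a GPU).
# NF4, r=16, and the target modules below are illustrative choices,
# not the thesis's reported hyperparameters.
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "google/gemma-7b-it",
    num_labels=3,                      # positive / negative / neutral
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    task_type="SEQ_CLS",               # marks the classification head as trainable
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()     # only adapters + head are trainable
```

The quantized base weights stay frozen; only the low-rank adapter matrices and the classification head are updated, which is what makes full fine-tuning of a 7B-parameter model tractable on a single GPU.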
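Because the test set is imbalanced, the headline metric is the Macro F1-score, which averages per-class F1 with equal weight so that minority classes count as much as the majority class. A minimal evaluation sketch with illustrative labels (not thesis data):

```python
# Macro F1 weights each class equally, so minority-class errors are not
# drowned out by the majority class (illustrative labels, not thesis data).
from sklearn.metrics import classification_report, f1_score

y_true = ["positive", "neutral", "neutral", "negative", "positive", "neutral"]
y_pred = ["positive", "neutral", "positive", "negative", "positive", "neutral"]

print(f1_score(y_true, y_pred, average="macro"))   # unweighted mean of per-class F1
print(classification_report(y_true, y_pred, digits=3))
```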