Μεγάλα γλωσσικά μοντέλα & ανάκτηση με ενισχυμένη γενετική παραγωγή στην ιστορική έρευνα
Large language models & retrieval-augmented generation in historical research

Keywords
Large language models ; Retrieval-augmented generation ; Digital humanities ; Historical research ; Ethics ; Energy footprint ; LLM ; RAG ; AI
Abstract
This thesis explores the role of Large Language Models (LLMs) and the Retrieval-Augmented Generation (RAG) methodology in historical research and education. It begins with the theoretical background of language models, their evolution, and the importance of the Transformer architecture. It then analyzes the technical aspects of RAG, focusing on tools such as FAISS, on optimization techniques, and on a comparative assessment of frameworks (LangChain, LlamaIndex, Haystack). The study further examines the application of LLMs in Digital Humanities, addressing cross-lingual research, educational uses, and pedagogical risks. Case studies (Greek Revolution of 1821, WWII Occupation, Byzantine Empire, Cold War, Asia Minor Catastrophe, Metapolitefsi 1974) provide practical examples that demonstrate both the potential and the limitations of LLM- and RAG-based pipelines. Ethical considerations (digital revisionism, GDPR, copyright), robustness against adversarial attacks, and the sustainability of energy consumption are also discussed. The thesis concludes that LLMs and RAG pipelines are powerful tools for historical research, provided they are employed with scholarly oversight, transparency, and adherence to ethical principles.

