Development of a question-answering and document summarization system using language models and vector search techniques
Development of conversational interface and text summarizer using large language models

Keywords
Question answering; Summarization; RAG; NLP; FAISS; Transformer models
Abstract
The rapid proliferation of digital documents in many fields necessitates the development of effective and intelligent tools for automated extraction and summarization of information. Traditional search methods, based on keyword matching, often fail to provide accurate and contextually relevant answers due to a lack of semantic understanding. This limitation is particularly evident in large document repositories, where quick access to relevant content without manual searching is required.
This paper presents a question-answering (QA) system for PDF documents that uses artificial intelligence and Natural Language Processing (NLP) techniques to retrieve information from user-selected files. The system combines transformer-based language models with vector search to analyze, process, and draw meaningful conclusions from PDFs. Its main components are the conversion of document text into numerical vector representations (embeddings), a summarization module, and a question-answering module built on generative artificial intelligence, all designed to improve accuracy and efficiency.
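The retrieval step described above can be illustrated with a minimal sketch. It is not the system's implementation: the function names (`chunk_text`, `embed`, `retrieve`, `build_prompt`) are hypothetical, the bag-of-words `embed` is a toy stand-in for a transformer encoder, and the linear scan in `retrieve` stands in for a vector index such as FAISS. The sketch only shows the general flow of chunking text, turning chunks into vectors, ranking them against a question by cosine similarity, and assembling the retrieved context into a prompt for a generative model.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding (assumption): L2-normalized bag-of-words counts.

    A real system would call a transformer encoder here instead.
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def retrieve(question, chunks, k=2):
    """Return the k (score, chunk) pairs most similar to the question.

    This is a linear scan; a vector index would replace it at scale.
    """
    q = embed(question)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, chunks, k=2):
    """Assemble retrieved chunks and the question into a prompt string."""
    context = "\n".join(chunk for _, chunk in retrieve(question, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    docs = [
        "FAISS builds an index over dense vectors for fast similarity search.",
        "The cafeteria serves lunch at noon.",
        "Summaries condense long documents into short text.",
    ]
    print(build_prompt("How does similarity search over vectors work?", docs, k=1))
```

In this sketch the prompt returned by `build_prompt` would be passed to a generative language model, which is the part that produces the final answer; that call is omitted because the source does not specify which model is used.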
The paper analyzes the architecture, implementation details, and evaluation metrics, and highlights applications in research, business intelligence, and the academic community. Experimental results demonstrate the system's ability to accurately summarize content and provide contextually relevant answers, making it a reliable tool for intelligent document navigation. It also addresses challenges such as handling complex queries, ensuring factual accuracy, and scaling the system for deployment in an operational environment.
Finally, future improvements are proposed to enhance functionality and user experience. These include multilingual support to make the system accessible to a wider audience, improved summarization techniques for more accurate and comprehensive summaries, and a refined user interface (UI/UX). These improvements are expected to further increase the efficiency and usability of the system.


