Επεξεργασία δεδομένων μεγάλου όγκου σε πραγματικό χρόνο με μηχανική μάθηση, χρησιμοποιώντας Apache Spark και μικροϋπηρεσίες
Real-time big data processing with machine learning using Apache Spark and microservices

Keywords
Big data; Microservices architecture; Apache Spark; Spring Boot; ASP.NET 8.0; Angular; Domain-driven design; Redis
Abstract
The exponential growth of big data has presented significant challenges in processing and analyzing vast datasets efficiently. Traditional approaches often struggle to meet the demands of real-time processing and to deliver actionable insights. This thesis explores the design and implementation of a scalable, modular system that processes big data files in real time and applies machine learning techniques to extract meaningful insights. Built on a microservices architecture, the system prioritizes modularity, fault tolerance, and scalability to ensure robust performance.
Key components of the architecture include data processing frameworks, distributed computing principles, and modern software engineering practices. Together, these elements enable smooth data flow, efficient computation, and reliable storage. The system also integrates secure mechanisms for user authentication and authorization, centralized log management for monitoring, and interactive data visualizations via a responsive web interface. By combining real-time analytics with machine learning capabilities, this work addresses critical challenges in the big data domain and offers a flexible platform for future extensions.