Χρήση γενετικών αλγορίθμων για τον εντοπισμό κοινοτήτων και ομάδων χρηστών στα μέσα κοινωνικής δικτύωσης
Use of genetic algorithms to identify communities and user groups in social networks

View/ Open
Keywords
Γενιτικοί ; Αλγόριθμοι ; Κοινότητες ; Ομάδες χρηστών ; Μέσα κοινωνικής δικτύωσης ; CA ; GA ; Phantom communities ; Κυψελωτά αυτόματα ; Phantom groupsAbstract
The identification of communities and user groups within social networks – Communities & User-Groups Identification in Social Networks [25] – is a crucial challenge today in big data science, with applications ranging from marketing and recommendation systems for various thematic categories to cybersecurity and epidemiology.
A community in a social network is typically defined as a subset of nodes with denser connections within the group compared to the rest of the network. Due to the vast scale and complexity of modern social networks, there is a need for efficient and scalable methods to detect such groups. Traditional graph-based clustering algorithms such as hierarchical clustering, spectral clustering, and modularity-based clustering have been extensively explored but often face issues when analyzing large-scale, dynamic, and overlapping communities.
Evolutionary Algorithms, and more specifically Genetic Algorithms (GAs), have emerged as a powerful alternative approach to solve the problem of community and group detection.
Genetic Algorithms, introduced by John Holland (1975) [22], are a heuristic optimization method inspired by natural selection, widely used to solve complex combinatorial optimization problems. The flexibility of GAs allows them to explore vast spaces of possible solutions for identifying desired types or patterns of communities or groups within a social network (SN), while gradually evolving to more effectively detect such patterns.
They encode potential community structures as individuals in a population, applying crossover and mutation techniques to create new solutions. Then they evaluate these solutions using fitness functions based on metrics like modularity [23] (positively correlated) and conductance [24] (negatively correlated), among others, which assess how distinct a proposed group is from the rest of the network in terms of specific features.
Unlike deterministic traditional algorithms, GAs introduce stochastic (statistical) exploration of the SN, making them more suitable for detecting overlapping and dynamically evolving sub-groups and communities.
Early GA implementations focused on maximizing modularity in static networks, while more recent developments have incorporated multi-objective optimization, hybrid heuristic methods, and dynamic adaptation techniques to better address the evolving nature of real-world networks.
Recent research also explores the integration of GAs with Machine Learning (ML) techniques to improve the accuracy and computational efficiency of detection methods. However, challenges remain, including dynamic parameter tuning, convergence control (i.e., whether the running model is reaching a conclusion), and the computational cost of the running model.
Advantages of using GAs for community detection compared to traditional methods include:
Holistic search capability: GAs can effectively explore complex data environments to find high-fitness solutions (i.e., effective candidate community structures based on modularity and conductance), avoiding misleading local optima commonly encountered with traditional clustering methods.
Inherent parallelism: GAs are naturally parallelizable. In code terms, this means that using distributed computation (many-threads, many-cores, many-machines), they can efficiently handle large-scale networks.
Flexibility and adaptability: GAs allow integration of heuristic parameters, conditions, and constraints tailored to specific domains of social network research. This adaptability makes them suitable for specialized tasks such as fraud detection in massive financial/accounting datasets, political preference/trend analysis, and epidemiological modeling.
Hybrid models combining GAs with other optimization strategies have gained traction in recent years.
For instance, memetic algorithms incorporate local search mechanisms into GAs, significantly improving the refinement of community structures after generating initial solutions. Other hybrid approaches combine swarm intelligence methods such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), showing enhanced accuracy and computational performance in large-scale datasets.
Beyond static networks (where nodes and edges remain relatively stable over time, e.g., Facebook), dynamic and evolutionary social networks (e.g., Twitter) pose additional challenges, requiring adaptive GA-based approaches.
Community structures in social networks evolve due to ongoing user interactions, emerging trends, and external influences. Adaptive Genetic Algorithms (AGAs) [26] have been developed to address this, incorporating time-dependent fitness evaluation mechanisms [27][28][29], gradually increasing learning mechanisms, and automated learning rate adjustments after each mutation. These methods are particularly applicable in real-time community tracking scenarios, such as crisis coordination, live social media analysis, and cyber threat detection.
This study aims to present a comprehensive literature review of GA applications in detecting communities and groups within Social Networks, referencing the evolution and current research in this field, and focusing on different implementation approaches. It will rely on the history of GA use in SNs to date, and will document and comparatively analyze the techniques employed, as well as current trends and developments. It will also highlight the gaps and limitations in these methods.
A key part of this work involves examining the interplay, computational philosophy, and mathematical/programmatic implementation of:
GA (Genetic Algorithms), a type of Evolutionary Algorithm, and
CA (Cellular Automata), which, while not strictly GAs, represent a different evolutionary model with stronger capabilities in localized detection,
in the context of their application to social network analysis.
Moreover, within the vast space of social networks, it has been shown that there exist "phantom" communities – groups whose associated edges are very large space-vectors, and whose nodes exhibit significant dispersion. We explore this special case briefly through a limited literature scope.
Our goal is to offer researchers a structured overview of how Genetic Algorithms and their extended implementations have contributed to the advancement of Social Network analysis [53], while also pointing out current gaps and challenges for future research and development.
Finally, we aim to implement such an algorithm computationally through the development of a C++ application that will analyze real social network data, in order to demonstrate the practical potential of these algorithms.