GPTIPSpy : A Python module for genetic programming
GPTIPSpy : Development of a Python library for genetic programming
Master Thesis
Author
Kalampokis, Evangelos
Καλαμπόκης, Ευάγγελος
Date
2025-01
Keywords
GPTIPSpy ; GPTIPS2 ; Genetic programming ; Symbolic regression
Abstract
This thesis presents GPTIPSpy, a Python implementation of GPTIPS2, a MATLAB library for symbolic regression. Advances in technology have enabled researchers to collect large datasets of variable measurements across a wide range of environments. These raw measurements, however, do not by themselves constitute scientific knowledge; the datasets must be processed before such knowledge can be distilled from them. Artificial intelligence and machine learning offer a variety of techniques that help researchers uncover the intrinsic relationships between the variables of the system under study, turning data into scientific knowledge. Symbolic regression is one such technique: it aims to discover mathematical expressions that relate the independent variables to the dependent variable. It does this through genetic programming, a process modeled on biological evolution. The search starts with a population of randomly generated expressions, each of which is assigned a fitness score based on how well it approximates the value of the dependent variable. The best members of the population are used as building blocks for the members of the next generation, which are in turn evaluated and used as the basis for further generations of expressions, until a satisfactory solution is found. GPTIPSpy enables researchers to perform symbolic regression using a genetic programming algorithm in Python. Its highly configurable interface allows researchers to tailor the genetic algorithm to their specific needs, and it offers a wide range of post-run analysis tools that help them examine the algorithm's output through visualization and inquiry. Following the presentation of the development process of GPTIPSpy, test-run results are shown, demonstrating that GPTIPSpy performs genetic programming correctly and produces results similar to those of GPTIPS2 in shorter processing times, making it an effective alternative.
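The evolutionary loop described in the abstract (random initial population, fitness scoring, best members reused to build the next generation) can be illustrated with a small, self-contained sketch. This is not GPTIPSpy's code or interface. Every name here (`random_tree`, `evaluate`, `fitness`, `evolve`, `MAX_NODES`), the tuple-based tree encoding, the function set, and all parameter values are assumptions chosen for illustration, and it evolves single expressions over one variable only.

```python
import math
import operator
import random

# Illustrative function and terminal sets (assumptions, not GPTIPSpy defaults).
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_NODES = 40  # hypothetical size cap to keep expressions from bloating


def random_tree(rng, depth):
    """Grow a random expression as nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression at one value of the independent variable."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant


def fitness(tree, xs, ys):
    """Root-mean-square error against the dependent variable (lower is better)."""
    total = sum((evaluate(tree, x) - y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(total / len(xs))


def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the expression."""
    yield path, tree
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from subtrees(tree[i], path + (i,))


def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)


def crossover(rng, a, b):
    """Replace a random node of a with a random subtree of b."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)


def mutate(rng, tree):
    """Replace a random node with a freshly grown random subtree."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(rng, 2))


def evolve(xs, ys, pop_size=60, generations=30, seed=0):
    """Run the generational loop and return the best expression found."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, xs, ys))
        if fitness(population[0], xs, ys) < 1e-9:
            break  # satisfactory solution found
        parents = population[: pop_size // 4]  # best quarter breeds
        children = [population[0]]  # elitism: keep the current best
        while len(children) < pop_size:
            if rng.random() < 0.8:
                child = crossover(rng, rng.choice(parents), rng.choice(parents))
            else:
                child = mutate(rng, rng.choice(parents))
            if len(list(subtrees(child))) <= MAX_NODES:
                children.append(child)
        population = children
    return min(population, key=lambda t: fitness(t, xs, ys))


if __name__ == "__main__":
    xs = [i / 2 for i in range(-4, 5)]
    ys = [x * x + x for x in xs]  # target relationship: y = x^2 + x
    best = evolve(xs, ys)
    print(best, fitness(best, xs, ys))
```

Because the best expression is always copied into the next generation, the best error in this sketch never increases from one generation to the next. Whether the exact target expression is recovered depends on the seed and the parameter values.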