Sequence analysis of DNA chains
Παπά, Φραντζέσκα Δ.
The DNA helix has become an important area of study in molecular biology. The functioning and heredity of many organisms may be revealed by studying the nucleotides, the small molecules of which DNA consists. For this reason, many databases have been developed, such as GenBank, where all publicly available DNA sequences are collected on a daily basis. This study presents mathematical and statistical models that have been developed for the analysis of DNA sequences. Two DNA molecules are usually compared by aligning their single strands one on top of the other. Dynamic programming algorithms provide a useful framework in which sequence matching problems can be embedded in order to find optimal alignments; in this setting, substitutions, insertions, deletions and even inversions are allowed in the alignment process. An important algorithm is presented for finding tandem repeats, which are considered to offer strong evidence for the occurrence of several genetic diseases. Given the significance of tandem repeats in biological studies, a statistic closely associated with them is described in some detail. Here the sequences under study are taken to have the same length, are aligned one on top of the other, and are converted into a single sequence of successes (matches) and failures (mismatches). In this new two-state sequence, the quantity of interest is the total number of successes in success runs of length k or longer. The thesis presents several recently published results on the evaluation of the exact distribution of this statistic.
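To make the statistic concrete, the following minimal Python sketch computes its observed value for two equal-length sequences. The function names `match_indicators` and `successes_in_long_runs` and the example sequences are hypothetical, chosen only for illustration. The sketch evaluates the statistic on given data and does not address its exact distribution.

```python
def match_indicators(seq_a, seq_b):
    """Align two equal-length sequences position by position and
    return 1 for a match (success) and 0 for a mismatch (failure)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [1 if a == b else 0 for a, b in zip(seq_a, seq_b)]


def successes_in_long_runs(indicators, k):
    """Total number of successes that belong to success runs
    of length k or longer."""
    total = 0
    run = 0  # length of the current success run
    for x in indicators:
        if x:
            run += 1
        else:
            if run >= k:
                total += run
            run = 0
    if run >= k:  # account for a run that reaches the end of the sequence
        total += run
    return total


# Illustrative (made-up) sequences of equal length.
seq_a = "ACGTTACGGA"
seq_b = "ACGATACGGT"
indicators = match_indicators(seq_a, seq_b)
print(indicators)
print(successes_in_long_runs(indicators, 4))
```

In this example the match sequence contains success runs of lengths 3 and 5, so only the run of length 5 contributes when k = 4, while both contribute when k = 3.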