Applications of the Pareto distribution to actuarial data
Keywords
Pareto distribution
Abstract
The present thesis focuses on the Pareto distribution, the methods for estimating its parameters,
and the evaluation of those methods. The main objective of the thesis is to examine the Pareto
distribution using real data and to fit the distribution to those data.
Initially, an introduction to the Pareto distribution is provided, along with its significance
in various fields. The theoretical background of the distribution is examined, including its history,
mathematical form, measures of location, dispersion, and moments, as well as skewness and
kurtosis. Furthermore, the basic properties of the Pareto distribution are analyzed, and the survival
and mean residual life functions are presented.
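For reference, the quantities named above can be written out under one common convention. The sketch below assumes the classical (type I) Pareto parameterization with scale σ > 0 and shape α > 0; the abstract does not state which parameterization the thesis adopts, so the symbols are an assumption.

```latex
\begin{align*}
F(x) &= 1 - \left(\frac{\sigma}{x}\right)^{\alpha}, \quad x \ge \sigma, \\
S(x) &= 1 - F(x) = \left(\frac{\sigma}{x}\right)^{\alpha}, \quad x \ge \sigma, \\
e(x) &= \mathbb{E}\left[X - x \mid X > x\right] = \frac{x}{\alpha - 1}, \quad \alpha > 1,\ x \ge \sigma.
\end{align*}
```

Under this assumption the mean residual life function e(x) grows linearly in x, which is one way of expressing the heavy right tail of the distribution.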
In the third chapter, a series of methods for estimating the parameters of the Pareto
distribution is discussed. First, the parameterization of the distribution is presented, that is,
its functional form and its parameters. The estimation methods then make it possible to estimate
the parameters of the Pareto distribution and to fit the distribution to observed
data.
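As an illustration of what such an estimation step can look like, the sketch below computes maximum likelihood estimates for a type I Pareto model. Both the choice of maximum likelihood and the type I parameterization are assumptions made here for illustration; the abstract does not list the methods covered in the chapter, and the function name `pareto_mle` is hypothetical.

```python
import math


def pareto_mle(data):
    """Maximum likelihood estimates (sigma_hat, alpha_hat) for a type I Pareto.

    Assumes F(x) = 1 - (sigma / x) ** alpha for x >= sigma, which is an
    illustrative choice, not necessarily the parameterization of the thesis.
    """
    if len(data) < 2:
        raise ValueError("need at least two observations")
    if min(data) <= 0:
        raise ValueError("Pareto data must be strictly positive")

    # The likelihood is maximized in sigma at the smallest observation.
    sigma_hat = min(data)

    # Given sigma_hat, the shape estimate is n / sum(log(x_i / sigma_hat)).
    log_sum = sum(math.log(x / sigma_hat) for x in data)
    if log_sum == 0:
        raise ValueError("all observations are equal; shape is not estimable")
    alpha_hat = len(data) / log_sum
    return sigma_hat, alpha_hat
```

A call such as `pareto_mle(losses)`, with `losses` a list of positive claim amounts, returns the pair of estimates that would then be passed to a goodness-of-fit evaluation.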
The fourth chapter provides a detailed description of goodness-of-fit measures used for
evaluating the estimators, and the statistical functions used to compute them. These measures
include the Kolmogorov-Smirnov (KS) statistic, the Cramér-von Mises (CvM) statistic, and the
Anderson-Darling (AD) statistic. Each measure is computed by its own statistical function and
quantifies how well the fitted Pareto model agrees with the data.
Presenting these statistical functions allows an accurate and objective evaluation of the
performance of the estimators.
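The three statistics can be sketched with their standard computing formulas, applied to the values u_i = F(x_i) obtained by passing the data through the fitted distribution function. The code below is a minimal illustration under that assumption; the names `pareto_cdf` and `gof_statistics` are hypothetical, and the type I form of the distribution function is assumed rather than taken from the thesis.

```python
import math


def pareto_cdf(x, sigma, alpha):
    """Type I Pareto distribution function (assumed parameterization)."""
    return 1.0 - (sigma / x) ** alpha if x >= sigma else 0.0


def gof_statistics(u):
    """Return (KS, CvM, AD) for fitted-CDF values u_i = F(x_i).

    Every u_i must lie strictly between 0 and 1, otherwise the
    Anderson-Darling statistic is undefined.
    """
    u = sorted(u)
    n = len(u)
    if n == 0 or not all(0.0 < v < 1.0 for v in u):
        raise ValueError("u must be non-empty with values strictly in (0, 1)")

    # Kolmogorov-Smirnov: largest gap between empirical and fitted CDF.
    ks = max(max(i / n - u[i - 1], u[i - 1] - (i - 1) / n)
             for i in range(1, n + 1))

    # Cramer-von Mises: squared distances to the plotting positions.
    cvm = 1.0 / (12 * n) + sum(
        (u[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in range(1, n + 1))

    # Anderson-Darling: log-weighted version that emphasizes both tails.
    ad = -n - sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)) / n
    return ks, cvm, ad
```

One practical caveat of this sketch: if the scale parameter is set to the sample minimum, the smallest observation maps to u = 0 and the function raises an error, because the Anderson-Darling statistic involves log(0) in that case.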
In the fifth chapter, the difficulty of describing a dataset across its entire range with a
single distribution is analyzed. It is argued that a single distribution can adequately describe
small or medium values but not large ones. The general forms of composite models, specifically
those built from the exponential and Pareto distributions, are presented, along with their respective hazard functions.
The sixth chapter focuses on the favorable estimators of the Pareto distribution. Initially,
an introduction to the topic is provided, and the goodness-of-fit measures used to assess the fit of
the Pareto model to the data are analyzed. An important measure of goodness-of-fit is the loss
function, which calculates the percentage of observations falling outside the fitted distribution.
Furthermore, the criterion of robustness is examined, which uses the variance of the estimators as
a measure of estimation instability, and various favorable estimators of the Pareto distribution are
presented. These include percentile estimators, truncated mean estimators, and generalized median
estimators. Each estimator has its own properties and limitations and is used according to the
application and available data.
The seventh chapter analyzes the application of all estimators to real data and presents
significant conclusions regarding the performance and accuracy of the estimators. Initially, the
selection of the sample and the methodology followed for the evaluation of the estimators are
described. The computation of the estimators is explained, and graphs depicting the initial fitting
of the data are presented. Subsequently, the conducted statistical tests and their results are
presented. Finally, a ranking of the estimators based on their robustness and performance is
performed to derive specific recommendations for selecting the optimal estimator. These analyses
and results summarize the reliability and performance of the estimators and contribute to a
better understanding of the model. The graphs, the computation of the estimators, and the
corresponding tests were produced using Mathematica.