Τεχνικές προεπεξεργασίας και εκτίμησης DEA αποδοτικότητας
Preprocessing and approximate scoring DEA
Abstract
This thesis studies preprocessing and approximate scoring techniques to accelerate Data
Envelopment Analysis (DEA) for large-scale datasets. DEA evaluates the efficiency of decision
making units (DMUs) using linear programming, but its computational cost grows rapidly with
dataset size, making direct application difficult in Big Data settings. The main contribution of the
thesis is the development of the MaxRatio method, which uses input–output ratio analysis to
approximate DEA efficiency scores without solving linear programs (LPs). Because the number of
ratio combinations in the MaxRatio method grows exponentially, sampling techniques are
investigated to reduce the computational effort while evaluating and confirming the method's
effectiveness. The use of a limited number
of ratios by the MaxRatio method through sampling led to important findings for understanding
the relationships among ratios. In particular, it showed a strong correlation between ratio-based
measures and DEA scores, and demonstrated that many ratio combinations lead to the
identification of the same DMUs. The MaxRatio method is also compared with the KZCT-2019
preprocessing method, which aims to identify efficient DMUs quickly. Experimental results show that
the MaxRatio method outperforms KZCT-2019 both in the number of efficient units it identifies
and in the approximation of efficiency scores, although it has higher computational complexity. Overall,
both preprocessing methods identify a satisfactory number of efficient DMUs and provide reliable
efficiency estimates. The results of this study are readily comparable with those of other studies
and can be reproduced and independently verified by other researchers.


