Dealing with misclassification errors in the analysis of categorical data
Subject: Error analysis (Mathematics) -- Econometric models ; Variables (Mathematics) ; Classification -- Mathematical models ; Contingency tables
Misclassification, the erroneous measurement of one or more categorical variables, is a major problem in surveys across many scientific fields, such as medicine and epidemiology. The observed data collected in studies are often subject to misclassification errors. Even in rather simple scenarios, unless the misclassification probabilities are very small, substantial bias can arise when estimating common measures of association such as the risk difference, risk ratio and odds ratio. Misclassification can also reduce efficiency in the analysis of contingency tables. The main aim of this MSc dissertation is to present the effects of misclassification in the univariate, bivariate and multivariate analysis of categorical data, and to demonstrate its effects on parameter estimation and hypothesis testing. Methods of adjusting for the effects of misclassification will be presented, including simple matrix and model-based methods using validation data, as well as adjustment methods using repeated data. These methods will be compared to determine which one performs better, depending on the available data set and sampling process. Finally, the adjustment methods will be applied to a data set containing misclassified data from highway safety research, in order to evaluate the effectiveness of seat belt use in reducing injuries in automobile accidents, depending on factors such as car damage severity and sex.
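
To make the bias and its matrix-type correction concrete, here is a minimal sketch for a single binary variable. It assumes nondifferential misclassification with known sensitivity and specificity (in practice these would be estimated, for example from validation data). The function names and all numbers are hypothetical illustrations and are not taken from the dissertation or its data set.

```python
def observed_prevalence(p_true, sens, spec):
    """Probability of being *classified* as exposed when the true prevalence is p_true."""
    return sens * p_true + (1.0 - spec) * (1.0 - p_true)


def matrix_corrected_prevalence(p_obs, sens, spec):
    """Invert the 2x2 misclassification matrix (binary case of a matrix-type correction)."""
    denom = sens + spec - 1.0
    if denom <= 0:
        raise ValueError("sens + spec must exceed 1 for the correction to be defined")
    return (p_obs + spec - 1.0) / denom


def odds_ratio(p1, p0):
    """Odds ratio comparing exposure prevalence p1 (group 1) with p0 (group 0)."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))


# Hypothetical values, chosen only for illustration.
sens, spec = 0.9, 0.8            # assumed classification accuracy
p_cases, p_controls = 0.5, 0.2   # assumed true exposure prevalences

obs_cases = observed_prevalence(p_cases, sens, spec)
obs_controls = observed_prevalence(p_controls, sens, spec)

true_or = odds_ratio(p_cases, p_controls)
observed_or = odds_ratio(obs_cases, obs_controls)   # attenuated toward 1
corrected_or = odds_ratio(
    matrix_corrected_prevalence(obs_cases, sens, spec),
    matrix_corrected_prevalence(obs_controls, sens, spec),
)

print(f"true OR={true_or:.3f} observed OR={observed_or:.3f} corrected OR={corrected_or:.3f}")
```

With these assumed inputs the observed odds ratio lies between 1 and the true value, and the correction recovers the true value because the same sensitivity and specificity are used to distort and to adjust. With estimated error rates the corrected estimate would carry additional sampling variability.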