Knowledge mining & image data retrieval using relational database infrastructures
Λεπίδας, Αλέξανδρος Ε.
Subject: Data mining ; Database management ; Image processing -- Digital techniques
Organizations today use digital imaging systems in the expectation of improving their effectiveness and their multimedia presentation, in both business and operational terms. Digital imaging allows a very large number of records to be captured, stored, retrieved and shared over network infrastructure and the Internet in general. Users can typically find a file with a digital imaging system faster than they can find the printed version or the corresponding microfilm. They can also share files easily through channels such as e-mail and instant messaging. On the other hand, images and photos require more storage space. Although this last point is the one usually emphasized, the real advantage of digital imaging is online access to files and the exchange of relevant information. The decision on whether and how to implement an imaging system is complex, and many factors must be considered. Primarily, what is the desired result? How will imaging resolve users' problems? Will it cover the real needs, how will it integrate with the existing infrastructure, and are there sufficient financial resources to support the system over time? This study focuses on the concepts of digital imaging, in particular on storage and on successful categorized search and retrieval. Specifically, it examines the process of importing images into a relational database (here the Oracle Corporation RDBMS), converting them into a format suitable for searching for patterns in those images, setting up the data mining, and conducting experiments to reach related findings. The decision to implement an image-processing system should be based on the needs of the specific application requested. The key to the successful design, development and implementation of a system for processing image data is correct analysis. The four main phases are: a) planning and requirements analysis, b) analysis of the chosen technology, c) process implementation, d) analysis of results.
We observe that the process of creating, training and applying data mining algorithms is not a standardized procedure and cannot provide a uniform solution to all problems that require searching and processing large data sets. For this reason, continuous study and research in this field is essential, given the broad and heterogeneous range of problems requiring data mining. More specifically, the applications of data mining on image sets alone, whether medical images or images of different categories as in our case, are numerous. They range from applications for registering, searching and processing images on the Internet, to managing databases of medical images in large health facilities and automatically comparing those images to suggest a diagnosis or indications of a particular pathology. In this context, we studied four different algorithms (decision trees, naive Bayes, support vector machines and logistic regression) on a total of approximately one thousand images and their corresponding classification. The results are satisfactory, although there is always room for improvement. Moreover, a data mining algorithm that achieves a very high success rate on a specific problem runs the risk of being over-optimized for that problem, making it potentially unsuitable for a wider range of problems. Tuning also showed that, for the algorithms applied, increasing the share of the available data used for training improved performance significantly.
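The kind of experiment described above, training a classifier on a growing share of the data and measuring accuracy on the remainder, can be illustrated with a minimal sketch. It is not the method used in the study, which worked inside the Oracle RDBMS on real image data. As assumptions for illustration only, it uses plain Python, a hand-written Gaussian naive Bayes (one of the four algorithm families named above), and synthetic four-dimensional feature vectors standing in for image features. All function names are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def make_dataset(n, seed=0):
    """Synthetic two-class data (assumption: stands in for image features)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        centre = 0.0 if label == 0 else 2.0
        features = [rng.gauss(centre, 1.0) for _ in range(4)]
        data.append((features, label))
    rng.shuffle(data)
    return data


def fit_gaussian_nb(train):
    """Estimate the prior, per-feature means and variances for each class."""
    model = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        cols = list(zip(*rows))
        model[label] = {
            "prior": len(rows) / len(train),
            "means": [mean(c) for c in cols],
            # Small constant keeps the variance strictly positive.
            "vars": [pvariance(c) + 1e-6 for c in cols],
        }
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(params):
        total = math.log(params["prior"])
        for v, m, s2 in zip(x, params["means"], params["vars"]):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=lambda label: log_posterior(model[label]))


def accuracy_for_fraction(data, train_fraction):
    """Train on the first train_fraction of data, test on the rest."""
    split = int(len(data) * train_fraction)
    train, test = data[:split], data[split:]
    model = fit_gaussian_nb(train)
    hits = sum(predict(model, x) == y for x, y in test)
    return hits / len(test)


data = make_dataset(1000)
results = {f: accuracy_for_fraction(data, f) for f in (0.1, 0.5, 0.8)}
for fraction, acc in results.items():
    print(f"train fraction {fraction:.1f}: accuracy {acc:.3f}")
```

On well-separated synthetic data like this, accuracy is high at every training fraction, so the sketch shows only the shape of the experiment. Whether a larger training share helps, and by how much, depends on the data and the algorithm.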