A three-stage approach to descriptive data analysis - Identifying clusters and corresponding interpretable descriptions from large data sets

Authors Mario Drobics
Ulrich Bodenhofer
Markus Mittendorfer
Title A three-stage approach to descriptive data analysis - Identifying clusters and corresponding interpretable descriptions from large data sets
Book title Proc. 7th COST Action 276 Workshop
Type in conference proceedings
Location Ankara, Turkey
Department KVS
Month November
Year 2004
SCCH ID# 427
Abstract

This paper presents a three-stage approach to data mining which puts special emphasis on the visualization and interpretability of the results. In the first stage, the input data is represented by a self-organizing map in order to allow visualization and to reduce the amount of data while removing noise, outliers, and missing values. Then this preprocessed information is used to identify and display fuzzy clusters of similarity. Finally, descriptions close to natural language are computed for these clusters in order to provide the analyst with qualitative information. This is accomplished by generating fuzzy rules using an inductive learning method. The proposed approach is applied to image segmentation and labeling.
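
The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (train_som, fuzzy_c_means, describe), all parameter values, and the toy data are assumptions made for illustration. In particular, the third stage here simply attaches the best-matching linguistic label ("low", "medium", "high") to each cluster center; this is a simplified stand-in for the inductive fuzzy rule learning the paper refers to.

```python
import numpy as np

def train_som(data, rows=4, cols=4, epochs=20, seed=0):
    """Stage 1 (sketch): fit a small self-organizing map, return its prototypes."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes from randomly chosen data points.
    weights = data[rng.choice(len(data), rows * cols, replace=True)].astype(float)
    steps, t = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = t / steps
            lr = 0.5 * (1.0 - frac)                            # decaying learning rate
            sigma = max(rows, cols) / 2 * (1.0 - frac) + 0.5   # shrinking neighbourhood
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            t += 1
    return weights

def fuzzy_c_means(points, k=2, m=2.0, iters=100, seed=0):
    """Stage 2 (sketch): standard fuzzy c-means on the SOM prototypes."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(points), k))
    u /= u.sum(axis=1, keepdims=True)            # memberships sum to 1 per point
    for _ in range(iters):
        um = u ** m
        centers = (um.T @ points) / um.sum(axis=0)[:, None]
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

def describe(centers, data, names):
    """Stage 3 (simplified stand-in): label each cluster center per feature."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    labels = ("low", "medium", "high")
    rules = []
    for c in centers:
        z = (c - lo) / np.where(hi > lo, hi - lo, 1.0)   # scale features to [0, 1]
        # Triangular fuzzy sets peaking at 0, 0.5, and 1.
        mu = np.stack([np.clip(1 - 2 * z, 0, 1),
                       np.clip(1 - 2 * np.abs(z - 0.5), 0, 1),
                       np.clip(2 * z - 1, 0, 1)])
        best = mu.argmax(axis=0)
        rules.append(" AND ".join(f"{n} is {labels[b]}" for n, b in zip(names, best)))
    return rules

# Toy data (an assumption): two well-separated 2-D blobs.
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(5.0, 0.3, (100, 2))])

prototypes = train_som(data)                      # stage 1: reduce and smooth the data
centers, memberships = fuzzy_c_means(prototypes)  # stage 2: fuzzy clusters
for rule in describe(centers, data, ["x", "y"]):  # stage 3: readable descriptions
    print("IF", rule, "THEN cluster")
```

Clustering the SOM prototypes rather than the raw samples mirrors the data reduction mentioned in the abstract: the second stage works on 16 prototypes instead of 200 points in this toy setting.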