A three-stage approach to descriptive data analysis - Identifying clusters and corresponding interpretable descriptions from large data sets

Authors: Mario Drobics, Ulrich Bodenhofer, Markus Mittendorfer
Title: A three-stage approach to descriptive data analysis - Identifying clusters and corresponding interpretable descriptions from large data sets
Book title: Proc. 7th COST Action 276 Workshop
Type: In conference proceedings
Location: Ankara, Turkey
Department: KVS
Month: November
Year: 2004
SCCH ID: #427
Abstract

This paper presents a three-stage approach to data mining which puts special emphasis on the visualization and interpretability of the results. In the first stage, the input data is represented by a self-organizing map in order to allow visualization and to reduce the amount of data while removing noise, outliers, and missing values. Then this preprocessed information is used to identify and display fuzzy clusters of similarity. Finally, descriptions close to natural language are computed for these clusters in order to provide the analyst with qualitative information. This is accomplished by generating fuzzy rules using an inductive learning method. The proposed approach is applied to image segmentation and labeling.
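
The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm or the rule learner, so fuzzy c-means is assumed for stage two, and a simple "low / medium / high" labeling of cluster centers stands in for the inductive fuzzy rule learning of stage three. All function names, parameters, and the toy data are hypothetical, and the SOM here does not handle missing values.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som(data, rows, cols, epochs=10, lr0=0.5, seed=0):
    """Stage 1: compress the data into rows*cols SOM prototype vectors."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    protos = [list(rng.choice(data)) for _ in grid]  # init from data samples
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood
            bmu = min(range(len(protos)), key=lambda i: sq_dist(protos[i], x))
            br, bc = grid[bmu]
            for i, (r, c) in enumerate(grid):
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                for d in range(len(x)):
                    protos[i][d] += lr * h * (x[d] - protos[i][d])
            step += 1
    return protos


def memberships(points, centers, m):
    """Fuzzy c-means membership degrees; each row sums to 1."""
    expo = 1.0 / (m - 1.0)
    rows = []
    for p in points:
        d2 = [sq_dist(p, c) + 1e-12 for c in centers]  # avoid division by zero
        rows.append([1.0 / sum((dj / dl) ** expo for dl in d2) for dj in d2])
    return rows


def fuzzy_c_means(points, k, m=2.0, iters=50, seed=0):
    """Stage 2 (assumed algorithm): fuzzy clusters over the SOM prototypes."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        for j in range(k):
            w = [row[j] ** m for row in u]
            centers[j] = [sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                          for d in range(dim)]
    return centers, memberships(points, centers, m)


def describe(center, lows, highs, names):
    """Stage 3 stand-in: label each feature of a center as low/medium/high."""
    parts = []
    for v, lo, hi, name in zip(center, lows, highs, names):
        t = (v - lo) / (hi - lo) if hi > lo else 0.5
        # Triangular fuzzy sets on the normalized range [0, 1].
        degrees = {"low": max(0.0, 1 - 2 * t),
                   "medium": max(0.0, 1 - abs(2 * t - 1)),
                   "high": max(0.0, 2 * t - 1)}
        parts.append(f"{name} is {max(degrees, key=degrees.get)}")
    return " and ".join(parts)


# Toy data: two Gaussian blobs in 2D (purely illustrative).
gen = random.Random(1)
data = [[gen.gauss(mu, 0.5), gen.gauss(mu, 0.5)] for mu in (0.0, 5.0) for _ in range(50)]
protos = train_som(data, rows=4, cols=4)
centers, u = fuzzy_c_means(protos, k=2)
lows = [min(p[d] for p in data) for d in range(2)]
highs = [max(p[d] for p in data) for d in range(2)]
for j, c in enumerate(centers):
    print(f"cluster {j}: " + describe(c, lows, highs, ["x", "y"]))
```

Clustering the 16 prototypes rather than the 100 raw points mirrors the data-reduction role the abstract assigns to the self-organizing map; the printed descriptions only hint at the kind of near-natural-language output the paper aims for.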