An approach on ETL attached data quality management

Authors Christian Lettner
Reinhard Stumptner
Karl-Heinz Bokesch
Editors L. Bellatreche
M. K. Mohania
Title An approach on ETL attached data quality management
Book title Data Warehousing and Knowledge Discovery - Proc. DaWaK 2014
Type In conference proceedings
Publisher Springer
Series Lecture Notes in Computer Science
Volume 8646
ISBN 978-3-319-10159-0
Month September
Year 2014
Pages 1-8
SCCH ID# 1406
Abstract

This contribution introduces an approach to ETL-attached data quality management based on an autonomous Data Quality Monitoring System. The Data Quality Monitor can be attached, via lightweight connectors, to already implemented ETL processes. It makes it possible to quantify data quality and to suggest measures when, for instance, the quality of a particular data package falls below a certain limit. The long-term vision of this approach is to correct corrupted data (semi-)automatically according to user-defined Data Quality Rules. The Data Quality Monitor is attached to an ETL process by defining "snapshot points", where the data samples to be validated are collected, and by introducing "approval points", where an ETL process can be interrupted in case of corrupted input data. Because the Data Quality Monitor is an autonomous module that is attached to ETL processes rather than embedded in them, this approach supports the division of work between ETL developers and dedicated data quality engineers.
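
To make the two attachment mechanisms more concrete, the following is a minimal sketch of how snapshot points and approval points might look from the side of an ETL process. It is not the system described in the paper: the class `DataQualityMonitor`, its methods `snapshot`, `quality` and `approve`, the rule `has_amount`, the point name `after_extract`, and the quality measure (share of sampled rows that satisfy every rule) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]  # returns True when a row satisfies the rule


class DataQualityError(Exception):
    """Raised at an approval point when quality falls below the limit."""


@dataclass
class DataQualityMonitor:
    """Hypothetical monitor that lives outside the ETL code."""
    rules: List[Rule]
    limit: float = 0.9  # assumed minimum share of valid rows
    samples: Dict[str, List[Row]] = field(default_factory=dict)

    def snapshot(self, point: str, rows: List[Row]) -> None:
        """Snapshot point: collect a data sample for validation."""
        self.samples.setdefault(point, []).extend(rows)

    def quality(self, point: str) -> float:
        """Share of sampled rows satisfying every rule (1.0 if no sample)."""
        rows = self.samples.get(point, [])
        if not rows:
            return 1.0
        valid = sum(all(rule(row) for rule in self.rules) for row in rows)
        return valid / len(rows)

    def approve(self, point: str) -> None:
        """Approval point: interrupt the ETL process if quality is too low."""
        score = self.quality(point)
        if score < self.limit:
            raise DataQualityError(
                f"{point}: quality {score:.2f} is below limit {self.limit:.2f}"
            )


def has_amount(row: Row) -> bool:
    """Example rule (assumed): the 'amount' field must be present."""
    return row.get("amount") is not None


def run_etl(rows: List[Row], monitor: DataQualityMonitor) -> List[Row]:
    """Toy ETL step; the monitor is attached with two calls only."""
    monitor.snapshot("after_extract", rows)   # snapshot point
    monitor.approve("after_extract")          # approval point
    # Transform step: runs only if the approval point did not interrupt.
    return [{**row, "amount": float(row["amount"])} for row in rows]
```

In this sketch the ETL code only calls `snapshot` and `approve`, while the rules and the limit are configured on the monitor, which mirrors the separation between ETL developers and data quality engineers described in the abstract.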