Reverse Data Management
Reverse-engineering data transformations to understand, diagnose, and manipulate data
Forward and Reverse Data Transformations
Data transformations, functions that map an input data source to an output data source, are ubiquitous today: they appear in data integration, data exchange, and ETL tools. Data naturally evolves in the direction of these transformations, from source to target. Most database research focuses on this forward-moving data flow: source data is subjected to transformations and evolves through queries, aggregations, and view definitions to form a new target data instance, possibly with a different schema. This forward paradigm underpins most data management tasks today, such as querying, data integration, data mining, clustering, and indexing. Database systems are particularly efficient at handling forward transformations, which typically generate new target data rather than modify the source data.
This project contrasts forward processing with reverse data management: the handling of reverse transformations, which act on the input data in order to achieve a desired outcome in the output data. Reverse transformations modify the source data rather than generate a new target data instance. Examples include updates through views, data generation, and data cleaning and repair. Reverse transformations are, by necessity, conceptually more difficult to define and computationally harder to achieve. Today, however, as more and more of the available data is derived from other data, there is a growing need to modify the input in order to achieve a desired effect on the output, which motivates a systematic study of reverse data management.
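To make the contrast concrete, the following is a minimal sketch of a forward transformation (a join view) and one kind of reverse transformation (a deletion propagated through that view). The relations `ORDERS` and `SUPPLIERS`, the functions `forward` and `reverse_delete`, and the data are all hypothetical, chosen only for illustration. The brute-force search is not a technique from the project; it simply shows why the reverse direction is harder than evaluating the view.

```python
from itertools import combinations

# Hypothetical source relations (illustrative data only).
ORDERS = {("alice", "book"), ("bob", "pen"), ("alice", "pen")}
SUPPLIERS = {("book", "acme"), ("pen", "acme"), ("pen", "globex")}


def forward(orders, suppliers):
    """Forward transformation: join orders with suppliers on the item,
    producing a view of (customer, supplier) pairs."""
    return {(cust, sup)
            for (cust, item) in orders
            for (item2, sup) in suppliers
            if item == item2}


def reverse_delete(orders, suppliers, unwanted):
    """Reverse transformation: find a smallest set of source tuples whose
    deletion removes `unwanted` from the view.

    Brute force over all subsets of the source, smallest first, so the
    running time is exponential in the size of the source data.
    """
    source = ([("orders", t) for t in sorted(orders)]
              + [("suppliers", t) for t in sorted(suppliers)])
    for k in range(len(source) + 1):
        for subset in combinations(source, k):
            removed = set(subset)
            kept_orders = {t for t in orders if ("orders", t) not in removed}
            kept_suppliers = {t for t in suppliers
                              if ("suppliers", t) not in removed}
            if unwanted not in forward(kept_orders, kept_suppliers):
                return removed
    return None  # not reached: deleting every source tuple empties the view


if __name__ == "__main__":
    print(sorted(forward(ORDERS, SUPPLIERS)))
    print(reverse_delete(ORDERS, SUPPLIERS, ("bob", "globex")))
    print(reverse_delete(ORDERS, SUPPLIERS, ("alice", "acme")))
```

In this toy instance, the view tuple `("bob", "globex")` has a single derivation, so deleting one source tuple suffices, although that deletion may also remove other view tuples as a side effect. The tuple `("alice", "acme")` has two independent derivations, so at least two source tuples must be deleted. Evaluating the view is a single pass over the join, whereas the reverse direction has to reason about every way an output tuple can be derived.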
Our goal is to develop Reverse Data Management techniques that facilitate:
- Understanding data and query results.
- Diagnosing errors in data systems.
- Manipulating data based on desirable outcomes.
Xiaolan Wang, Mary Feng, Yue Wang, Luna Dong, and Alexandra Meliou.
Data X-Ray: A Diagnostic Tool for Data Errors
Xiaolan Wang, Luna Dong, and Alexandra Meliou.
A Characterization of the Complexity of Resilience and Responsibility for Self-join-free Conjunctive Queries
Cibele Freire, Wolfgang Gatterbauer, Neil Immerman, and Alexandra Meliou
CoRR abs/1507.00674, 2015.
Reverse Data Management
Alexandra Meliou, Wolfgang Gatterbauer, and Dan Suciu.
[Paper], [Slides]