We are excited to announce that the paper "Incremental Detection of Denial Constraint Violations" has been accepted for presentation at the 51st International Conference on Very Large Data Bases (VLDB) in 2025.
Authors:
Youri Kaminsky (Hasso Plattner Institute)
Eduardo Pena (Federal University of Technology – Parana)
Felix Naumann (Hasso Plattner Institute)
Abstract:
Denial constraints (DCs) are a well-known method for expressing business rules on data. They subsume other integrity constraints (ICs), such as key constraints or functional dependencies. One can either use a traditional DBMS or a specialized algorithm to validate such dependencies on the data. However, no known approach exists to detect DC violations incrementally. Data typically changes over time, and recomputing the entire violation set after every update is wasteful. Alerting data practitioners of data quality issues immediately enables them to take measures earlier and can help prevent follow-up issues.
We present Weever, the first incremental approach to detect all violations of a given set of DCs. It uses a novel specialized index structure to process inequality predicates. Moreover, we devise a new method to plan the execution order of predicates depending on their selectivity and to reduce redundant computations when handling multiple DCs. Our evaluation shows that Weever outperforms a DBMS-based baseline by up to two orders of magnitude. In the time that a state-of-the-art static approach takes to analyze an entire dataset, Weever processes up to 200 000 insertions.
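For readers new to the topic, the problem setting can be illustrated with a small sketch. This is not Weever's algorithm: it uses neither the index structure nor the predicate planning described in the abstract. It only shows what a denial constraint looks like and what "incremental" means, by comparing each inserted row against the rows already stored instead of re-checking every pair. The example constraint, the attribute names (`state`, `salary`, `tax`), and the class `NaiveIncrementalChecker` are all assumptions made up for this illustration.

```python
import operator
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
# One predicate of a DC: (attribute of t, comparison operator, attribute of s).
Predicate = Tuple[str, Callable[[object, object], bool], str]

# Hypothetical DC: there must be no pair of rows (t, s) in the same state
# where t has the higher salary but pays the lower tax.
EXAMPLE_DC: List[Predicate] = [
    ("state", operator.eq, "state"),
    ("salary", operator.gt, "salary"),
    ("tax", operator.lt, "tax"),
]


def violates(t: Row, s: Row, dc: List[Predicate]) -> bool:
    """A pair violates the DC if all of its predicates hold at once."""
    return all(op(t[a], s[b]) for a, op, b in dc)


class NaiveIncrementalChecker:
    """Keeps the violation set up to date under insertions (illustrative only)."""

    def __init__(self, dc: List[Predicate]) -> None:
        self.dc = dc
        self.rows: List[Row] = []
        self.violations: List[Tuple[int, int]] = []  # (index of t, index of s)

    def insert(self, row: Row) -> List[Tuple[int, int]]:
        """Insert a row and return only the violations it newly introduces."""
        new_id = len(self.rows)
        found: List[Tuple[int, int]] = []
        for old_id, old in enumerate(self.rows):
            # The new row can take either role in the pair, so check both.
            if violates(row, old, self.dc):
                found.append((new_id, old_id))
            if violates(old, row, self.dc):
                found.append((old_id, new_id))
        self.rows.append(row)
        self.violations.extend(found)
        return found


checker = NaiveIncrementalChecker(EXAMPLE_DC)
first = checker.insert({"state": "NY", "salary": 5000, "tax": 900})
second = checker.insert({"state": "NY", "salary": 6000, "tax": 800})
third = checker.insert({"state": "CA", "salary": 7000, "tax": 100})
print(first, second, third)  # [] [(1, 0)] []
```

Each insertion in this sketch still scans all stored rows, so its cost grows linearly with the size of the dataset.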