Change Exploration

The Janus (IANVS) Project

Data changes all the time. In this project we want to explore and understand those changes. We call this activity change exploration: for a given dynamic dataset, we want to efficiently capture and summarize changes at both the instance level and the schema level, enable users to explore these changes effectively in an interactive and graphical fashion, and analyze patterns in the changing data.

The art of exploration is to preserve order amid change and to preserve change amid order. (adapted from Alfred North Whitehead)

Change-cube

We choose a generic model to represent changes to a dataset. It includes the following four dimensions to represent what changed where, when, and how:

  1. Time
  2. Entity (ID)
  3. Property
  4. Value

A change c is a quadruple of the form

<Time, ID, Property, Value> or in brief <t, id, p, v>.

Its semantics is: At time t the property p of the entity identified with id was created as or changed to v. A change-cube is a set of such changes. For more details on our data model see our vision paper at VLDB 2019 (see below).
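The quadruple model can be illustrated with a minimal sketch. It is not the project's implementation: the names `Change` and `value_at`, the integer timestamps, and the sample data are assumptions made for illustration only. The sketch represents a change-cube as a plain set of quadruples and looks up the value a property held at a given time.

```python
from typing import NamedTuple, Optional, Set


class Change(NamedTuple):
    """One change <t, id, p, v>; field names are illustrative."""
    time: int        # t: when the change happened (an integer here for simplicity)
    entity_id: str   # id: which entity changed
    prop: str        # p: which property changed
    value: str       # v: the value the property was created as or changed to


def value_at(cube: Set[Change], entity_id: str, prop: str, t: int) -> Optional[str]:
    """Return the value of (entity_id, prop) as of time t, or None if not yet set.

    Assumes at most one change per (time, entity_id, prop) combination.
    """
    relevant = [
        c for c in cube
        if c.entity_id == entity_id and c.prop == prop and c.time <= t
    ]
    if not relevant:
        return None
    # The most recent change at or before t determines the value.
    return max(relevant, key=lambda c: c.time).value


# A change-cube is a set of changes (hypothetical sample data).
cube = {
    Change(1, "e1", "population", "3500000"),
    Change(5, "e1", "population", "3600000"),
    Change(3, "e1", "mayor", "A. Example"),
}

print(value_at(cube, "e1", "population", 4))
print(value_at(cube, "e1", "population", 5))
print(value_at(cube, "e1", "mayor", 2))
```

Because the cube is just a set of quadruples, selecting along any of the four dimensions (for example, all changes to one entity or all changes in a time window) amounts to a filter over that set.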

Sources

Team

  • Project lead: Prof. Felix Naumann
  • Doctoral researchers: Tobias Bleifuß and Leon Bornemann
  • In collaboration with: Dmitri V. Kalashnikov and Divesh Srivastava – AT&T Labs - Research

Former members

  • Student assistants: Joana Bergsiek, Kshitij Kumar, Hung Nguyen
  • Collaborator: Theodore Johnson – AT&T Labs - Research

Student projects

  • Master project: Vandalism Detection in Wikipedia Table Revisions
  • Bachelor project: Unit Testing Data for Machine Learning (with Amazon Research Berlin)
  • Master project: Discovering Change Dependencies