Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

The Janus (IANVS) Project

Data change, all the time. In this project, we want to explore and understand those changes. We call this activity change exploration: for a given dynamic dataset, we want to efficiently capture and summarize changes at the instance and schema level, enable users to explore these changes effectively in an interactive and graphical fashion, and analyze patterns in the changing data.

The art of exploration is to preserve order amid change and to preserve change amid order. (adapted from Alfred North Whitehead)

Change-cube

We choose a generic model to represent changes to a dataset. It includes the following four dimensions to represent what changed where, when, and how:

  1. Time
  2. Entity (ID)
  3. Property
  4. Value

A change c is a quadruple of the form

<Time, ID, Property, Value> or in brief <t, id, p, v>.

Its semantics are as follows: at time t, the property p of the entity identified by id was created as, or changed to, v. A change-cube is a set of such changes. For more details on our data model, see our vision paper at VLDB 2019 (see below).
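
As a minimal sketch of this model, and not the project's actual implementation, a change can be written as a four-field tuple and a change-cube as a plain set of such tuples. The names Change and value_at, as well as the sample entities and values, are hypothetical and chosen only for illustration. The helper reads the semantics above directly: the value of a property at time t is the value of the latest change at or before t.

```python
from datetime import date
from typing import Any, NamedTuple, Optional, Set


class Change(NamedTuple):
    """One change <t, id, p, v>; field names are illustrative assumptions."""
    time: date
    entity_id: str
    prop: str
    value: Any  # must be hashable so that changes can be stored in a set


# A change-cube is a set of changes (made-up sample data).
cube: Set[Change] = {
    Change(date(2019, 1, 1), "e1", "population", 100),
    Change(date(2019, 6, 1), "e1", "population", 120),
    Change(date(2019, 3, 1), "e2", "name", "Potsdam"),
}


def value_at(cube: Set[Change], entity_id: str, prop: str, t: date) -> Optional[Any]:
    """Return the value of `prop` for `entity_id` as of time `t`.

    Picks the most recent change at or before `t`; returns None if the
    property had not been created yet.
    """
    candidates = [
        c for c in cube
        if c.entity_id == entity_id and c.prop == prop and c.time <= t
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.time).value


print(value_at(cube, "e1", "population", date(2019, 3, 1)))
print(value_at(cube, "e1", "population", date(2019, 12, 31)))
```

Slicing the same set along any of the four dimensions (for example, all changes to one property, or all changes within a time window) is a simple filter over the tuples, which is what makes the representation generic.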

Sources

Team

Former members

  • Student assistants: Joana Bergsiek, Kshitij Kumar, Hung Nguyen
  • Collaborators: Theodore Johnson – AT&T Labs - Research

Publications

  • [1] Bornemann, Leon, Tobias Bleifuß, Dmitri V. Kalashnikov, Fatemeh Nargesian, Felix Naumann, and Divesh Srivastava. Matching Roles from Temporal Data: Why Joe Biden is Not Only President, but Also Commander-in-Chief. Proceedings of the ACM on Management of Data (PACMMOD). 1(1):1–26, 2023. DOI: https://doi.org/10.1145/3588919.
     
  • [2] Barth, Malte, Tibor Bleidt, Martin Büßemeyer, Fabian Heseding, Niklas Köhnecke, Tobias Bleifuß, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. Detecting Stale Data in Wikipedia Infoboxes. In Proceedings of the International Conference on Extending Database Technology (EDBT), 2023.
     
  • [3] Bleifuß, Tobias, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. The Secret Life of Wikipedia Tables. In Proceedings of the 2nd Workshop on Search, Exploration, and Analysis in Heterogeneous Datastores (SEAData), co-located with VLDB, 2021.
     
  • [4] Bleifuß, Tobias, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. Structured Object Matching across Web Page Revisions. In IEEE International Conference on Data Engineering (ICDE), pages 1284–1295, 2021.
     
  • [5] Bornemann, Leon, Tobias Bleifuß, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. Natural Key Discovery in Wikipedia Tables. In Proceedings of The World Wide Web Conference (WWW), pages 2789–2795, 2020.
     
  • [6] Bleifuß, Tobias, Leon Bornemann, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. DBChEx: Interactive Exploration of Data and Schema Change. In Proceedings of the Conference on Innovative Data Systems Research (CIDR), 2019.
     
  • [7] Bleifuß, Tobias, Leon Bornemann, Theodore Johnson, Dmitri V. Kalashnikov, Felix Naumann, and Divesh Srivastava. Exploring Change - A New Dimension of Data Analytics. Proceedings of the VLDB Endowment (PVLDB). 12(2):85–98, 2018.
     
  • [8] Bornemann, Leon, Tobias Bleifuß, Dmitri Kalashnikov, Felix Naumann, and Divesh Srivastava. Data Change Exploration using Time Series Clustering. Datenbank-Spektrum. 18(2):1–9, 2018. DOI: https://doi.org/10.1007/s13222-018-0285-x.
     
  • [9] Bleifuß, Tobias, Theodore Johnson, Dmitri V. Kalashnikov, Felix Naumann, Vladislav Shkapenyuk, and Divesh Srivastava. Enabling Change Exploration (Vision). In Proceedings of the Fourth International Workshop on Exploratory Search in Databases and the Web (ExploreDB), pages 1–3, 2017.
     

Student projects