Our group includes postdocs, PhD students, and student assistants, and is headed by Prof. Felix Naumann. If you are interested in joining our team, please contact Felix Naumann.

For bachelor students, we offer German-language lectures on database systems in addition to paper- or project-oriented seminars. Within a one-year bachelor project, students finalize their studies in cooperation with external partners. For master students, we offer courses on information integration, data profiling, and information retrieval, complemented by specialized seminars and master projects. We also advise master theses.

Most of our research is conducted in the context of larger research projects, in collaboration across students, across groups, and across universities. We strive to make most of our datasets and source code available.

Please do not hesitate to reach out to us directly if you cannot find a paper, slides, or other research artifacts.

ProLOD++: Profiling and Mining Linked Open Data

ProLOD++ is a web-based profiling tool for analyzing Linked Open Data (LOD). It helps you gain a deeper understanding of a dataset's underlying structure and semantics. You can try it out here or find the code on GitHub (CC-BY-SA).

LOD is data published on the Web that adheres to a set of design principles. The number of such LOD sources is growing notably, and they provide a wealth of information. These datasets are usually very large (millions of facts) and often heterogeneous, e.g., loosely structured or poorly formatted. This heterogeneity can cause data quality issues, which ProLOD++ helps to identify.

ProLOD++ can process arbitrary LOD sources by analyzing N-Triples (NT) files that contain all information of a dataset. This automated analysis is currently not publicly accessible, i.e., you cannot upload NT files to be analyzed. However, if you are interested in profiling a specific dataset, feel free to contact us. You are also welcome to explore the data sources we have already uploaded, e.g., DBpedia and LinkedMDB. Your feedback is appreciated.
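To give a rough idea of what profiling an N-Triples file can involve, the following minimal Python sketch counts triples, distinct subjects, literal objects, and predicate frequencies. It is an illustration only and not the ProLOD++ implementation: the function name `profile_ntriples`, the simplified regular expression (which does not cover the full N-Triples grammar), and the `example.org` sample data are all assumptions made for this example.

```python
import re
from collections import Counter

# Simplified pattern (an assumption, not the full N-Triples grammar):
# subject = IRI or blank node, predicate = IRI, object = rest up to the final " ."
TRIPLE_RE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def profile_ntriples(lines):
    """Compute a few basic statistics over an iterable of N-Triples lines."""
    predicate_counts = Counter()
    subjects = set()
    triples = literal_objects = skipped = 0

    for line in lines:
        stripped = line.strip()
        # Ignore empty lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        match = TRIPLE_RE.match(stripped)
        if match is None:
            # Malformed lines hint at formatting problems in the source.
            skipped += 1
            continue
        subject, predicate, obj = match.groups()
        triples += 1
        subjects.add(subject)
        predicate_counts[predicate] += 1
        # Literals start with a double quote; IRIs and blank nodes do not.
        if obj.startswith('"'):
            literal_objects += 1

    return {
        "triples": triples,
        "distinct_subjects": len(subjects),
        "literal_objects": literal_objects,
        "skipped_lines": skipped,
        "predicate_counts": predicate_counts,
    }


# Hypothetical sample data, including a comment and a malformed line.
SAMPLE = """\
<http://example.org/film/1> <http://example.org/title> "Metropolis" .
<http://example.org/film/1> <http://example.org/director> <http://example.org/person/1> .
<http://example.org/film/2> <http://example.org/title> "Dr. No" .
# a comment
this line is malformed
"""

profile = profile_ntriples(SAMPLE.splitlines())
for predicate, count in profile["predicate_counts"].most_common():
    print(predicate, count)
```

For datasets with millions of facts, the same function could be fed an open file object instead of an in-memory list, so that lines are processed one at a time. A production setting would more likely rely on a dedicated RDF parser than on a regular expression.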

Researchers

Anja Jentzsch (primary contact)
Felix Naumann
Christian Dullweber
Pierpaolo Troiano (University of Modena and Reggio Emilia)
Toni Grütze
Ziawasch Abedjan (MIT)

Publications

Anja Jentzsch, Christian Dullweber, Pierpaolo Troiano, Felix Naumann. Exploring Linked Data Graph Structures. In Proceedings of Posters and Demos Session, ISWC 2015, Bethlehem, PA, USA, October 2015.

Ziawasch Abedjan, Toni Gruetze, Anja Jentzsch, Felix Naumann. Profiling and Mining RDF Data with ProLOD++. In Proceedings of the IEEE International Conference on Data Engineering (ICDE), Demo, Chicago, IL, 2014.

Christoph Böhm, Felix Naumann, Ziawasch Abedjan, Dandy Fenz, Toni Grütze, Daniel Hefenbrock, Matthias Pohl, David Sonnabend. Profiling linked open data with ProLOD. In Workshops Proceedings of the 26th International Conference on Data Engineering (ICDE), Long Beach, CA, pages 175-178, 2010.

Chair

Prof. Dr. Felix Naumann

Information Systems

E-Mail: felix.naumann(at)hpi.de

Assistant: Diana Stephan

Office: Campus II, House F, F-2.01
Tel.: +49 (0)331 5509-280
E-Mail: office-naumann(at)hpi.de

To visit us, please see these directions.
