Prof. Dr. Felix Naumann

What is Metacrate?

In short, Metacrate is a database for data profiles. Data management applications can use it as a library to store, organize, and analyze data profiles in many different ways. Technically, Metacrate consists of three parts: a logical data model that can be hosted on several storage backends, an analytics engine to query and integrate data profiles, and a library of common data management algorithms. To get started, also have a look at our data profiling tool Metanome, whose profiling results can be easily imported into Metacrate.

Getting started

Metacrate is hosted on GitHub as an open source project, and you are free to use it as a library in your own projects. If you just want to play around with Metacrate, you can do so within a Jupyter notebook that runs the Jupyter-Scala kernel. In this setup, we also provide enhanced visualizations based on Plotly and d3 (see this screenshot). For further instructions, visit the repository. Also, stay tuned: we will soon provide example notebooks here.


If you run into trouble with Metacrate, please file a GitHub issue. For other questions or feedback, please contact Sebastian Kruse.